candle/candle-pyo3 at 1e26d539d9f9574222e8d049fdbfadfa09e3ce2e

huggingface/candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 02:38:10 +00:00

Nicolas Patry 403680f17d Quantized GGUF style (#1523)

* Metal quantized modifications proposal.

- Add a device param, wherever needed.
- Create new QMetal storage thing that implements QuantizedType.
- Update everywhere needed.

Fix Python.

Fixing examples.

Fix: fmt + clippy + stub.

Moving everything around.

Only missing the actual implems.

Fixing everything + adding dequantized kernels.

More work.

Fixing matmul.

Fmt + Clippy

Some clippy fixes.

Working state.

Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, new test catch it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented it seems
Q8K metal -> Never implemented in metal

Fixing Q2K bug (present in ggml).

* Cleanup.

* Fix the rebase.

* Removing the fences speeds everything up and *is* correct this time...

* Cleanup the fence.

* After rebase.

* Bad code removal.

* Rebase after phi2 merge + fix replit default to CPU.

* Making the CI happy.

* More happy tests.

---------

Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>

2024-01-17 10:27:58 +01:00

| Name | Last commit | Date |
| --- | --- | --- |
| `_additional_typing` | PyO3: Add equal and `__richcmp__` to candle.Tensor (#1099) | 2023-10-30 15:17:28 +00:00 |
| | Quantized GGUF style (#1523) | 2024-01-17 10:27:58 +01:00 |
| | Quantized GGUF style (#1523) | 2024-01-17 10:27:58 +01:00 |
| | Fix a couple typos (#1451) | 2023-12-17 05:20:05 -06:00 |
| `.gitignore` | Make the Python Wrapper more Hackable and simplify Quantization (#1010) | 2023-10-06 19:01:07 +01:00 |
| `build.rs` | Fix the pyo3 build for macos. (#324) | 2023-08-05 14:53:57 +01:00 |
| `Cargo.toml` | Simplifying our internal cargo dependencies. (#1529) | 2024-01-07 12:04:14 +01:00 |
| `e5.py` | Use an attention mask in the e5 padding case. (#1085) | 2023-10-13 18:53:40 +01:00 |
| `pyproject.toml` | Make the Python Wrapper more Hackable and simplify Quantization (#1010) | 2023-10-06 19:01:07 +01:00 |
| `quant-llama.py` | Make the Python Wrapper more Hackable and simplify Quantization (#1010) | 2023-10-06 19:01:07 +01:00 |
| `README.md` | Generate `*.pyi` stubs for PyO3 wrapper (#870) | 2023-09-16 17:23:38 +01:00 |
| `stub.py` | Fix a couple typos (#1451) | 2023-12-17 05:20:05 -06:00 |
| `test_pytorch.py` | convert pytorch's tensor in Python API (#1172) | 2023-10-25 19:39:14 +01:00 |
| `test.py` | Add support for accelerate in the pyo3 bindings. (#1167) | 2023-10-24 06:34:37 +01:00 |

README.md

Installation

From the `candle-pyo3` directory, activate the virtual environment in which you want the candle package installed, then run:

```
maturin develop -r
python test.py
```
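
As a quick sanity check after the build, you may want to confirm that the package is visible from the active environment before running `test.py`. The following is a minimal sketch, assuming the installed module is imported as `candle`; `is_importable` is a hypothetical helper and not part of the package.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located in this environment."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package built by maturin is imported as `candle`.
    name = "candle"
    status = "found" if is_importable(name) else "not found"
    print(f"{name}: {status}")
```

If the module is reported as not found, the most likely cause is that `maturin develop -r` was run in a different virtual environment than the one currently active.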

Generating Stub Files for Type Hinting

For type hinting support, the candle-pyo3 package requires *.pyi files. You can automatically generate these files using the stub.py script.

Steps:

1. Install the package using maturin.
2. Generate the stub files by running:
```
python stub.py
```

Validation:

To ensure that the stub files match the current implementation, execute:

```
python stub.py --check
```
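
The validation command can also be wrapped so that a script or CI step reacts to its result. This is only a sketch: it assumes that `stub.py --check` exits with a non-zero status when the stubs are out of date, which the README does not state explicitly. `stub_check_command` and `stubs_up_to_date` are hypothetical helper names.

```python
import subprocess
import sys
from typing import Any, Callable, List


def stub_check_command(python: str = sys.executable) -> List[str]:
    """Build the validation command shown in the README."""
    return [python, "stub.py", "--check"]


def stubs_up_to_date(run: Callable[[List[str]], Any] = subprocess.run) -> bool:
    """Run the check and report whether it succeeded.

    Assumption: a non-zero exit status means the stubs are stale.
    The `run` parameter is injectable so the logic can be exercised
    without invoking stub.py.
    """
    result = run(stub_check_command())
    return result.returncode == 0
```

Called from the `candle-pyo3` directory, where `stub.py` lives, `stubs_up_to_date()` could gate a pre-commit hook or a CI job.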