
# candle-quantized-t5

This example uses a quantized version of the T5 model.
```bash
$ cargo run --example quantized-t5 --release -- \
  --prompt "translate to German: A beautiful candle."
...
Eine schöne Kerze.
```
The weight file is automatically retrieved from the hub. It is also possible to
generate quantized weight files from the original safetensors file with the
`tensor-tools` command line utility:

```bash
cargo run --example tensor-tools --release -- \
  quantize --quantization q6k PATH/TO/T5/model.safetensors /tmp/model.gguf
```
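A locally generated gguf file can then be used in place of the weights fetched from the hub. A minimal sketch, assuming the example exposes a `--weight-file` flag for loading local files (the exact flag name may differ; check `cargo run --example quantized-t5 --release -- --help`):

```bash
# Run the example against a local gguf file instead of the hub copy.
# --weight-file is an assumed flag name; verify it with --help first.
cargo run --example quantized-t5 --release -- \
  --weight-file /tmp/model.gguf \
  --prompt "translate to German: A beautiful candle."
```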