
# candle-quantized-t5

This example uses a quantized version of the T5 model.
```bash
$ cargo run --example quantized-t5 --release -- \
  --prompt "translate to German: A beautiful candle."
...
Eine schöne Kerze.
```
The weight file is automatically retrieved from the hub. It is also possible to
generate quantized weight files from the original safetensors file with the
`tensor-tools` command line utility:

```bash
cargo run --example tensor-tools --release -- \
  quantize --quantization q6k PATH/TO/T5/model.safetensors /tmp/model.gguf
```
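A locally generated gguf file can then be used in place of the weights fetched from the hub. A minimal sketch, assuming the example exposes a `--weight-file` flag for loading local files (the exact flag name may differ; check `cargo run --example quantized-t5 --release -- --help`):

```bash
# Run the example against a local gguf file instead of the hub copy.
# --weight-file is an assumed flag name; verify it with --help first.
cargo run --example quantized-t5 --release -- \
  --weight-file /tmp/model.gguf \
  --prompt "translate to German: A beautiful candle."
```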