T5 quantized example (#922)

* Load gguf files for the quantized t5.

* Add the quantized t5 example.

* Allow for loading local files.

* Add some support for quantizing safetensor files.

* Transpose before quantizing.

* Quantized t5.

* Retrieve the weights from the hub.
This commit is contained in:
Laurent Mazare
2023-09-21 12:33:15 +01:00
committed by GitHub
parent 2619c4307f
commit 3b557765e8
4 changed files with 272 additions and 1 deletions

View File

@ -0,0 +1,17 @@
# candle-quantized-t5
This example uses a quantized version of the t5 model.
```bash
$ cargo run --example quantized-t5 --release -- --prompt "translate to German: A beautiful candle."
...
Eine schöne Kerze.
```
The weight file is automatically retrieved from the hub. It is also possible to
generate quantized weight files from the original safetensors file by using the
`tensor-tools` command line utility via:
```bash
cargo run --example tensor-tools --release -- quantize --quantization q6k PATH/TO/T5/model.safetensors /tmp/model.gguf
```