Qwen3 quantized implementation (#2939)

* fixed quantized_phi3 implementation

* quantized_qwen3 implementation

* Update quantized_phi3.rs

* Update quantized_phi3.rs

* add quantized_qwen3 example

* Clippy fixes.

* Cleanup.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
Lucien Thomas
2025-05-08 08:06:10 -05:00
committed by GitHub
parent 637473cb5e
commit 3d05f5cf3d
5 changed files with 755 additions and 1 deletions


@@ -0,0 +1,11 @@
# candle-quantized-qwen3
[Qwen3](https://qwenlm.github.io/blog/qwen3/) is an upgraded version of Qwen2.5, released by Alibaba Cloud.
## Running the example
```bash
cargo run --example quantized-qwen3 --release -- --prompt "Write a function to count prime numbers up to N."
```
The 0.6b model is used by default; the 1.7b, 4b, 8b, 14b, and 32b models are available via the `--model` argument.
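To pick a different size, the invocation might look like the sketch below. It assumes that `--model` accepts the size spelled exactly as in the list above (`4b` here); check the example's `--help` output for the accepted values.

```bash
# Assumption: the value passed to --model is the size name as listed above.
MODEL="4b"
cargo run --example quantized-qwen3 --release -- --model "$MODEL" --prompt "Write a function to count prime numbers up to N."
```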