mirror of https://github.com/huggingface/candle.git synced 2025-06-15 10:26:33 +00:00

Lucien Thomas 3d05f5cf3d Qwen3 quantized implementation (#2939)

* fixed quantized_phi3 implementation

* quantized_qwen3 implementation

* Update quantized_phi3.rs

* Update quantized_phi3.rs

* add quantized_qwen3 example

* Clippy fixes.

* Cleanup.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>

2025-05-08 15:06:10 +02:00

candle-quantized-qwen3

Qwen3 is an upgraded version of Qwen2.5, released by Alibaba Cloud.

Running the example

cargo run --example quantized-qwen3 --release -- --prompt "Write a function to count prime numbers up to N."

The 0.6b model is used by default; the 1.7b, 4b, 8b, 14b, and 32b models are available via the --model argument.
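
As a rough sketch of selecting a larger model, the snippet below assembles the same command with a size passed to --model. The variable names MODEL_SIZE and CMD are illustrative assumptions, and the flag spelling is taken from the sentence above rather than checked against the example's source, so running the example with --help is the safest way to confirm the accepted arguments. The snippet only prints the command; it does not run it.

```shell
# Hypothetical helper variable: one of 0.6b, 1.7b, 4b, 8b, 14b, 32b (the sizes listed above).
MODEL_SIZE="4b"

# Assemble the command; --model is the flag named in this README (assumed, not verified).
CMD="cargo run --example quantized-qwen3 --release -- --model $MODEL_SIZE --prompt \"Write a function to count prime numbers up to N.\""

# Print the command for inspection; paste it into a terminal to run it.
echo "$CMD"
```

Keeping the size in one variable makes it easy to try several models without retyping the prompt.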
