add quantized qwen2 (#2329)

* add quantized version of qwen2 and corresponding example for qwen2-instruct
* fix quantized qwen2 clippy error
2025-06-20 12:06:35 +00:00 · 2024-07-12 16:00:03 +08:00
parent a226a9736b
commit c63048d374
4 changed files with 641 additions and 0 deletions
--- a/candle-examples/examples/quantized-qwen2-instruct/README.md
+++ b/candle-examples/examples/quantized-qwen2-instruct/README.md
@@ -0,0 +1,11 @@
+# candle-quantized-qwen2-instruct
+
+[Qwen2](https://qwenlm.github.io/blog/qwen2/) is an upgraded version of Qwen1.5, released by Alibaba Cloud.
+
+## Running the example
+
+```bash
+cargo run --example quantized-qwen2-instruct --release -- --prompt "Write a function to count prime numbers up to N."
+```
+
+The 0.5b, 1.5b, 7b, and 72b models are available via the `--model` argument.