Separate quantized phi-3 implementation. (#2157)

* Separate quantized phi-3 implementation.
* Integrate the quantized phi3 model.
* Small fixes, get the generation to work properly.
* Keep the old llama implementation around.
* Change the default.
2025-06-16 18:48:51 +00:00 · 2024-05-04 10:14:57 +02:00
parent 59b18d974e
commit b13a82a438
7 changed files with 323 additions and 12 deletions
--- a/candle-transformers/src/models/mod.rs
+++ b/candle-transformers/src/models/mod.rs
@@ -40,6 +40,7 @@ pub mod quantized_mixformer;
 pub mod quantized_moondream;
 pub mod quantized_mpt;
 pub mod quantized_phi;
+pub mod quantized_phi3;
 pub mod quantized_recurrent_gemma;
 pub mod quantized_rwkv_v5;
 pub mod quantized_rwkv_v6;