mirror of
https://github.com/huggingface/candle.git
synced 2025-06-17 11:08:52 +00:00
Quantized version of mistral. (#1009)
* Quantized version of mistral.
* Integrate the quantized mistral variant.
* Use the quantized weight files.
* Tweak the quantization command.
* Fix the dtype when computing the rotary embeddings.
* Update the readme with the quantized version.
* Fix the decoding of the remaining tokens.
@@ -7,6 +7,7 @@ pub mod llama;
 pub mod mistral;
 pub mod mixformer;
 pub mod quantized_llama;
+pub mod quantized_mistral;
 pub mod quantized_mixformer;
 pub mod quantized_t5;
 pub mod segment_anything;
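
The hunk above adds a single `pub mod quantized_mistral;` declaration next to the existing model modules. As a rough illustration of what such a declaration does, the sketch below uses inline modules so it compiles on its own; the parent module name `models` and the `NAME` constants are hypothetical placeholders, not items from the candle crates.

```rust
// Hypothetical stand-in for the parent module touched by the diff.
// In the real crate each `pub mod x;` points at a separate source file;
// inline bodies are used here only so the sketch is self-contained.
pub mod models {
    pub mod mistral {
        // Placeholder item (assumption), not part of candle.
        pub const NAME: &str = "mistral";
    }

    // The newly declared module: without this `pub mod` line, nothing inside
    // it would be reachable from outside the parent module.
    pub mod quantized_mistral {
        // Placeholder item (assumption), not part of candle.
        pub const NAME: &str = "quantized_mistral";
    }
}

fn main() {
    // Both paths resolve because the parent declares each module as `pub mod`.
    println!("{}", models::mistral::NAME);
    println!("{}", models::quantized_mistral::NAME);
}
```

Removing the `pub mod quantized_mistral` declaration from the sketch makes the second `println!` fail to compile, which is why the commit has to register the new module alongside the file that implements it.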