huggingface/candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-18 11:37:11 +00:00

Commit Graph

Author	SHA1	Message	Date
Laurent Mazare	deee7612da	Quantized version of mistral. (#1009) * Quantized version of mistral. * Integrate the quantized mistral variant. * Use the quantized weight files. * Tweak the quantization command. * Fix the dtype when computing the rotary embeddings. * Update the readme with the quantized version. * Fix the decoding of the remaining tokens.	2023-09-30 18:25:47 +01:00
Laurent Mazare	06207332bc	Streaming mode for reporting the generated tokens (#1007) * Token streaming. * Use the token output stream. * Flush the output. * Ensure that the last characters get reported.	2023-09-30 15:04:11 +01:00

2 Commits