candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 18:48:51 +00:00

Author	SHA1	Message	Date
Laurent Mazare	deee7612da	Quantized version of mistral. (#1009) * Quantized version of mistral. * Integrate the quantized mistral variant. * Use the quantized weight files. * Tweak the quantization command. * Fix the dtype when computing the rotary embeddings. * Update the readme with the quantized version. * Fix the decoding of the remaining tokens.	2023-09-30 18:25:47 +01:00
Laurent Mazare	06207332bc	Streaming mode for reporting the generated tokens (#1007) * Token streaming. * Use the token output stream. * Flush the output. * Ensure that the last characters get reported.	2023-09-30 15:04:11 +01:00
Laurent Mazare	4021272875	Use flash-attn for mistral. (#1004)	2023-09-30 12:15:10 +01:00
Laurent Mazare	87e3a4e175	Mistral: exit on eos token. (#1001) * Mistral: exit on eos token. * Print the proper stats. * Also add a short flag.	2023-09-30 07:07:06 +01:00
Laurent Mazare	6f17ef82be	Mistral: print the generated text. (#992)	2023-09-29 10:56:11 +01:00
Laurent Mazare	ada8851a23	Add the mistral example. (#984) * Add the mistral example. * Use the two model files. * Adjust the dtype. * Tweak the weight paths. * Remove the end of text token. * Get the mistral model to generate some text.	2023-09-28 16:19:18 +01:00