3ad4770eb6
Use cat for faster MQA computation. (#2043)
* Use cat for faster MQA computation.
* Move the function to utils + use it in mistral.
* Use the shared repeat-kv in a few more models.
* Fix.
2024-04-12 09:15:10 +02:00
196765e995
Use the new rope kernel in mistral. (#1937)
* Use the new rope kernel in mistral.
* Compute the cos and sin with full precision.
* Bugfix.
2024-03-25 23:26:05 +01:00
e2b4829531
Support more mistral models. (#1927)
* Support more mistral models.
* Use the appropriate rope parameter.
2024-03-24 08:04:04 +01:00
6f877592a7
Avoid broadcasting on the batch dimension for the attention mask. (#1920)
2024-03-23 13:08:53 +01:00
90fc82211f
Use a common with_tracing::RmsNorm in a few models. (#1871)
* Add RmsNorm with tracing.
* Use with_tracing::RmsNorm in some models.
2024-03-18 21:40:06 +01:00
cd889c0f8a
add config_amazon_mistral_lite (#1493)
Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
2023-12-28 19:59:58 +01:00
f6408a3779
feat: add clear_kv_cache to mistral and qmistral models (#1464)
2023-12-21 21:19:19 +01:00
563a79afa1
make fn name generic (#1459)
Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
2023-12-21 02:16:31 +01:00
8ede5f4210
add fn config_chat_ml (#1458)
* add fn config_chat_ml
* Add a link to the original config.
---------
Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-12-20 21:03:24 +01:00
185b54a33b
Make some model cloneable. (#1125)
2023-10-18 19:30:47 +01:00
deee7612da
Quantized version of mistral. (#1009)
* Quantized version of mistral.
* Integrate the quantized mistral variant.
* Use the quantized weight files.
* Tweak the quantization command.
* Fix the dtype when computing the rotary embeddings.
* Update the readme with the quantized version.
* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
4021272875
Use flash-attn for mistral. (#1004)
2023-09-30 12:15:10 +01:00
53510ce427
Use a silu activation in mistral. (#991)
2023-09-29 07:06:54 +01:00
23b3576c47
Add the sliding window. (#986)
2023-09-28 17:26:33 +01:00
716ab2ccdc
Mistral gpu fix (#985)
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
* Fix when running on the gpu.
* More gpu fixes.
2023-09-28 16:38:13 +01:00
ada8851a23
Add the mistral example. (#984)
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00
c05a348e36
Add the Mistral 7b model (#983)
* Start sketching the mistral 7b model.
* Add the kv cache.
* Add the decoder layer.
* Add the mistral model.
* Rotary embeddings.
* Add the attention mask.
2023-09-28 14:29:41 +01:00