cbf5fc80c2
Add Gemma 3 1b IT to the Gemma examples ( #2809 )
...
- Updates the Gemma example to include Gemma 3 1b instruction tuned.
2025-03-16 17:00:48 +01:00
111edbc4ea
Gemma 3 initial setup (text only). ( #2802 )
...
* Gemma 3 initial setup (text only).
* Use the rotating kv cache for the sliding window.
2025-03-14 07:56:02 +01:00
c1b9e07e35
Add support for gemma-2. ( #2425 )
...
* Add gemma-2.
* Support a couple more models.
* Sliding window support.
* Example + readme updates.
* Update the main readme.
2024-08-17 20:31:23 +02:00
7ebc3548e1
Use flash-attn in gemma. ( #2195 )
...
* Use flash-attn in gemma.
* Fix flash-attn for head dim 256.
2024-05-18 19:18:59 +02:00
a0460cd2b1
Add the code-gemma models. ( #2038 )
...
* Add the code-gemma models.
* Tweak to the gemma config.
2024-04-10 21:19:21 +02:00
33c9b66554
Add the new gemma models. ( #2023 )
...
* Add the new gemma models.
* Revert the lightning changes.
* Support for the 1.1 models.
2024-04-06 21:25:38 +02:00
3318fe30fb
Update gemma README ( #1843 )
...
* Update gemma README
* Fixit
2024-03-13 21:41:36 +01:00
21f1d04976
Add the instruction finetuned gemma variants. ( #1790 )
2024-03-02 18:56:59 +01:00
8d04f70f4d
Fix the eos token for gemma. ( #1753 )
2024-02-24 11:07:02 +01:00
45d5322d62
Add the Gemma models. ( #1741 )
...
* Add the Gemma models.
* Add the gemma example.
* Adapt the RmsNorm.
* Get the 2b model to work.
* 7b support.
* Use the config head dim.
* Yet another fix.
* Make the matrices contiguous.
* Also get the 7b model to work.
* And add to the readme.
2024-02-21 22:02:50 +01:00