Add the recurrent-gemma model. (#2039)

* Start adding the recurrent-gemma model.
* More griffin.
* Add the example + get the weights to load from the HF version.
* More inference code.
* Rope + kv-cache on the attention side.
* Add to the inference code.
* Add more to the recurrent gemma inference.
* Get some first inference to run.
* Add the softcap on logits.
* Fixes.
* Use partial rotary embeddings.
* Get inference to work.
* Add a comment.
* And add a readme.
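
The commit message mentions adding a softcap on the logits, but the model code is not shown in this excerpt. As a rough illustration only, a common form of soft-capping squashes each logit smoothly into the range (-cap, cap) with `cap * tanh(logit / cap)`. The sketch below uses plain Rust slices rather than candle tensors. The function name `softcap` and the cap value `30.0` are assumptions made for this example, not taken from the commit.

```rust
/// Hypothetical helper (not from the commit): soft-cap each logit with
/// `cap * tanh(x / cap)`. Small values pass through almost unchanged,
/// large values saturate near +/- cap.
fn softcap(logits: &[f32], cap: f32) -> Vec<f32> {
    logits.iter().map(|&x| cap * (x / cap).tanh()).collect()
}

fn main() {
    // Illustrative inputs; the cap of 30.0 is an assumed value.
    let logits = [-100.0_f32, -1.0, 0.0, 1.0, 100.0];
    let capped = softcap(&logits, 30.0);
    for (raw, c) in logits.iter().zip(&capped) {
        println!("{raw:>8.2} -> {c:>8.4}");
    }
}
```

Because `tanh` is monotonic, the ordering of the logits is preserved, so greedy (argmax) decoding is unaffected while extreme values are bounded before sampling.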
2025-06-22 12:28:06 +00:00 · 2024-04-13 00:05:21 +02:00
parent 3ad4770eb6
commit 2bf413caa3
4 changed files with 915 additions and 0 deletions
--- a/candle-examples/examples/recurrent-gemma/README.md
+++ b/candle-examples/examples/recurrent-gemma/README.md
@@ -0,0 +1,9 @@
+# candle-recurrent-gemma
+
+This model card corresponds to the 2B base version of the RecurrentGemma model
+[huggingface model card](https://huggingface.co/google/recurrentgemma-2b).
+
+```bash
+cargo run --features cuda -r --example recurrent-gemma -- \
+    --prompt "Write me a poem about Machine Learning."  
+```