# candle-recurrent-gemma

This example corresponds to the 2B base version of the RecurrentGemma model; see the
[huggingface model card](https://huggingface.co/google/recurrentgemma-2b) for details.

```bash
cargo run --features cuda -r --example recurrent-gemma -- \
  --prompt "Write me a poem about Machine Learning."
```
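
If you don't have a CUDA-capable GPU, the example can typically be built without the `cuda` feature so that inference runs on the CPU. This is a minimal sketch assuming the usual candle example behavior (falling back to the CPU device when no accelerator feature is enabled); expect it to be noticeably slower:

```bash
# CPU-only run (assumption: the example falls back to the CPU device
# when built without the cuda feature, as most candle examples do)
cargo run -r --example recurrent-gemma -- \
  --prompt "Write me a poem about Machine Learning."
```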