mirror of
https://github.com/huggingface/candle.git
synced 2025-06-14 09:57:10 +00:00

* Start adding the recurrent-gemma model.
* More griffin.
* Add the example + get the weights to load from the HF version.
* More inference code.
* Rope + kv-cache on the attention side.
* Add to the inference code.
* Add more to the recurrent gemma inference.
* Get some first inference to run.
* Add the softcap on logits.
* Fixes.
* Use partial rotary embeddings.
* Get inference to work.
* Add a comment.
* And add a readme.
candle-recurrent-gemma
This example corresponds to the 2B base version of the RecurrentGemma model; see its Hugging Face model card for details.
cargo run --features cuda -r --example recurrent-gemma -- \
    --prompt "Write me a poem about Machine Learning."
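The commit notes mention a softcap on the logits. As a rough illustration of that idea only, the sketch below squashes each logit into the open interval (-cap, cap) with a scaled tanh. It is not the candle implementation: the function names `softcap` and `softcap_logits` are hypothetical, it works on plain `f32` slices rather than candle tensors, and the cap value is left to the caller.

```rust
/// Soft-cap a single logit: cap * tanh(x / cap).
/// Values much smaller than `cap` pass through almost unchanged,
/// while large magnitudes saturate towards +/- cap.
/// (Hypothetical helper for illustration, not a candle API.)
fn softcap(x: f32, cap: f32) -> f32 {
    cap * (x / cap).tanh()
}

/// Apply the soft cap element-wise to a slice of logits.
/// (Hypothetical helper for illustration, not a candle API.)
fn softcap_logits(logits: &[f32], cap: f32) -> Vec<f32> {
    logits.iter().map(|&x| softcap(x, cap)).collect()
}
```

Calling `softcap_logits(&logits, cap)` before sampling would keep extreme logits bounded while leaving their ordering intact, since tanh is monotonic.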