Files
candle/candle-examples/examples/llama
Laurent Mazare 12d6dc018d Support for MQA for llama v2. (#205)
* Support for MQA for llama v2.

* More llama-v2.

* Move the rotary embedding precomputation in the cache.

* Add a v2 flag.

* Use the hf model.
2023-07-20 06:39:04 +01:00
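
The commit's headline change, multi-query attention (MQA), has several query heads share a smaller number of key/value heads, shrinking the KV cache for llama v2. A minimal sketch of that kv-head sharing (not candle's actual implementation; `repeat_kv`, the head shapes, and the flat `Vec` layout are all assumptions for illustration):

```rust
// Sketch of MQA/GQA head sharing: n_kv_heads key/value heads are
// broadcast across n_heads query heads by repeating each kv head
// n_heads / n_kv_heads times before attention is computed.
fn repeat_kv(kv_heads: &[Vec<f32>], n_heads: usize) -> Vec<Vec<f32>> {
    let n_kv_heads = kv_heads.len();
    // Assumes n_heads is an exact multiple of n_kv_heads.
    let n_rep = n_heads / n_kv_heads;
    kv_heads
        .iter()
        .flat_map(|head| std::iter::repeat(head.clone()).take(n_rep))
        .collect()
}

fn main() {
    // Pure MQA: a single kv head shared by four query heads.
    let kv = vec![vec![0.1, 0.2]];
    let expanded = repeat_kv(&kv, 4);
    assert_eq!(expanded.len(), 4);
    // Every query head sees the same key/value data.
    assert_eq!(expanded[0], expanded[3]);
    println!("{}", expanded.len());
}
```

With grouped kv heads (e.g. 8 query heads over 2 kv heads) the same function yields four repeats of each kv head; only the cached kv heads need to be stored, which is the memory win the commit targets.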