candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 18:48:51 +00:00

Files

Laurent Mazare 12d6dc018d Support for MQA for llama v2. (#205)

* Support for MQA for llama v2.

* More llama-v2.

* Move the rotary embedding precomputation in the cache.

* Add a v2 flag.

* Use the hf model.

2023-07-20 06:39:04 +01:00

convert_checkpoint.py    2023-07-11 19:32:10 +01:00
main.rs                  2023-07-20 06:39:04 +01:00
model.rs                 2023-07-20 06:39:04 +01:00