Support for MQA for llama v2. (#205)
* Support for MQA for llama v2.
* More llama-v2.
* Move the rotary embedding precomputation in the cache.
* Add a v2 flag.
* Use the hf model.
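The MQA in the title is multi-query attention (generalized to grouped-query attention in the larger llama-v2 variants): the model stores fewer key/value heads than query heads, and each K/V head is repeated at attention time so the shapes line up with the queries. Below is a minimal sketch of that broadcast, not the code from this commit; `repeat_kv` and the head counts are illustrative names and sizes, and the tensor ops assume the `candle_core` API.

```rust
use candle_core::{DType, Device, Result, Tensor};

/// Repeat each key/value head `n_rep` times so a tensor of shape
/// (batch, n_kv_heads, seq, head_dim) matches the query head count,
/// i.e. becomes (batch, n_kv_heads * n_rep, seq, head_dim).
fn repeat_kv(x: &Tensor, n_rep: usize) -> Result<Tensor> {
    if n_rep == 1 {
        return Ok(x.clone()); // plain multi-head attention, nothing to do
    }
    let (b, n_kv, seq, hd) = x.dims4()?;
    x.unsqueeze(2)? // (b, n_kv, 1, seq, hd)
        .expand((b, n_kv, n_rep, seq, hd))? // broadcast the size-1 axis
        .contiguous()? // materialize the broadcast before reshaping
        .reshape((b, n_kv * n_rep, seq, hd)) // fold repeats into the head axis
}

fn main() -> Result<()> {
    // Illustrative sizes: 8 query heads sharing 2 key/value heads (n_rep = 4).
    let kv = Tensor::zeros((1, 2, 16, 64), DType::F32, &Device::Cpu)?;
    let kv = repeat_kv(&kv, 4)?;
    assert_eq!(kv.dims(), &[1, 8, 16, 64]);
    Ok(())
}
```

The same commit also moves the rotary-embedding precomputation into the cache, presumably so the cos/sin tables are built once per run and reused across decoding steps rather than recomputed in every attention call.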
README.md:

```diff
@@ -13,7 +13,7 @@ let c = a.matmul(&b)?;
 
 Check out our [examples](./candle-examples/examples/):
 
 - [Whisper](./candle-examples/examples/whisper/)
-- [Llama](./candle-examples/examples/llama/)
+- [Llama and Llama-v2](./candle-examples/examples/llama/)
 - [Bert](./candle-examples/examples/bert/) (Useful for sentence embeddings)
 - [Falcon](./candle-examples/examples/falcon/)
```