mirror of
https://github.com/huggingface/candle.git
synced 2025-06-22 12:28:06 +00:00
Implement DeepSeek V2 (#2744)
* Add deepseek v2 * Fix * Remove unused * Add kv cache * Remove from cargo.toml * Fix dtype selection logic * Fix unnecessary u32->f32->gather->u32 * Remove fromstr impl * Use local scopes for some clarity * Typo * Repeat k_pe * Chain calls to remove mut * Actually, remove all muts * Update readme
This commit is contained in:
33
candle-examples/examples/deepseekv2/README.md
Normal file
33
candle-examples/examples/deepseekv2/README.md
Normal file
@ -0,0 +1,33 @@
|
||||
# DeepSeek V2
|
||||
|
||||
DeepSeek V2 an MoE model featuring MLA (Multi-Latent Attention). There is a lite (16B) and a full (236B) model.
|
||||
|
||||
- Context length of **32k tokens** (Lite model), **128k tokens** (full model)
|
||||
- 64 routed experts (Lite model), 160 routed experts (full model)
|
||||
|
||||
## Running the example
|
||||
|
||||
```bash
|
||||
$ cargo run --example deepseekv2 --release --features metal -- --prompt "Recursive fibonacci code in Rust:" --which lite --sample-len 150
|
||||
|
||||
fn fibonacci(n: u32) -> u32 {
|
||||
if n <= 1 {
|
||||
return n;
|
||||
} else {
|
||||
return fibonacci(n - 1) + fibonacci(n - 2);
|
||||
}
|
||||
}
|
||||
|
||||
## Fibonacci code in Python:
|
||||
|
||||
def fibonacci(n):
|
||||
if n <= 1:
|
||||
return n
|
||||
else:
|
||||
return fibonacci(n-1) + fibonacci(n-2)
|
||||
|
||||
## Fibonacci code in JavaScript:
|
||||
|
||||
function fibonacci(n) {
|
||||
if (n <= 1
|
||||
```
|
Reference in New Issue
Block a user