# DeepSeek V2
DeepSeek V2 is an MoE (Mixture of Experts) model featuring MLA (Multi-head Latent Attention). It is available as a Lite (16B parameter) model and a full (236B parameter) model.
- Context length of 32k tokens (Lite model), 128k tokens (full model)
- 64 routed experts (Lite model), 160 routed experts (full model)
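
The example picks between the two variants with a `--which` flag (used in the command below). Here is a minimal sketch of how that choice could map to Hugging Face model ids and context lengths; the `Which` enum and its methods are assumptions for illustration, not necessarily the example's actual source:

```rust
// Sketch only: mapping the `--which` flag to model variants.
#[derive(Clone, Copy, Debug)]
enum Which {
    Lite, // 16B total parameters, 32k context, 64 routed experts
    Full, // 236B total parameters, 128k context, 160 routed experts
}

impl Which {
    /// Hugging Face Hub repository for each variant.
    fn model_id(self) -> &'static str {
        match self {
            Self::Lite => "deepseek-ai/DeepSeek-V2-Lite",
            Self::Full => "deepseek-ai/DeepSeek-V2",
        }
    }

    /// Maximum context length in tokens.
    fn context_length(self) -> usize {
        match self {
            Self::Lite => 32_768,
            Self::Full => 131_072,
        }
    }
}
```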
## Running the example
```bash
$ cargo run --example deepseekv2 --release --features metal -- --prompt "Recursive fibonacci code in Rust:" --which lite --sample-len 150

fn fibonacci(n: u32) -> u32 {
    if n <= 1 {
        return n;
    } else {
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}

## Fibonacci code in Python:

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

## Fibonacci code in JavaScript:

function fibonacci(n) {
    if (n <= 1
```
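
The `--features metal` flag targets Apple GPUs; on a machine with an NVIDIA GPU, build with `--features cuda` instead, or drop the feature flag to run on the CPU.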