|
e6cc76fc37
|
Implement DeepSeek V2 (#2744)
* Add deepseek v2
* Fix
* Remove unused
* Add kv cache
* Remove from cargo.toml
* Fix dtype selection logic
* Fix unnecessary u32->f32->gather->u32
* Remove fromstr impl
* Use local scopes for some clarity
* Typo
* Repeat k_pe
* Chain calls to remove mut
* Actually, remove all muts
* Update readme
|
2025-02-19 10:51:01 +01:00 |
|