Improved mamba model optimized for inference (#1694)

* Sketch the mamba model for inference.
* Complete the forward pass.
* Add the mamba example.
* Optimize the selective-scan part.
* Fix a couple shape mismatches and get inference to work.
* Tweak the readmes.
* More readme tweaks.
2025-06-15 02:16:37 +00:00 · 2024-02-11 17:04:57 +01:00
parent 74497e6bf7
commit 1e26d539d9
6 changed files with 533 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -67,7 +67,7 @@ We also provide a some command line based examples using state of the art models
 - [StableLM-3B-4E1T](./candle-examples/examples/stable-lm/): a 3b general LLM
  pre-trained on 1T tokens of English and code datasets. Also supports
  StableLM-2, a 1.6b LLM trained on 2T tokens, as well as the code variants.
-- [Minimal Mamba](./candle-examples/examples/mamba-minimal/): a minimal
+- [Mamba](./candle-examples/examples/mamba/): an inference only
  implementation of the Mamba state space model.
 - [Mistral7b-v0.1](./candle-examples/examples/mistral/): a 7b general LLM with
  better performance than all publicly available 13b models as of 2023-09-28.
@@ -186,7 +186,7 @@ If you have an addition to this list, please submit a pull request.
        - Falcon.
        - StarCoder.
        - Phi 1, 1.5, and 2.
-        - Minimal Mamba
+        - Mamba, Minimal Mamba
        - Mistral 7b v0.1.
        - Mixtral 8x7b v0.1.
        - StableLM-3B-4E1T, StableLM-2-1.6B, Stable-Code-3B.