Improved mamba model optimized for inference (#1694)

* Sketch the mamba model for inference.

* Complete the forward pass.

* Add the mamba example.

* Optimize the selective-scan part.

* Fix a couple shape mismatches and get inference to work.

* Tweak the readmes.

* More readme tweaks.
Laurent Mazare
2024-02-11 17:04:57 +01:00
committed by GitHub
parent 74497e6bf7
commit 1e26d539d9
6 changed files with 533 additions and 2 deletions

@@ -67,7 +67,7 @@ We also provide a some command line based examples using state of the art models
- [StableLM-3B-4E1T](./candle-examples/examples/stable-lm/): a 3b general LLM
pre-trained on 1T tokens of English and code datasets. Also supports
StableLM-2, a 1.6b LLM trained on 2T tokens, as well as the code variants.
-- [Minimal Mamba](./candle-examples/examples/mamba-minimal/): a minimal
+- [Mamba](./candle-examples/examples/mamba/): an inference only
implementation of the Mamba state space model.
- [Mistral7b-v0.1](./candle-examples/examples/mistral/): a 7b general LLM with
better performance than all publicly available 13b models as of 2023-09-28.
@@ -186,7 +186,7 @@ If you have an addition to this list, please submit a pull request.
- Falcon.
- StarCoder.
- Phi 1, 1.5, and 2.
-- Minimal Mamba
+- Mamba, Minimal Mamba
- Mistral 7b v0.1.
- Mixtral 8x7b v0.1.
- StableLM-3B-4E1T, StableLM-2-1.6B, Stable-Code-3B.