diff --git a/README.md b/README.md
index 64e1c451..af308ede 100644
--- a/README.md
+++ b/README.md
@@ -62,6 +62,8 @@ We also provide a some command line based examples using state of the art models
 - [LLaMA and LLaMA-v2](./candle-examples/examples/llama/): general LLM.
 - [Falcon](./candle-examples/examples/falcon/): general LLM.
 - [Phi-v1.5](./candle-examples/examples/phi/): a 1.3b general LLM with performance on par with LLaMA-v2 7b.
+- [StableLM-3B-4E1T](./candle-examples/examples/stable-lm/): a 3b general LLM
+  pre-trained on 1T tokens of English and code datasets.
 - [Mistral7b-v0.1](./candle-examples/examples/mistral/): a 7b general LLM with
   performance larger than all publicly available 13b models as of 2023-09-28.
 - [StarCoder](./candle-examples/examples/bigcode/): LLM specialized to code generation.
@@ -152,6 +154,7 @@ If you have an addition to this list, please submit a pull request.
     - StarCoder.
     - Phi v1.5.
     - Mistral 7b v0.1.
+    - StableLM-3B-4E1T.
     - T5.
     - Bert.
     - Whisper (multi-lingual support).
diff --git a/candle-examples/examples/stable-lm/README.md b/candle-examples/examples/stable-lm/README.md
new file mode 100644
index 00000000..ad3e4a5b
--- /dev/null
+++ b/candle-examples/examples/stable-lm/README.md
@@ -0,0 +1,25 @@
+# candle-stable-lm
+
+StableLM-3B-4E1T is a 3 billion parameter decoder-only language model
+pre-trained on 1 trillion tokens of diverse English and code datasets for 4
+epochs. See the [HuggingFace Hub Model
+Card](https://huggingface.co/stabilityai/stablelm-3b-4e1t).
+
+Note that this model is gated so you will have to request access on the Hub in
+order to be able to use it.
+
+## Running some example
+
+```bash
+$ cargo run --example stable-lm --release --features cuda -- --prompt 'What is the most efficient programming language in use?' --sample-len 150
+avx: true, neon: false, simd128: false, f16c: true
+temp: 0.00 repeat-penalty: 1.10 repeat-last-n: 64
+retrieved the files in 126.593µs
+loaded the model in 3.474148965s
+What is the most efficient programming language in use?
+The answer to this question depends on what you mean by "efficient". If you're talking about speed, then C++ and Java are probably your best bets. But if you're talking about ease of development, then Python is probably the way to go.
+Python is a high-level, interpreted language that is easy to learn and use. It has a large community of developers who are always working on new features and improvements.
+C++ is a low-level, compiled language that can be used for both desktop applications and web development. It's more difficult to learn than Python but offers greater control over the code.
+Java is another high-level language that is popular with programmers because it runs on many different platforms (including Android phones
+150 tokens generated (37.61 token/s)
+```