Commit Graph

561 Commits

Author SHA1 Message Date
63c204c79e Add a mention to the replit-code model in the readme. (#1121) 2023-10-18 11:27:23 +01:00
767a6578f1 MPT alibi fixes. (#1120)
* MPT alibi fixes.

* Some more fixes.

* Finally get the model to return some sensible outputs.

* Add a readme.
2023-10-18 10:58:05 +01:00
2cd745a97c MPT fixes. (#1117)
* MPT fixes.

* Another couple fixes.

* Another shape fix.
2023-10-17 21:53:31 +01:00
a72b50e2c0 Build alibi bias. (#1115)
* Build alibi bias.

* Apply the alibi attention bias.

* Add the replit-code example.
2023-10-17 20:41:37 +01:00
00948eb656 Formatting tweak. (#1111) 2023-10-16 21:02:53 +01:00
af67672207 Add support for Puffin-Phi-v2. (#1110)
* Add support for Puffin-Phi-v2.

* Tweak the file name.

* Support the config for puffin-phi-v2.

* Update the readme.
2023-10-16 20:54:21 +01:00
588ad4835a Fix the verbose prompt for phi. (#1097) 2023-10-15 10:53:25 +01:00
b73c35cc57 Improve the reshape error messages. (#1096)
* Improve the reshape error messages.

* Add the verbose-prompt flag to the phi example.
2023-10-15 10:43:10 +01:00
8921d5027c Add support for phi-1.0 (#1093)
* Add support for phi-1.0

* Update the readme.
2023-10-14 20:15:43 +01:00
29c7f2565d Add some reinforcement learning example. (#1090)
* Add some reinforcement learning example.

* Python initialization.

* Get the example to run.

* Vectorized gym envs for the atari wrappers.

* Get some simulation loop to run.
2023-10-14 16:46:43 +01:00
e7560443e4 Convmixer example (#1074)
* Add a convmixer based example.

* Mention the model in the readme.
2023-10-11 19:51:10 +01:00
b34d7f0248 Remove some unusued bits. (#1067) 2023-10-09 19:49:57 +01:00
4d04ac83c7 Override the repo for SDXL f16 vae weights. (#1064)
* Override the repo for SDXL f16 vae weights.

* Slightly simpler change.
2023-10-09 06:52:28 +01:00
59ab6d7832 Quantized version of StableLM. (#1058)
* Quantized version of StableLM.

* Adapt the stable-lm example to support quantizsed.

* Use some separate hub repo.

* Another repo name tweak.
2023-10-08 15:42:38 +01:00
2e5fb0b251 Do not use the kv-cache on external key-value states. (#1054) 2023-10-07 22:37:19 +01:00
823fe23f9b Add flash-attn support for stable-lm. (#1052) 2023-10-07 21:12:54 +01:00
d833527fda Use candle_nn::LSTM in encodec. (#1051)
* Use candle_nn::LSTM in encodec.

* More Encodec implementation.

* Decoder implementation.
2023-10-07 19:43:06 +01:00
955e00b2e8 Add to the readmes for stable-lm. (#1047) 2023-10-06 21:26:04 +01:00
d5f7267087 Add the stable-lm example. (#1046)
* Add the stable-lm example.

* Get stable-lm to generate some proper text.
2023-10-06 19:20:35 +01:00
4631c48273 Remove some todos. (#1042) 2023-10-05 22:42:20 +01:00
716883e9b0 Add the clamping for stable-diffusion. (#1041) 2023-10-05 22:20:39 +01:00
b86ac0c507 Quant t5: Add coedit model to wasm demo and readme (#1031) 2023-10-04 20:57:33 +01:00
3349c89252 Add quantized t5 args for weight and config (#1029) 2023-10-04 17:02:49 +01:00
11d3687cc6 Simd128 optimized q8k vecdot. (#1026) 2023-10-03 15:29:48 +01:00
089fc3b584 Improve the quantized whisper setup. (#1018)
* Improve the quantized whisper setup.

* Fix the config file paths.

* Use the standard matmul where possible.
2023-10-02 17:17:46 +01:00
e04c789230 Add a quantized variant of whisper (#1017)
* Add the quantized-whisper model.

* Quantized the whisper model.

* Adapt the whisper example to handle quantization.

* Add the quantized flag.

* Load the proper weights.
2023-10-02 14:59:53 +01:00
f6054e9d60 Fix the prompt for mistral when using instruct/interactive mode. (#1013) 2023-10-01 06:44:30 +01:00
328167ec04 Integrate TheBloke quantized mistral weights. (#1012) 2023-09-30 22:39:42 +01:00
deee7612da Quantized version of mistral. (#1009)
* Quantized version of mistral.

* Integrate the quantized mistral variant.

* Use the quantized weight files.

* Tweak the quantization command.

* Fix the dtype when computing the rotary embeddings.

* Update the readme with the quantized version.

* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
06207332bc Streaming mode for reporting the generated tokens (#1007)
* Token streaming.

* Use the token output stream.

* Flush the output.

* Ensure that the last characters get reported.
2023-09-30 15:04:11 +01:00
4021272875 Use flash-attn for mistral. (#1004) 2023-09-30 12:15:10 +01:00
87e3a4e175 Mistral: exit on eos token. (#1001)
* Mistral: exit on eos token.

* Print the proper stats.

* Also add a short flag.
2023-09-30 07:07:06 +01:00
6203ced495 Add negative prompts to segment-anything. (#1000) 2023-09-30 06:17:42 +01:00
34842fb234 [segment-anything] Print IOU values to help with debugging (#999) 2023-09-30 05:44:42 +01:00
03348e2e6f Update mistral README.md (#995) 2023-09-29 12:24:32 +01:00
49fa184a35 Mistral readme (#994)
* Mistral: print the generated text.

* Add mistral to the readmes.
2023-09-29 11:50:50 +01:00
6f17ef82be Mistral: print the generated text. (#992) 2023-09-29 10:56:11 +01:00
ada8851a23 Add the mistral example. (#984)
* Add the mistral example.

* Use the two model files.

* Adjust the dtype.

* Tweak the weight paths.

* Remove the end of text token.

* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00
2dd43d6cdd add eos token to phi example (#965)
* add eos token to phi example

* rustfmt + get the token directly.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-26 09:21:22 +01:00
c78a294323 Add some repeat penalty to the phi example. (#961) 2023-09-25 20:53:30 +01:00
a36d883254 Use a single flag for the point argument. (#958) 2023-09-25 12:53:24 +01:00
7f2bbcf746 [segment-anything] Support multi-point as the prompt input (#945)
* [sam] Support multi-point prompts

* [segment-anything] Pass points by reference

* [segment-anything] Update example code and image

* Fix clippy lint.

---------

Co-authored-by: Yun Ding <yunding@nvidia.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-25 12:14:10 +01:00
1ce7fe2543 Add more examples to the phi readme. (#956) 2023-09-24 18:19:05 +01:00
f5069dd354 Use the repo for the quantized phi model. (#954) 2023-09-24 16:30:26 +01:00
0007ae9c11 Add the quantized mixformer model. (#953)
* Add the quantized mixformer model.

* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
bcb0ed8f1c Self-contained safetensors for the multiprocess llama example. (#950) 2023-09-24 06:54:49 +01:00
bb3471ea31 Adapt more examples to the updated safetensor api. (#947)
* Simplify the safetensor usage.

* Convert more examples.

* Move more examples.

* Adapt stable-diffusion.
2023-09-23 21:26:03 +01:00
890d069092 Self-contained safetensor wrappers (#946)
* Self-contained safetensor wrappers.

* Use the new safetensor container in varbuilders.
2023-09-23 20:39:52 +01:00
5dbe46b389 Add tracing. (#943) 2023-09-23 16:55:46 +01:00
b54acfa3d0 Tracing for the phi model (#936)
* Add some tracing bits to mixformers.

* Add the missing file.

* Add the conv2d layer to with-tracing.

* Improve the tracing usage.
2023-09-23 09:19:34 +01:00