a11af79e23
Add a quantized blip model. ( #1155 )
* Add a quantized blip model.
* Integrate the quantized blip model to the actual example.
2023-10-22 20:33:25 +01:00
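For reference, the quantized variants added throughout this series store weights as small integers plus scale factors instead of f32. A minimal plain-Rust sketch of the idea, using whole-tensor symmetric 8-bit quantization; the actual models use block-based formats (e.g. ggml-style 4-bit blocks), so this is only an illustration:

```rust
// Minimal sketch of symmetric 8-bit quantization: store i8 values plus one
// f32 scale per tensor. Real formats work on small blocks and pack narrower
// values, but the round trip is the same idea.
fn quantize(xs: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = xs.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = if max_abs == 0. { 1. } else { max_abs / 127. };
    let qs = xs.iter().map(|&x| (x / scale).round() as i8).collect();
    (qs, scale)
}

fn dequantize(qs: &[i8], scale: f32) -> Vec<f32> {
    qs.iter().map(|&q| q as f32 * scale).collect()
}

fn main() {
    let ws = [0.1f32, -0.8, 0.42, 0.0];
    let (qs, scale) = quantize(&ws);
    let back = dequantize(&qs, scale);
    println!("{ws:?} -> {qs:?} (scale {scale}) -> {back:?}");
}
```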
df2f89b6cf
Add some KV cache to blip. ( #1150 )
* Add some KV cache to blip.
* Mention BLIP in the readme.
2023-10-22 09:44:48 +01:00
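The KV cache added here follows the usual autoregressive-decoding pattern: keys and values computed for earlier positions are kept around, so each step only runs the projections for the newest token. A minimal plain-Rust sketch of the bookkeeping (the model itself concatenates tensors along the sequence dimension rather than using vectors):

```rust
// Minimal sketch of a key/value cache for autoregressive decoding: keep the
// keys and values from previous steps and append the new position's entries.
struct KvCache {
    keys: Vec<Vec<f32>>,   // one key vector per cached position
    values: Vec<Vec<f32>>, // one value vector per cached position
}

impl KvCache {
    fn new() -> Self {
        Self { keys: vec![], values: vec![] }
    }

    // Append the current step's key/value and return views over the full
    // cache, which is what attention then consumes.
    fn append(&mut self, k: Vec<f32>, v: Vec<f32>) -> (&[Vec<f32>], &[Vec<f32>]) {
        self.keys.push(k);
        self.values.push(v);
        (&self.keys, &self.values)
    }

    // Resetting is needed when starting a new sequence (cf. the later commit
    // exposing a clear function on mixformers).
    fn clear(&mut self) {
        self.keys.clear();
        self.values.clear();
    }
}

fn main() {
    let mut cache = KvCache::new();
    for step in 0..3 {
        let (ks, _vs) = cache.append(vec![step as f32], vec![step as f32]);
        println!("step {step}: attending over {} cached positions", ks.len());
    }
    cache.clear();
}
```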
5b32c2a41e
Remove the unused pragma and properly apply the bias. ( #1147 )
2023-10-22 06:47:40 +01:00
3115fe42e4
Blip attention mask + readme ( #1146 )
* Add the attention mask to the blip model.
* Add a readme.
2023-10-21 22:44:13 +01:00
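The attention mask for the text decoder is the usual causal mask: adding negative infinity to the scores of future positions before the softmax drives their attention weights to zero. A minimal sketch, assuming a plain f32 score matrix:

```rust
// Minimal sketch of a causal attention mask: each position can only attend
// to itself and earlier positions, enforced by adding f32::NEG_INFINITY to
// the masked scores before the softmax.
fn causal_mask(seq_len: usize) -> Vec<Vec<f32>> {
    (0..seq_len)
        .map(|i| {
            (0..seq_len)
                .map(|j| if j <= i { 0.0 } else { f32::NEG_INFINITY })
                .collect()
        })
        .collect()
}

fn main() {
    for row in causal_mask(4) {
        println!("{row:?}");
    }
}
```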
2531b13bf8
Blip fixes ( #1145 )
* Some fixes for the blip example.
* Stop generating on sep tokens.
* Clippy fixes.
* rustfmt.
2023-10-21 21:34:48 +01:00
0d9bb4eb18
Add the blip example. ( #1144 )
* Add the blip example.
* Tweak the example.
* Implement the cross-attn logic.
* Fix some shape mismatches.
* Get some logits out.
* Get some caption to be generated.
2023-10-21 20:05:02 +01:00
94e3373883
Blip forward pass ( #1141 )
* More forward methods for the blip model.
* Blipping continues.
2023-10-21 10:19:23 +01:00
34d9e91748
Add the blip image captioning model ( #1140 )
* Blip text model.
* Blip vision bits.
* Blippity.
* More blip.
2023-10-20 22:09:11 +01:00
55351ef57d
Add some vision transformer models ( #1132 )
* Start adding vision-transformers.
* Add self-attn.
* More vision transformers.
* vit-vit.
* Add the actual vit model.
* Add the example code for the vision transformers.
2023-10-19 22:24:18 +01:00
cd53c472df
Support ResNet 50/101/152. ( #1130 )
2023-10-19 10:48:31 +01:00
8e773cc0c6
Experiment with resnet ( #1128 )
* Add some preliminary support for resnet.
* Add an actual resnet example.
2023-10-19 09:25:03 +01:00
902d0b9166
More model cloning. ( #1126 )
* More model cloning.
* More cloning on quantized models.
2023-10-18 21:55:46 +01:00
185b54a33b
Make some models cloneable. ( #1125 )
2023-10-18 19:30:47 +01:00
86e7d539d2
Add the quantized mpt model. ( #1123 )
* Add the quantized mpt model.
* Support the quantized model for replit-code.
2023-10-18 16:29:38 +01:00
cb034506cd
Remove the unused pragma in mpt. ( #1122 )
2023-10-18 15:47:50 +01:00
767a6578f1
MPT alibi fixes. ( #1120 )
* MPT alibi fixes.
* Some more fixes.
* Finally get the model to return some sensible outputs.
* Add a readme.
2023-10-18 10:58:05 +01:00
2cd745a97c
MPT fixes. ( #1117 )
* MPT fixes.
* Another couple fixes.
* Another shape fix.
2023-10-17 21:53:31 +01:00
a72b50e2c0
Build alibi bias. ( #1115 )
* Build alibi bias.
* Apply the alibi attention bias.
* Add the replit-code example.
2023-10-17 20:41:37 +01:00
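ALiBi replaces positional embeddings with a per-head linear bias on the attention scores: keys further from the query are penalized in proportion to the distance, with a slope that differs per head. A minimal sketch following the slope schedule from the ALiBi paper for a power-of-two head count (an illustration of the technique, not the exact MPT code):

```rust
// Minimal sketch of ALiBi ("attention with linear biases") as used in MPT.
// Slopes follow the geometric schedule from the ALiBi paper: for n heads,
// head h gets slope 2^(-8h/n).
fn alibi_slopes(num_heads: usize) -> Vec<f32> {
    let start = 2f32.powf(-8.0 / num_heads as f32);
    (0..num_heads).map(|h| start.powi(h as i32 + 1)).collect()
}

// Bias added to the attention scores: slope * (key_pos - query_pos), which
// is negative for past positions, so more distant keys are penalized
// linearly (future positions are removed by the causal mask anyway).
fn alibi_bias(slope: f32, seq_len: usize) -> Vec<Vec<f32>> {
    (0..seq_len)
        .map(|i| (0..seq_len).map(|j| slope * (j as f32 - i as f32)).collect())
        .collect()
}

fn main() {
    let slopes = alibi_slopes(8);
    println!("slopes: {slopes:?}");
    for row in alibi_bias(slopes[0], 4) {
        println!("{row:?}");
    }
}
```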
872c3f14b0
Add the MPT model. ( #1114 )
* Add the MPT model.
* Add ffn and block.
* Forward pass for the mpt block.
* Repeat-kv.
2023-10-17 16:06:48 +01:00
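The repeat-kv step comes from multi-query / grouped-query attention: the model computes fewer key/value heads than query heads, so each kv head is repeated to line up with the queries before the attention product. A minimal sketch with one vector standing in for each head:

```rust
// Minimal sketch of repeat-kv: each of the (fewer) kv heads is repeated
// n_rep times so the key/value tensors match the number of query heads.
fn repeat_kv(kv_heads: &[Vec<f32>], n_rep: usize) -> Vec<Vec<f32>> {
    kv_heads
        .iter()
        .flat_map(|head| std::iter::repeat(head.clone()).take(n_rep))
        .collect()
}

fn main() {
    // Two kv heads expanded to match eight query heads (n_rep = 4).
    let kv = vec![vec![1.0f32], vec![2.0]];
    let expanded = repeat_kv(&kv, 4);
    println!("{} heads: {expanded:?}", expanded.len());
}
```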
af67672207
Add support for Puffin-Phi-v2. ( #1110 )
* Add support for Puffin-Phi-v2.
* Tweak the file name.
* Support the config for puffin-phi-v2.
* Update the readme.
2023-10-16 20:54:21 +01:00
89b525b5e7
Convmixer ( #1073 )
* Only optimize float tensors.
* Use full tensors for zeros and ones.
* Add a benchmark for the matmul slowness.
* Add the convmixer model.
* Proper adaptive pooling.
2023-10-11 18:24:32 +01:00
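Adaptive pooling maps any input size to a fixed output size by deriving each output cell's window boundaries from the input/output length ratio. A minimal 1D sketch using the usual floor/ceil boundary rule (convmixer itself uses the 2D analogue):

```rust
// Minimal sketch of 1D adaptive average pooling: output cell i averages the
// input window [floor(i*n/out), ceil((i+1)*n/out)), so any input length n
// maps to exactly `out_len` outputs.
fn adaptive_avg_pool1d(xs: &[f32], out_len: usize) -> Vec<f32> {
    let n = xs.len();
    (0..out_len)
        .map(|i| {
            let start = i * n / out_len;
            let end = ((i + 1) * n + out_len - 1) / out_len; // ceil division
            let window = &xs[start..end];
            window.iter().sum::<f32>() / window.len() as f32
        })
        .collect()
}

fn main() {
    let xs: Vec<f32> = (0..10).map(|i| i as f32).collect();
    println!("{:?}", adaptive_avg_pool1d(&xs, 4));
}
```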
bc3351bce4
Tracing for StableLM and quantized StableLM. ( #1068 )
2023-10-10 08:09:25 +02:00
392fe02fba
Move the common quantized-nn code to a shared module. ( #1063 )
2023-10-09 06:22:22 +01:00
59ab6d7832
Quantized version of StableLM. ( #1058 )
* Quantized version of StableLM.
* Adapt the stable-lm example to support the quantized version.
* Use some separate hub repo.
* Another repo name tweak.
2023-10-08 15:42:38 +01:00
783735cf22
Use softmax-last-dim where possible. ( #1057 )
2023-10-08 13:16:42 +01:00
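Attention applies softmax along the last (key) dimension, so a fused softmax-last-dim op can skip the generic reduction path. For reference, a plain-Rust sketch of the numerically stable computation such an op performs:

```rust
// Minimal sketch of a numerically stable softmax over the last dimension:
// subtract the row max before exponentiating, then normalize each row.
fn softmax_last_dim(rows: &mut [Vec<f32>]) {
    for row in rows.iter_mut() {
        let max = row.iter().fold(f32::NEG_INFINITY, |m, &x| m.max(x));
        let mut sum = 0f32;
        for x in row.iter_mut() {
            *x = (*x - max).exp();
            sum += *x;
        }
        for x in row.iter_mut() {
            *x /= sum;
        }
    }
}

fn main() {
    let mut scores = vec![vec![1.0f32, 2.0, 3.0], vec![0.0, 0.0, 0.0]];
    softmax_last_dim(&mut scores);
    println!("{scores:?}");
}
```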
2e5fb0b251
Do not use the kv-cache on external key-value states. ( #1054 )
2023-10-07 22:37:19 +01:00
823fe23f9b
Add flash-attn support for stable-lm. ( #1052 )
2023-10-07 21:12:54 +01:00
aa53368aeb
Better control on the optional dequantization in QMatMul ( #1049 )
* Cosmetic change to the quantized whisper model.
* Fix the dequantization.
* Add the dequantize-all variable.
2023-10-07 10:16:18 +01:00
d5f7267087
Add the stable-lm example. ( #1046 )
* Add the stable-lm example.
* Get stable-lm to generate some proper text.
2023-10-06 19:20:35 +01:00
b0442eff8a
Sketch the stable-lm model. ( #1045 )
2023-10-06 18:19:06 +01:00
4631c48273
Remove some todos. ( #1042 )
2023-10-05 22:42:20 +01:00
f47bd9bab5
Delete invalid comment ( #1038 )
2023-10-05 19:28:08 +01:00
089fc3b584
Improve the quantized whisper setup. ( #1018 )
* Improve the quantized whisper setup.
* Fix the config file paths.
* Use the standard matmul where possible.
2023-10-02 17:17:46 +01:00
e04c789230
Add a quantized variant of whisper ( #1017 )
* Add the quantized-whisper model.
* Quantized the whisper model.
* Adapt the whisper example to handle quantization.
* Add the quantized flag.
* Load the proper weights.
2023-10-02 14:59:53 +01:00
096dee7073
Bump the version to 0.3.0. ( #1014 )
* Bump the version to 0.3.0.
* Changelog update.
2023-10-01 13:51:57 +01:00
deee7612da
Quantized version of mistral. ( #1009 )
* Quantized version of mistral.
* Integrate the quantized mistral variant.
* Use the quantized weight files.
* Tweak the quantization command.
* Fix the dtype when computing the rotary embeddings.
* Update the readme with the quantized version.
* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
4021272875
Use flash-attn for mistral. ( #1004 )
2023-09-30 12:15:10 +01:00
6203ced495
Add negative prompts to segment-anything. ( #1000 )
2023-09-30 06:17:42 +01:00
d188d6a764
Fix the multiple points case for sam. ( #998 )
2023-09-29 22:39:43 +02:00
53510ce427
Use a silu activation in mistral. ( #991 )
2023-09-29 07:06:54 +01:00
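SiLU (also called swish) is simply x times sigmoid(x); a minimal sketch:

```rust
// Minimal sketch of the SiLU activation: silu(x) = x * sigmoid(x).
fn silu(x: f32) -> f32 {
    x * (1.0 / (1.0 + (-x).exp()))
}

fn main() {
    for x in [-2.0f32, 0.0, 2.0] {
        println!("silu({x}) = {}", silu(x));
    }
}
```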
23b3576c47
Add the sliding window. ( #986 )
2023-09-28 17:26:33 +01:00
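Mistral's sliding-window attention restricts each position to the most recent `window` positions, bounding both the attention cost and the KV-cache size. A minimal sketch of the boolean mask (an illustration of the pattern, not the exact mistral code):

```rust
// Minimal sketch of a sliding-window causal mask: position i may only
// attend to positions j with i - window < j <= i.
fn sliding_window_mask(seq_len: usize, window: usize) -> Vec<Vec<bool>> {
    (0..seq_len)
        .map(|i| (0..seq_len).map(|j| j <= i && i - j < window).collect())
        .collect()
}

fn main() {
    for row in sliding_window_mask(6, 3) {
        let s: String = row.iter().map(|&b| if b { '1' } else { '.' }).collect();
        println!("{s}");
    }
}
```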
716ab2ccdc
Mistral gpu fix ( #985 )
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
* Fix when running on the gpu.
* More gpu fixes.
2023-09-28 16:38:13 +01:00
ada8851a23
Add the mistral example. ( #984 )
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00
c05a348e36
Add the Mistral 7b model ( #983 )
* Start sketching the mistral 7b model.
* Add the kv cache.
* Add the decoder layer.
* Add the mistral model.
* Rotary embeddings.
* Add the attention mask.
2023-09-28 14:29:41 +01:00
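Rotary embeddings (RoPE) encode positions by rotating pairs of query/key dimensions through position-dependent angles, so the query-key dot product depends only on the relative distance between positions. A minimal sketch using interleaved pairs; some implementations instead pair the first and second halves of the dimension, but the rotation is the same:

```rust
// Minimal sketch of rotary position embeddings: dimension pair i is rotated
// by angle pos * theta^(-2i/dim), giving lower frequencies to later pairs.
fn apply_rope(x: &mut [f32], pos: usize, theta: f32) {
    let dim = x.len();
    for i in 0..dim / 2 {
        let freq = theta.powf(-2.0 * i as f32 / dim as f32);
        let angle = pos as f32 * freq;
        let (sin, cos) = angle.sin_cos();
        let (x0, x1) = (x[2 * i], x[2 * i + 1]);
        x[2 * i] = x0 * cos - x1 * sin;
        x[2 * i + 1] = x0 * sin + x1 * cos;
    }
}

fn main() {
    let mut q = vec![1.0f32, 0.0, 1.0, 0.0];
    apply_rope(&mut q, 3, 10000.0); // theta = 10000 is the usual base
    println!("{q:?}");
}
```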
ce0a4e3a85
Use the gelu-erf activation. ( #969 )
2023-09-26 22:30:21 +01:00
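The gelu-erf activation is the exact form gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))), as opposed to the common tanh approximation. A minimal sketch; since Rust's standard library has no erf, this uses the Abramowitz & Stegun 7.1.26 polynomial approximation:

```rust
// erf via the Abramowitz & Stegun 7.1.26 polynomial approximation
// (max absolute error around 1.5e-7, plenty for an illustration).
fn erf(x: f32) -> f32 {
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let y = t
        * (0.254829592
            + t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - y * (-x * x).exp())
}

// The exact ("erf") form of GELU, as opposed to the tanh approximation.
fn gelu_erf(x: f32) -> f32 {
    0.5 * x * (1.0 + erf(x / 2f32.sqrt()))
}

fn main() {
    for x in [-1.0f32, 0.0, 1.0] {
        println!("gelu_erf({x}) = {}", gelu_erf(x));
    }
}
```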
1fcac4afed
Expose a function to clear the KV cache on mixformers. ( #964 )
2023-09-26 05:41:07 +01:00
a36d883254
Use a single flag for the point argument. ( #958 )
2023-09-25 12:53:24 +01:00
7f2bbcf746
[segment-anything] Support multi-point as the prompt input ( #945 )
* [sam] Support multi-point prompts
* [segment-anything] Pass points by reference
* [segment-anything] Update example code and image
* Fix clippy lint.
---------
Co-authored-by: Yun Ding <yunding@nvidia.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-25 12:14:10 +01:00
0007ae9c11
Add the quantized mixformer model. ( #953 )
* Add the quantized mixformer model.
* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
e15862cfdb
Shared the quantized var-builder code. ( #952 )
* Shared the quantized var-builder code.
* Fix compilation.
2023-09-24 12:55:07 +01:00