4c967b9184
Use the hub files for the marian example. ( #1220 )
...
* Use the hub files for the marian example.
* Use the secondary decoder.
* Add a readme.
* More readme.
2023-10-30 17:29:36 +00:00
969960847a
Bugfixes for marian-mt. ( #1219 )
...
* Bugfixes for marian-mt.
* Apply the final decoding head.
* More fixes.
2023-10-30 11:44:19 +00:00
174b208052
PyO3: Better shape handling ( #1143 )
...
* Negative and `*args` shape handling
* Rename to `PyShapeWithHole` + validate that only one hole exists
* Regenerate stubs
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2023-10-29 15:41:44 +00:00
7bbde55c61
Marian MT model ( #1210 )
...
* Skeleton files for the marian MT model.
* Marian initialization.
* Implement the attention forward method.
* Forward pass for the encoder side.
* Expose the encoder and decoder.
* Start plugging the decoder.
* Forward pass for the decoder layer.
* Set up the marian example.
* Add some missing backtraces.
* Bugfix.
2023-10-29 15:12:22 +00:00
55bc3382cf
Allow for different behavior between training and eval ( #1213 )
...
* Forward with training.
* Do not use dropout on vgg evaluation.
2023-10-29 07:53:09 +01:00
dece37c6f4
feat: implement VGG13, VGG16 and VGG19 ( #1211 )
...
* feat: implement VGG13, VGG16 and VGG19
* Cosmetic fixes.
* More cosmetic tweaks + avoid re-loading the weights on each final layer.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-10-29 06:10:23 +00:00
498c50348c
Add DDPG and fix Gym wrapper ( #1207 )
...
* Fix Gym wrapper
- It was returning things in the wrong order
- Gym now differentiates between terminated and truncated
* Add DDPG
* Apply fixes
* Remove Result annotations
* Also remove Vec annotation
* rustfmt
* Various small improvements (avoid cloning, mutability, get clippy to pass, ...)
---------
Co-authored-by: Travis Hammond <travis.hammond@alexanderthamm.com>
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-10-28 19:53:34 +01:00
012ae0090e
Infer the config for llama2-c. ( #1208 )
2023-10-28 19:00:39 +01:00
95a857cf57
Move the llama2-c model in transformers. ( #1205 )
2023-10-28 16:51:19 +01:00
b3181455d5
Add fuse-conv-bn method for Conv2d ( #1196 )
...
* Add fuse-conv-bn method for Conv2d
* no unwrap
* run rustfmt and clippy
2023-10-27 15:56:50 +01:00
e2826e70b3
Add a quantized variant of llama2.c ( #1197 )
...
* Add a quantized variant of llama2.c
* Clippy fixes.
2023-10-27 15:34:06 +01:00
70d06ab4b0
Add support for the phi-hermes finetuned model. ( #1192 )
2023-10-27 05:57:08 +01:00
0ec5ebcec4
Use the hub model file when possible. ( #1190 )
...
* Use the hub model file when possible.
* And add a mention in the main readme.
2023-10-26 20:00:50 +01:00
5f20697918
Add the jina-bert embeddings model. ( #1187 )
...
* Add the jina-bert model.
* Use alibi.
* Remove the unused pragma.
* Recompute the alibi embeddings.
* Generate the token type ids.
* Use the module trait.
* Add the jina-bert example.
* DType fix.
* Get the inference to work.
2023-10-26 16:54:36 +01:00
25c3cc4149
Mention the flash-attention restriction in the readme. ( #1158 )
2023-10-23 10:26:56 +01:00
a11af79e23
Add a quantized blip model. ( #1155 )
...
* Add a quantized blip model.
* Integrate the quantized blip model to the actual example.
2023-10-22 20:33:25 +01:00
8a82d623e5
Handle LongStorage in pytorch checkpoints. ( #1152 )
2023-10-22 18:34:36 +01:00
df2f89b6cf
Add some KV cache to blip. ( #1150 )
...
* Add some KV cache to blip.
* Mention BLIP in the readme.
2023-10-22 09:44:48 +01:00
3115fe42e4
Blip attention mask + readme ( #1146 )
...
* Add the attention mask to the blip model.
* Add a readme.
2023-10-21 22:44:13 +01:00
2531b13bf8
Blip fixes ( #1145 )
...
* Some fixes for the blip example.
* Stop generating on sep tokens.
* Clippy fixes.
* rustfmt.
2023-10-21 21:34:48 +01:00
0d9bb4eb18
Add the blip example. ( #1144 )
...
* Add the blip example.
* Tweak the example.
* Implement the cross-attn logic.
* Fix some shape mismatches.
* Get some logits out.
* Get some caption to be generated.
2023-10-21 20:05:02 +01:00
7366aeac21
Make func cloneable. ( #1137 )
2023-10-20 16:28:50 +01:00
31ca4897bb
Readme updates. ( #1134 )
2023-10-20 09:08:39 +01:00
55351ef57d
Add some vision transformers models ( #1132 )
...
* Start adding vision-transformers.
* Add self-attn.
* More vision transformers.
* vit-vit.
* Add the actual vit model.
* Add the example code for the vision transformers.
2023-10-19 22:24:18 +01:00
93c25e8844
Expose the larger resnets (50/101/152) in the example. ( #1131 )
2023-10-19 13:48:28 +01:00
6f76383f38
Add a readme for the resnet example. ( #1129 )
2023-10-19 09:58:50 +01:00
8e773cc0c6
Experiment with resnet ( #1128 )
...
* Add some preliminary support for resnet.
* Add an actual resnet example.
2023-10-19 09:25:03 +01:00
620c94d12e
Add support for Zephyr-7b in the quantized model. ( #1124 )
2023-10-18 17:31:26 +01:00
86e7d539d2
Add the quantized mpt model. ( #1123 )
...
* Add the quantized mpt model.
* Support the quantized model for replit-code.
2023-10-18 16:29:38 +01:00
63c204c79e
Add a mention to the replit-code model in the readme. ( #1121 )
2023-10-18 11:27:23 +01:00
767a6578f1
MPT alibi fixes. ( #1120 )
...
* MPT alibi fixes.
* Some more fixes.
* Finally get the model to return some sensible outputs.
* Add a readme.
2023-10-18 10:58:05 +01:00
2cd745a97c
MPT fixes. ( #1117 )
...
* MPT fixes.
* Another couple fixes.
* Another shape fix.
2023-10-17 21:53:31 +01:00
a72b50e2c0
Build alibi bias. ( #1115 )
...
* Build alibi bias.
* Apply the alibi attention bias.
* Add the replit-code example.
2023-10-17 20:41:37 +01:00
00948eb656
Formatting tweak. ( #1111 )
2023-10-16 21:02:53 +01:00
af67672207
Add support for Puffin-Phi-v2. ( #1110 )
...
* Add support for Puffin-Phi-v2.
* Tweak the file name.
* Support the config for puffin-phi-v2.
* Update the readme.
2023-10-16 20:54:21 +01:00
588ad4835a
Fix the verbose prompt for phi. ( #1097 )
2023-10-15 10:53:25 +01:00
b73c35cc57
Improve the reshape error messages. ( #1096 )
...
* Improve the reshape error messages.
* Add the verbose-prompt flag to the phi example.
2023-10-15 10:43:10 +01:00
8921d5027c
Add support for phi-1.0 ( #1093 )
...
* Add support for phi-1.0
* Update the readme.
2023-10-14 20:15:43 +01:00
29c7f2565d
Add some reinforcement learning example. ( #1090 )
...
* Add some reinforcement learning example.
* Python initialization.
* Get the example to run.
* Vectorized gym envs for the atari wrappers.
* Get some simulation loop to run.
2023-10-14 16:46:43 +01:00
e7560443e4
Convmixer example ( #1074 )
...
* Add a convmixer based example.
* Mention the model in the readme.
2023-10-11 19:51:10 +01:00
b34d7f0248
Remove some unused bits. ( #1067 )
2023-10-09 19:49:57 +01:00
4d04ac83c7
Override the repo for SDXL f16 vae weights. ( #1064 )
...
* Override the repo for SDXL f16 vae weights.
* Slightly simpler change.
2023-10-09 06:52:28 +01:00
59ab6d7832
Quantized version of StableLM. ( #1058 )
...
* Quantized version of StableLM.
* Adapt the stable-lm example to support quantized.
* Use some separate hub repo.
* Another repo name tweak.
2023-10-08 15:42:38 +01:00
2e5fb0b251
Do not use the kv-cache on external key-value states. ( #1054 )
2023-10-07 22:37:19 +01:00
823fe23f9b
Add flash-attn support for stable-lm. ( #1052 )
2023-10-07 21:12:54 +01:00
d833527fda
Use candle_nn::LSTM in encodec. ( #1051 )
...
* Use candle_nn::LSTM in encodec.
* More Encodec implementation.
* Decoder implementation.
2023-10-07 19:43:06 +01:00
955e00b2e8
Add to the readmes for stable-lm. ( #1047 )
2023-10-06 21:26:04 +01:00
d5f7267087
Add the stable-lm example. ( #1046 )
...
* Add the stable-lm example.
* Get stable-lm to generate some proper text.
2023-10-06 19:20:35 +01:00
4631c48273
Remove some todos. ( #1042 )
2023-10-05 22:42:20 +01:00
716883e9b0
Add the clamping for stable-diffusion. ( #1041 )
2023-10-05 22:20:39 +01:00