cd53c472df
Support ResNet 50/101/152. ( #1130 )
2023-10-19 10:48:31 +01:00
6f76383f38
Add a readme for the resnet example. ( #1129 )
2023-10-19 09:58:50 +01:00
8e773cc0c6
Experiment with resnet ( #1128 )
...
* Add some preliminary support for resnet.
* Add an actual resnet example.
2023-10-19 09:25:03 +01:00
87eb1658e1
Add pad_with_same. ( #1127 )
...
* More model cloning.
* More cloning on quantized models.
* Add pad-with-same.
* Add some tests.
2023-10-18 23:13:37 +01:00
902d0b9166
More model cloning. ( #1126 )
...
* More model cloning.
* More cloning on quantized models.
2023-10-18 21:55:46 +01:00
185b54a33b
Make some model cloneable. ( #1125 )
2023-10-18 19:30:47 +01:00
620c94d12e
Add support for Zephyr-7b in the quantized model. ( #1124 )
2023-10-18 17:31:26 +01:00
86e7d539d2
Add the quantized mpt model. ( #1123 )
...
* Add the quantized mpt model.
* Support the quantized model for replit-code.
2023-10-18 16:29:38 +01:00
cb034506cd
Remove the unused pragma in mpt. ( #1122 )
2023-10-18 15:47:50 +01:00
63c204c79e
Add a mention to the replit-code model in the readme. ( #1121 )
2023-10-18 11:27:23 +01:00
767a6578f1
MPT alibi fixes. ( #1120 )
...
* MPT alibi fixes.
* Some more fixes.
* Finally get the model to return some sensible outputs.
* Add a readme.
2023-10-18 10:58:05 +01:00
662c186fd5
Better error message when overflowing in narrow. ( #1119 )
2023-10-18 08:40:14 +01:00
2cd745a97c
MPT fixes. ( #1117 )
...
* MPT fixes.
* Another couple fixes.
* Another shape fix.
2023-10-17 21:53:31 +01:00
a72b50e2c0
Build alibi bias. ( #1115 )
...
* Build alibi bias.
* Apply the alibi attention bias.
* Add the replit-code example.
2023-10-17 20:41:37 +01:00
872c3f14b0
Add the MPT model. ( #1114 )
...
* Add the MPT model.
* Add ffn and block.
* Forward pass for the mpt block.
* Repeat-kv.
2023-10-17 16:06:48 +01:00
f9e93f5b69
Extend stub.py to accept external typehinting ( #1102 )
2023-10-17 11:07:26 +01:00
b355ab4e2e
Always broadcast magic methods ( #1101 )
2023-10-17 10:57:12 +01:00
2fe24ac5b1
Rework the cuda casting bits. ( #1112 )
2023-10-17 09:44:51 +01:00
00948eb656
Formatting tweak. ( #1111 )
2023-10-16 21:02:53 +01:00
af67672207
Add support for Puffin-Phi-v2. ( #1110 )
...
* Add support for Puffin-Phi-v2.
* Tweak the file name.
* Support the config for puffin-phi-v2.
* Update the readme.
2023-10-16 20:54:21 +01:00
6c588c4792
Refactor the pth tensor extraction. ( #1109 )
2023-10-16 18:16:34 +01:00
122da87580
feat: add pth varbuilder ( #1108 )
2023-10-16 16:20:36 +01:00
75629981bc
feat: parse Cuda compute cap from env ( #1066 )
...
* feat: add support for multiple compute caps
* Revert to one compute cap
* fmt
* fix
2023-10-16 15:37:38 +01:00
0106b0b04c
Read all the tensors in a PyTorch pth file. ( #1106 )
2023-10-16 13:50:07 +01:00
588ad4835a
Fix the verbose prompt for phi. ( #1097 )
2023-10-15 10:53:25 +01:00
b73c35cc57
Improve the reshape error messages. ( #1096 )
...
* Improve the reshape error messages.
* Add the verbose-prompt flag to the phi example.
2023-10-15 10:43:10 +01:00
8f310cc666
Avoid trying to backprop through non-differentiable layers. ( #1094 )
2023-10-14 22:03:41 +01:00
8921d5027c
Add support for phi-1.0 ( #1093 )
...
* Add support for phi-1.0
* Update the readme.
2023-10-14 20:15:43 +01:00
29c7f2565d
Add some reinforcement learning example. ( #1090 )
...
* Add some reinforcement learning example.
* Python initialization.
* Get the example to run.
* Vectorized gym envs for the atari wrappers.
* Get some simulation loop to run.
2023-10-14 16:46:43 +01:00
9309cfc47d
Create a new curand instead of reseeding. ( #1089 )
2023-10-14 10:03:59 +01:00
a193bf5f60
Another gemm update. ( #1088 )
2023-10-14 09:36:52 +01:00
2c110ac7d9
Add the pooling operators to the pyo3 layer. ( #1086 )
2023-10-13 20:18:10 +01:00
75989fc3b7
Use an attention mask in the e5 padding case. ( #1085 )
2023-10-13 18:53:40 +01:00
07af87a1d8
Typos. ( #1084 )
2023-10-13 16:21:20 +01:00
eefad2b95f
Update to gemm 0.16.1 ( #1083 )
2023-10-13 06:40:20 +01:00
5e6df4a3f7
Update to gemm-0.16. ( #1082 )
...
* Update to gemm-0.16.
* Enable wasm-simd128.
2023-10-12 21:56:59 +01:00
7473c4ceca
Fix the npy read function and add some testing. ( #1080 )
2023-10-12 15:25:05 +02:00
c096f02411
Add a matvec cpu benchmark. ( #1076 )
2023-10-12 09:29:18 +01:00
e7560443e4
Convmixer example ( #1074 )
...
* Add a convmixer based example.
* Mention the model in the readme.
2023-10-11 19:51:10 +01:00
89b525b5e7
Convmixer ( #1073 )
...
* Only optimize float tensors.
* Use full tensors for zeros and ones.
* Add a benchmark for the matmul slowness.
* Add the convmixer model.
* Proper adaptive pooling.
2023-10-11 18:24:32 +01:00
37dbbff261
Use full tensors for zeros and ones ( #1071 )
...
* Only optimize float tensors.
* Use full tensors for zeros and ones.
2023-10-11 08:16:04 +01:00
9fea56d28e
Only optimize float tensors. ( #1069 )
2023-10-10 09:05:41 +01:00
bc3351bce4
Tracing for StableLM and quantized StableLM. ( #1068 )
2023-10-10 08:09:25 +02:00
b34d7f0248
Remove some unused bits. ( #1067 )
2023-10-09 19:49:57 +01:00
4d04ac83c7
Override the repo for SDXL f16 vae weights. ( #1064 )
...
* Override the repo for SDXL f16 vae weights.
* Slightly simpler change.
2023-10-09 06:52:28 +01:00
392fe02fba
Move the common quantized-nn code to a shared module. ( #1063 )
2023-10-09 06:22:22 +01:00
59ab6d7832
Quantized version of StableLM. ( #1058 )
...
* Quantized version of StableLM.
* Adapt the stable-lm example to support quantized.
* Use some separate hub repo.
* Another repo name tweak.
2023-10-08 15:42:38 +01:00
783735cf22
Use softmax-last-dim where possible. ( #1057 )
2023-10-08 13:16:42 +01:00
9abeddd750
Make the cuda rng seedable. ( #1056 )
2023-10-08 09:32:36 +01:00
2e5fb0b251
Do not use the kv-cache on external key-value states. ( #1054 )
2023-10-07 22:37:19 +01:00