Commit Graph

26 Commits

SHA1 Message Date
4b3bd79fbd Remove the embedding ops in favor of index-select. (#299)
* Remove the embedding ops in favor of index-select.

* Also remove the cuda kernels.
2023-08-02 05:42:11 +01:00
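This commit folds the dedicated embedding op into index-select: an embedding lookup is nothing more than selecting rows of a weight matrix by token id, so one op can serve both. A minimal Python sketch of the idea (illustrative only, not candle's Rust API):

```python
def index_select(matrix, indices):
    """Select rows of `matrix` by index -- the core of an embedding lookup."""
    return [list(matrix[i]) for i in indices]

# A 4-token "vocabulary" with 3-dim embeddings.
weights = [
    [0.0, 0.1, 0.2],
    [1.0, 1.1, 1.2],
    [2.0, 2.1, 2.2],
    [3.0, 3.1, 3.2],
]

# Looking up token ids [2, 0, 2] is just index-select along dim 0.
embedded = index_select(weights, [2, 0, 2])
```

With the lookup expressed this way, the separate embedding kernels (including the CUDA ones) become redundant.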
3eb2bc6d07 Softmax numerical stability. (#267)
* Softmax numerical stability.

* Fix the flash-attn test.
2023-07-28 13:13:01 +01:00
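The stabilization here is the standard max-subtraction trick: shifting the logits by their maximum before exponentiating leaves the result unchanged mathematically but keeps `exp` from overflowing. A self-contained sketch:

```python
import math

def softmax(xs):
    # Subtracting the max keeps the largest exponent at 0, so exp()
    # never overflows; dividing by the sum is unaffected by the shift.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# With logits this large, a naive exp() would overflow to inf.
probs = softmax([1000.0, 1001.0, 1002.0])
```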
1a5416ec35 Rename exposed ops. 2023-07-26 12:43:19 +02:00
c97d51243c Add an abstract backprop op type (#240)
* Start adding the backprop op type.

* More backprop ops.

* Finish the backprop op.
2023-07-25 14:07:40 +01:00
fe87778223 Add the copy op. (#227)
* Add the copy op.

* Tweak some cat error messages.

* Handle the contiguous case in to_vec1.

* Fast variant for to_vec2.

* Add a faster to_vec3 variant.
2023-07-23 18:06:47 +01:00
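The `to_vec1`/`to_vec2` tweaks above hinge on one layout check: a contiguous view can be copied out in a single slice, while a strided view needs an element-by-element gather. A hedged sketch of that fast path (the function name and flat-storage model are illustrative):

```python
def to_vec1(data, offset, length, stride):
    """Copy a 1-D view out of flat storage: contiguous views are sliced
    in one step, strided views fall back to a gather loop."""
    if stride == 1:  # contiguous fast path
        return data[offset:offset + length]
    return [data[offset + i * stride] for i in range(length)]

flat = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
contig = to_vec1(flat, 0, 3, 1)   # one slice
strided = to_vec1(flat, 1, 2, 3)  # gathers data[1], data[4]
```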
52c5d8c087 Add the gather op. (#219)
* Start adding gather.

* Gather cpu implementation + use in simple training.

* Add scatter_add for the gradient of gather.

* Simple cpu implementation of scatter_add.

* Use gather in the simple-training backprop.
2023-07-22 07:21:28 +01:00
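Gather and scatter_add pair up exactly as the bullets describe: scatter_add is the gradient of gather, routing each output gradient back to the source slot it was read from, with repeated indices accumulating. A 1-D sketch:

```python
def gather(values, indices):
    """1-D gather: out[i] = values[indices[i]]."""
    return [values[i] for i in indices]

def scatter_add(grad_out, indices, size):
    """Gradient of gather: accumulate each output gradient into the
    source slot it was read from (repeated indices sum up)."""
    grad_in = [0.0] * size
    for g, i in zip(grad_out, indices):
        grad_in[i] += g
    return grad_in

vals = [10.0, 20.0, 30.0]
idx = [2, 0, 2]
out = gather(vals, idx)
grad = scatter_add([1.0, 1.0, 1.0], idx, len(vals))
```

Note how slot 2, read twice in the forward pass, receives a gradient of 2.0.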
27174a82aa Start adding index-add. 2023-07-21 20:12:48 +01:00
5cc843550d Add binary and ternary custom ops. (#217) 2023-07-21 17:29:50 +01:00
a6bcdfb269 Custom ops with a single argument (#214)
* Add the CustomOp1 trait.

* Add an example of custom op.

* Polish the custom op example.

* Add some backward pass test for custom ops.
2023-07-21 15:18:05 +01:00
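The shape of a single-argument custom op is a forward function paired with a backward (vector-Jacobian product) so autodiff can flow through it. A Python sketch of the pattern; the class and method names here are illustrative, not candle's actual `CustomOp1` trait signature:

```python
class CustomOp1:
    """Sketch of a single-argument custom op: forward plus an optional
    backward for the gradient. Names are illustrative only."""
    def forward(self, xs):
        raise NotImplementedError
    def backward(self, xs, grad_out):
        raise NotImplementedError

class Square(CustomOp1):
    def forward(self, xs):
        return [x * x for x in xs]
    def backward(self, xs, grad_out):
        # d(x^2)/dx = 2x, chained with the incoming gradient.
        return [2.0 * x * g for x, g in zip(xs, grad_out)]

op = Square()
ys = op.forward([1.0, 2.0, 3.0])
gx = op.backward([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

The backward-pass tests mentioned in the last bullet amount to checking `gx` against the analytic derivative, as above.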
410654525f Refactor the reduce ops in order to introduce argmin/argmax. (#212)
* Refactor the reduce ops in order to introduce argmin/argmax.

* Clippy fixes.

* Use the newly introduced argmax.

* Fix the strided case.

* Handle the non-contiguous case.
2023-07-21 11:41:08 +01:00
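The "strided" and "non-contiguous" fixes in this refactor come down to one point: a reduce like argmax must walk logical indices through the view's stride rather than assume adjacent storage. A sketch over flat storage (the function name and layout model are illustrative):

```python
def argmax_strided(data, offset, length, stride):
    """Argmax over the strided view data[offset + i*stride] for i in
    0..length. Walking via the stride handles non-contiguous layouts,
    e.g. a column of a row-major matrix."""
    best_i = 0
    best = data[offset]
    for i in range(1, length):
        x = data[offset + i * stride]
        if x > best:
            best, best_i = x, i
    return best_i

# Row-major 2x3 matrix stored flat; column 1 is the non-contiguous
# view [1.0, 9.0] with offset=1, stride=3.
flat = [0.0, 1.0, 2.0,
        3.0, 9.0, 5.0]
col1_argmax = argmax_strided(flat, offset=1, length=2, stride=3)
```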
4845d5cc64 More realistic training setup. (#210)
* More realistic training setup.

* Compute the model accuracy.

* Very inefficient backprop for index select.

* More backprop.

* Fix some backprop issues.

* Backprop fix.

* Another broadcasting backprop fix.

* Better backprop for reducing ops.

* Training again.

* Add some gradient tests.

* Get the training to work.
2023-07-20 18:25:41 +01:00
fa08fb3126 Add the index-select op. (#209)
* Add the index-select op.

* Cpu implementation of index-select.

* Add the cpu implementation for index-select.
2023-07-20 14:01:03 +01:00
2a8f28d687 Op refactor (#208)
* Add the binary and unary op enums to factorize some code.

* Bugfix.
2023-07-20 12:28:45 +01:00
e9c052bf94 Add the comparison operations. (#207)
* Add the comparison operations.

* Add the helper functions on the tensor side.

* More cmp operations.

* Cpu implementation for the comparison operations.
2023-07-20 09:40:31 +01:00
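Elementwise comparison ops in tensor libraries typically return a numeric 0/1 mask rather than Python-style booleans, so the result stays a tensor that can feed further arithmetic. A minimal 1-D sketch of two such ops:

```python
def eq(lhs, rhs):
    """Elementwise equality as a 0/1 mask."""
    return [1 if a == b else 0 for a, b in zip(lhs, rhs)]

def lt(lhs, rhs):
    """Elementwise less-than as a 0/1 mask."""
    return [1 if a < b else 0 for a, b in zip(lhs, rhs)]

mask = eq([1.0, 2.0, 3.0], [1.0, 0.0, 3.0])
```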
cb687b4897 Add some more developed training examples. (#199)
* Use contiguous tensors for variables.

* Sketch the mnist example.

* Start adding the reduce ops.

* Renaming.

* Refactor the reduce operations.

* Bugfix for the broadcasting vectorization.
2023-07-19 15:37:52 +01:00
3307db204a Mklize more unary ops. (#191)
* Mklize more unary ops.

* Even more unary ops.
2023-07-18 13:32:49 +01:00
ff61a42ad7 Use mkl to accelerate binary ops. (#190)
* Vectorized binary ops with mkl.

* Improve the binary op mkl support.

* Push the support for mkl binary ops.

* Proper vectorization of binary ops.

* Proper mkl'isation when broadcasting binary ops.
2023-07-18 12:04:39 +01:00
d73df74cb2 Preliminary support for mkl based gelu. (#187)
* Preliminary support for mkl based gelu.

* Add the vectorized function for unary ops.

* Get the mkl specialized gelu to work.
2023-07-18 07:48:48 +01:00
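GELU is a natural first target for mkl's vectorized math because its common tanh approximation is a chain of elementwise transcendentals. The formula being specialized, sketched in plain Python (the mkl version applies the same math over whole buffers at once):

```python
import math

def gelu(x):
    # tanh approximation of GELU; the exact form uses erf:
    #   gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```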
28e1c07304 Process unary functions per block (#180)
* Process unary functions per block.

* Add some inline hints.
2023-07-17 10:22:33 +01:00
270997a055 Add the elu op. (#113) 2023-07-09 21:56:31 +01:00
3aac1047fe Sketch the conv1d op. 2023-07-04 10:52:34 +01:00
8ad47907f3 Add the kernels. 2023-06-30 10:26:56 +01:00
c9c468e1aa Use Map2 for binary ops. 2023-06-29 10:09:15 +01:00
8ad03a5fb6 Use Map1 on unary ops. 2023-06-29 09:37:38 +01:00
1ce3843cab Add the relu op. 2023-06-28 09:38:54 +01:00
d7f729fb8f Refactor the hierarchy. 2023-06-27 11:57:27 +02:00