c18a856e76
Add the rounding operators. (#1030)
* Add the rounding operators.
* Avoid tracking gradients for the rounding operations.
* Add some rounding tests.
2023-10-04 17:58:44 +01:00
d7e48234d4
Add an erf based gelu op (#900)
* Erf based gelu.
* Add the erf backed gelu.
* Test the new gelu op (which is not gelu_new).
2023-09-19 19:54:28 +01:00
9a5c7db91a
Add support for i64 (#563)
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
c950a5c6b1
Cuda support for the mnist training. (#277)
* Cuda support for the mnist training.
* min/max fix + testing.
* Add the argmin/argmax tests.
* More cuda support for argmin/argmax.
* Cuda kernels for argmin and argmax.
2023-07-29 19:48:04 +01:00
536c5e702e
Cuda kernels for fast min/max reductions (#203)
* Add the min/max cuda kernels.
* Better integration of the cuda kernels.
2023-07-19 18:12:27 +01:00
ec79fc43f2
Add the bf16 cuda kernels.
2023-06-29 23:12:02 +01:00
d7f729fb8f
Refactor the hierarchy.
2023-06-27 11:57:27 +02:00