Commit Graph

5 Commits

Author SHA1 Message Date
871efc0307 Bugfix for the conv2d cpu kernel. (#820) 2023-09-11 23:11:27 +01:00
98d1242b8f im2col based conv2d (#802)
* im2col implementation for conv2d.

* Fix for the im2col implementation to match the current conv2d.

* Small optimization.

* Add a cuda kernel.

* Handle arbitrary layouts.

* Im2Col cuda code.
2023-09-10 21:02:42 +01:00
4f18180fc7 Bugfix so that im2col produce the same results as conv2d. (#801) 2023-09-10 16:59:46 +01:00
559944146f Add an im2col based benchmark. (#800)
* Add an im2col based benchmark.

* Reshape the final result.
2023-09-10 16:56:28 +01:00
1c9e5394a5 Add a custom softmax implementation. (#744)
* Add a custom softmax implementation.

* Add softmaxlastdim to the benchmarks.

* And add a test.

* Support more dtypes.

* Polish the code.

* Use the slow implementation on cuda.

* Add a todo for the cuda kernel.
2023-09-05 14:20:23 +01:00