7 Commits

Author SHA1 Message Date
037b41c9dc Cuda conv transpose (#645)
* Cuda kernel for conv-transpose.

* Fix the cuda kernel.

* Fix the tests.
2023-08-28 20:58:49 +01:00
ca318a6ec7 Add to the cuda example a reproduction of the issue. (#579)
* Add to the cuda example a reproduction of the issue.

* Tweak.

* Add a test using non-square matrixes.

* Fix the conv2d kernel.

* Display the error.

* And tweak the comment.
2023-08-24 12:07:31 +01:00
c84883ecf2 Add a cuda kernel for upsampling. (#441)
* Add a cuda kernel for upsampling.

* Update for the latest tokenizers version.
2023-08-14 13:12:17 +01:00
a094dc503d Add a cuda kernel for avg-pool2d. (#440)
* Add a cuda kernel for avg-pool2d.

* Avoid running out of bounds.

* Finish wiring the avg pool kernel + add some testing.

* Support for max-pool + testing.
2023-08-14 12:32:05 +01:00
34f4b3187e Add a naive conv2d cuda kernel. (#438)
* Add a naive conv2d cuda kernel.

* Proper conv2d support on the rust side.

* Conv1d testing on gpu.

* Also use the test on gpus.

* Fix the clean-ptx target.
2023-08-14 10:34:42 +01:00
c187f347bf Make it easier to use whisper samples from the repo. (#112)
* Make it easier to use samples from the repo.

* Use f32 for accumulation in the f16/bf16 kernels.
2023-07-08 18:48:27 +01:00
eb64ad0d4d Cuda kernel for the conv1d op (#111)
* Boilerplate code for conv1d.

* Boilerplate code for conv1d.

* More boilerplate for conv1d.

* Conv1d work.

* Get the conv1d cuda kernel to work.

* Conv1d support when no batch dim.
2023-07-08 18:13:25 +01:00