Laurent Mazare f7d5bf5b97 Faster kernels for quantized matmul on cuda (#2060)
* Hook the quantized matmul cuda kernels.

* Add a (currently broken) test.

* Kernel fixes.

* Fix by transposing the rhs matrix.

* Add the q4-1 kernels.

* Proper block sizes.

* More details in the tests.
2024-04-15 08:32:47 +02:00

candle-kernels

This crate contains the CUDA kernels used by candle. Some of these implementations come from the dfdx crate.