candle/candle-core
Laurent Mazare f7d5bf5b97 Faster kernels for quantized matmul on cuda (#2060)
* Hook the quantized matmul cuda kernels.

* Add a (currently broken) test.

* Kernel fixes.

* Fix by transposing the rhs matrix.

* Add the q4-1 kernels.

* Proper block sizes.

* More details in the tests.
2024-04-15 08:32:47 +02:00

candle

Minimalist ML framework for Rust