mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Laurent Mazare f7d5bf5b97 Faster kernels for quantized matmul on cuda (#2060)

* Hook the quantized matmul cuda kernels.

* Add a (currently broken) test.

* Kernel fixes.

* Fix by transposing the rhs matrix.

* Add the q4-1 kernels.

* Proper block sizes.

* More details in the tests.

2024-04-15 08:32:47 +02:00

src

Faster kernels for quantized matmul on cuda (#2060)

2024-04-15 08:32:47 +02:00

build.rs

Ensure that the kernels get rebuilt on cuh changes. (#1954)

2024-03-28 06:56:48 +01:00

Cargo.toml

Bumping the version number to 0.5.0. (#2009)

2024-04-04 17:48:45 +02:00

README.md

Revert "Add the layer norm files. (#222)" (#223)

2023-07-22 16:51:11 +01:00

README.md

candle-kernels

This crate contains the CUDA kernels used by candle. Some of these implementations come from the dfdx crate.
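
The `build.rs` entry in the listing mentions that the kernels get rebuilt when `.cuh` headers change. As a rough illustration of how a Cargo build script can request that, here is a minimal sketch. It is not the crate's actual `build.rs`: the helper names `is_cuda_file` and `rerun_directives` and the file paths are hypothetical. The only real mechanism it relies on is Cargo's `cargo:rerun-if-changed=PATH` build-script directive.

```rust
use std::path::Path;

/// Hypothetical helper: true for CUDA sources (`.cu`) and headers (`.cuh`).
fn is_cuda_file(path: &Path) -> bool {
    matches!(
        path.extension().and_then(|ext| ext.to_str()),
        Some("cu") | Some("cuh")
    )
}

/// Hypothetical helper: builds one `cargo:rerun-if-changed=` directive
/// for each CUDA file among `paths`, skipping everything else.
fn rerun_directives<'a>(paths: impl IntoIterator<Item = &'a str>) -> Vec<String> {
    paths
        .into_iter()
        .filter(|p| is_cuda_file(Path::new(*p)))
        .map(|p| format!("cargo:rerun-if-changed={}", p))
        .collect()
}
```

In a build script, each returned line would be printed to standard output with `println!` from `main`, which tells Cargo to rerun the script whenever one of the listed files changes.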