mirror of https://github.com/huggingface/candle.git synced 2025-06-15 10:26:33 +00:00

Laurent Mazare cd29c7ccd4 More ggml cuda kernels (#1977)

* Add more cuda kernels for quantized matmul.

* Add the vec-dot bits.

* Expose the quantized matmul-vec kernels.

* Also include the quantize-q8-1 kernel.

* Glue code for the q8-1 quantization.

* mm-vec product via q8-1 quantization.

* Add a test.

* Add a mm test.

* Get the test to return some sensible results.

* Also test dmmv.

* Fix the launch params.

* Allow for tweaking the force_dmmv parameter while it's experimental.

2024-04-01 00:15:48 +02:00

| Name | Last commit | Date |
| --- | --- | --- |
| src | More ggml cuda kernels (#1977) | 2024-04-01 00:15:48 +02:00 |
| build.rs | Ensure that the kernels get rebuilt on cuh changes. (#1954) | 2024-03-28 06:56:48 +01:00 |
| Cargo.toml | Bump the crate versions to 0.4.2. (#1821) | 2024-03-08 22:01:51 +01:00 |
| README.md | Revert "Add the layer norm files. (#222)" (#223) | 2023-07-22 16:51:11 +01:00 |

README.md

candle-kernels

This crate contains the CUDA kernels used by candle. Some of these implementations come from the dfdx crate.