mirror of https://github.com/huggingface/candle.git synced 2025-06-15 10:26:33 +00:00

Files

Laurent Mazare f7d5bf5b97 Faster kernels for quantized matmul on cuda (#2060)

* Hook the quantized matmul cuda kernels.

* Add a (currently broken) test.

* Kernel fixes.

* Fix by transposing the rhs matrix.

* Add the q4-1 kernels.

* Proper block sizes.

* More details in the tests.

2024-04-15 08:32:47 +02:00

benches

Add benchmarks for qmatmul operations (#2048)

2024-04-13 12:30:14 +02:00

examples

Move the tensor-tools binary to a separate crate. (#1969)

2024-03-30 15:49:37 +01:00

src

Faster kernels for quantized matmul on cuda (#2060)

2024-04-15 08:32:47 +02:00

tests

Add support for "sign" on tensors (#2012)

2024-04-04 22:32:47 +02:00

Cargo.toml

feat(bf16): add cast support + tests for cast + bin ops (#1524)

2024-01-11 15:49:13 +01:00

LICENSE

Refactor the hierarchy.

2023-06-27 11:57:27 +02:00

README.md

Refactor the hierarchy.

2023-06-27 11:57:27 +02:00

README.md

candle

Minimalist ML framework for Rust