candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-15 10:26:33 +00:00

Files

Eric Buehler e2b6b367fa Add some fast Metal MLX SDPA kernels (#2584)

* Add some fast Metal MLX SDPA kernels (#32)

* Sketch the sdpa kernel

* Add full sdpa kernel,

* Add test

* Add vectorized kernel for decoding

* Update tests

* Add some docs

* Fix sdpa_vector names

* Add softcapping for vectorized sdpa

* Add softcapping for full sdpa

* Add support for head dim 32, 96, 256

* Update docs

* Add update notice

* Clippy and format

* Conditional compilation for bf16

* Use it in quantized llama

* Some review comments

* Use set_params!

* Remove unused

* Remove feature

* Fix metal sdpa for v stride

* Remove comma

* Add the dim method to layout and shape.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>

2024-11-05 09:28:00 +01:00

batch_norm.rs

Bug fix: when converting a tensor to a variable, clone if the tensor is already a variable. (#2124)

2024-04-29 11:21:53 +02:00

group_norm.rs

Move the test-utils bits to a shared place. (#619)

2023-08-27 09:42:22 +01:00

kv_cache.rs

Add a RotatingKVCache. (#2493)

2024-09-23 13:14:32 +02:00

layer_norm.rs

Enable the new layer-norm. (#2213)