mirror of https://github.com/huggingface/candle.git synced 2025-06-16 02:38:10 +00:00

Eric Buehler e2b6b367fa Add some fast Metal MLX SDPA kernels (#2584)

* Add some fast Metal MLX SDPA kernels (#32)

* Sketch the sdpa kernel

* Add full sdpa kernel,

* Add test

* Add vectorized kernel for decoding

* Update tests

* Add some docs

* Fix sdpa_vector names

* Add softcapping for vectorized sdpa

* Add softcapping for full sdpa

* Add support for head dim 32, 96, 256

* Update docs

* Add update notice

* Clippy and format

* Conditional compilation for bf16

* Use it in quantized llama

* Some review comments

* Use set_params!

* Remove unused

* Remove feature

* Fix metal sdpa for v stride

* Remove comma

* Add the dim method to layout and shape.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>

2024-11-05 09:28:00 +01:00

src

Add some fast Metal MLX SDPA kernels (#2584)

2024-11-05 09:28:00 +01:00

tests

Soft Non-Maximum Suppression (#2400)

2024-08-10 07:57:52 +02:00

Cargo.toml

Metavoice - first cut (#1717)

2024-03-02 18:50:01 +01:00

README.md

Add some missing readme files. (#304)

2023-08-02 10:57:12 +01:00

README.md

candle-transformers