Fast kernels for rotary embeddings. (#1928)

* Fast kernels for rotary embeddings.

* Add a test for the fast CPU kernel.

* Rope cuda bindings.

* Cuda kernel.

* Metal kernel (part 1).

* Cuda kernels.

* Finish the metal kernel.

* Use the new kernels in the quantized example.

* Fix warning.
Laurent Mazare
2024-03-24 22:48:52 +01:00
committed by GitHub
parent cf7d7fcf2f
commit 1b98f84a2b
8 changed files with 375 additions and 26 deletions

@@ -12,6 +12,7 @@ pub mod loss;
 pub mod ops;
 pub mod optim;
 pub mod rnn;
+pub mod rotary_emb;
 pub mod sequential;
 pub mod var_builder;
 pub mod var_map;
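
For readers unfamiliar with what a rotary embedding kernel computes, here is a minimal scalar sketch in plain Rust. It is not the code added by this commit: the function names `rope_tables` and `rope_interleaved_ref` are hypothetical, and it assumes the interleaved layout, where consecutive elements `(x[2i], x[2i+1])` form a pair that is rotated by a position-dependent angle. The fast CPU, CUDA and Metal kernels may be organized quite differently.

```rust
/// Hypothetical helper (not part of the commit): builds cos/sin tables of
/// shape (t, d / 2), stored row-major, using angle = pos * base^(-2i / d).
fn rope_tables(t: usize, d: usize, base: f32) -> (Vec<f32>, Vec<f32>) {
    let half = d / 2;
    let mut cos = Vec::with_capacity(t * half);
    let mut sin = Vec::with_capacity(t * half);
    for pos in 0..t {
        for i in 0..half {
            let theta = base.powf(-2.0 * i as f32 / d as f32);
            let angle = pos as f32 * theta;
            cos.push(angle.cos());
            sin.push(angle.sin());
        }
    }
    (cos, sin)
}

/// Hypothetical scalar reference for the interleaved variant.
/// `x` holds t * d values (one row per position); `cos` and `sin` hold
/// t * d / 2 values each. Each pair (x[2i], x[2i + 1]) is rotated in 2D.
fn rope_interleaved_ref(x: &[f32], cos: &[f32], sin: &[f32], t: usize, d: usize) -> Vec<f32> {
    assert_eq!(d % 2, 0, "the last dimension must be even");
    let half = d / 2;
    assert_eq!(x.len(), t * d);
    assert_eq!(cos.len(), t * half);
    assert_eq!(sin.len(), t * half);
    let mut out = vec![0f32; x.len()];
    for pos in 0..t {
        for i in 0..half {
            let c = cos[pos * half + i];
            let s = sin[pos * half + i];
            let a = x[pos * d + 2 * i];
            let b = x[pos * d + 2 * i + 1];
            // Standard 2D rotation of the pair (a, b) by the table angle.
            out[pos * d + 2 * i] = a * c - b * s;
            out[pos * d + 2 * i + 1] = a * s + b * c;
        }
    }
    out
}
```

A reference like this is mainly useful for checking a fast kernel against a slow but obviously correct implementation. At position 0 every angle is zero, so the input should come back unchanged, and each rotation preserves the norm of its pair.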