mirror of
https://github.com/huggingface/candle.git
synced 2025-06-17 11:08:52 +00:00

* Trying out a custom RmsNorm cuda kernel. * CPU implementation for rms-norm. * Cuda wrappers. * Add some validation. * Add some testing. * More testing.
candle-kernels
This crate contains CUDA kernels used from candle. Some of these implementations come from the dfdx crate.