Metal Unary: Add benchmarks and process kernels in a tile based fashion (#2056)

* add basic unary bench for sqrt

* process unary commands in tiles of 4

* re-enable all benchmarks

* rename helper to unary

* modify approach to split up tiled and non-tiled operations

* undo bench ignore for other tests

* update tile size to 2

* only perform the optimization in the contiguous case with an even number of elements
Thomas Santerre
2024-04-20 18:10:33 -04:00
committed by GitHub
parent 587ee3bb6f
commit 0067fe00a8
6 changed files with 380 additions and 184 deletions
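
The tiling described in the commit message can be illustrated with a minimal CPU-side sketch. This is not the Metal kernel from the commit: the function names (`unary_sqrt`, `unary_sqrt_tiled`, `unary_sqrt_plain`) and the `contiguous` flag are hypothetical, and each loop iteration merely stands in for one GPU thread. It assumes, as the last two bullets suggest, a tile size of 2 and a tiled path that is taken only for contiguous inputs with an even element count.

```rust
/// Tile width; the commit message settles on 2 (assumed here).
const TILE_SIZE: usize = 2;

/// Non-tiled fallback: each iteration models one thread handling one element.
fn unary_sqrt_plain(input: &[f32], output: &mut [f32]) {
    for (o, x) in output.iter_mut().zip(input) {
        *o = x.sqrt();
    }
}

/// Tiled variant: each iteration models one thread handling
/// TILE_SIZE adjacent elements. The caller guarantees that the
/// length is a multiple of TILE_SIZE, so no remainder is left over.
fn unary_sqrt_tiled(input: &[f32], output: &mut [f32]) {
    let tiles = output
        .chunks_exact_mut(TILE_SIZE)
        .zip(input.chunks_exact(TILE_SIZE));
    for (o, x) in tiles {
        for i in 0..TILE_SIZE {
            o[i] = x[i].sqrt();
        }
    }
}

/// Hypothetical dispatcher: use the tiled path only when the input is
/// contiguous and its element count is a multiple of the tile size.
fn unary_sqrt(input: &[f32], contiguous: bool) -> Vec<f32> {
    let mut output = vec![0.0f32; input.len()];
    if contiguous && input.len() % TILE_SIZE == 0 {
        unary_sqrt_tiled(input, &mut output);
    } else {
        unary_sqrt_plain(input, &mut output);
    }
    output
}
```

Restricting the tiled path to even element counts avoids a bounds check for a trailing partial tile; odd-length or strided inputs simply fall back to the one-element-per-thread path.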


@@ -3,6 +3,7 @@ pub(crate) mod conv_transpose2d;
pub(crate) mod matmul;
pub(crate) mod qmatmul;
pub(crate) mod random;
+pub(crate) mod unary;
pub(crate) mod where_cond;
use candle_core::{Device, Result};