candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-17 02:58:50 +00:00

Author	SHA1	Message	Date
Laurent Mazare	6f0b807ffd	More efficient cuda implementation for ConvTranspose1d. (#2211) * More efficient cuda implementation for ConvTranspose1d. * Small tweak.	2024-05-24 11:05:43 +02:00
Laurent Mazare	8a05743a21	Add StorageRef. (#2113) * Add the storage-ref bits. * Add the metal implementation.	2024-04-23 13:23:27 +02:00
Laurent Mazare	53e5380bf6	Add a synchronize method to devices. (#2055) * Add a synchronize method to devices. * Metal version.	2024-04-14 16:32:55 +02:00
Laurent Mazare	e6a5b82ba6	Fix the matmul layout for accelerate & mkl. (#2011) * Fix the matmul layout for accelerate & mkl. * Reduce the required precision for pow (because of accelerate). * And a fix the gelu f16 test.	2024-04-04 19:18:03 +02:00
Laurent Mazare	08c049def3	Improve the handling of matmul with squeezed layouts. (#1998) * Improve the handling of matmul with squeezed layouts. * Fix for the cuda backend. * Revert the temporary fix.	2024-04-02 23:17:05 +02:00
Laurent Mazare	665da30487	Backend refactoring. (#1966) * Backend refactoring. * Metal tweaks. * Move the cudnn module.	2024-03-29 23:02:11 +01:00