54e7fc3c97
Lint fixes introduced with Rust 1.83 ( #2646 )
...
* Fixes for lint errors introduced with Rust 1.83
* rustfmt
* Fix more lints.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-11-28 23:00:21 +01:00
3159f91b90
20241118 docs ( #2629 )
...
* module docs
* varbuilder gguf docs
* add a link to gguf files
* small additional mod doc titles
* safetensor docs
* more core docs
* more module docs in candle_core
* 2 more link fixes
2024-11-19 04:07:07 +01:00
36cf54525d
Fix the fast bf16 gemm cublas kernels. ( #2274 )
...
* Use flash-attn in gemma.
* Fix for the fast bf16 cublas gemm.
* Fix some clippy lints.
* Fix another lint.
* Proper clippy fix.
2024-06-18 23:46:58 +02:00
6f0b807ffd
More efficient cuda implementation for ConvTranspose1d. ( #2211 )
...
* More efficient cuda implementation for ConvTranspose1d.
* Small tweak.
2024-05-24 11:05:43 +02:00
8a05743a21
Add StorageRef. ( #2113 )
...
* Add the storage-ref bits.
* Add the metal implementation.
2024-04-23 13:23:27 +02:00
53e5380bf6
Add a synchronize method to devices. ( #2055 )
...
* Add a synchronize method to devices.
* Metal version.
2024-04-14 16:32:55 +02:00
e6a5b82ba6
Fix the matmul layout for accelerate & mkl. ( #2011 )
...
* Fix the matmul layout for accelerate & mkl.
* Reduce the required precision for pow (because of accelerate).
* And a fix for the gelu f16 test.
2024-04-04 19:18:03 +02:00
08c049def3
Improve the handling of matmul with squeezed layouts. ( #1998 )
...
* Improve the handling of matmul with squeezed layouts.
* Fix for the cuda backend.
* Revert the temporary fix.
2024-04-02 23:17:05 +02:00
665da30487
Backend refactoring. ( #1966 )
...
* Backend refactoring.
* Metal tweaks.
* Move the cudnn module.
2024-03-29 23:02:11 +01:00