0cb0bd1dfa
Add some metal gemm benchark. ( #2471 )
...
* Add some metal gemm benchark.
* More benchmarks.
2024-09-11 22:52:37 +02:00
5635650d38
Integrate the MLX gemm kernels ( #2468 )
...
* Include the MLX gemm kernels.
* Clippy lints.
* Export the gemm_f32 kernel.
* Add the f16/bf16 variants.
* Add the initial dispatch code.
* More plugging of the mlx kernels.
* Add a currently broken test.
* Tweaks.
* Bugfix + get the tests to pass.
* Enable the gemm bf16 tests.
* Add some randomized tests.
* Update candle-metal-kernels/src/lib.rs
Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com >
* More fixes.
* More clippy fixes.
---------
Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com >
2024-09-11 16:56:48 +02:00
6070278a31
Bump the version to 0.6.1. ( #2438 )
2024-08-22 09:23:52 +02:00
f65e90e7ef
Bump the crate version. ( #2248 )
2024-06-05 15:49:15 +02:00
89f53b9d7b
Bump the version number to 0.5.1. ( #2155 )
...
* Bump the version number to 0.5.1.
* Fix clippy lints for 1.78.
* More clippy fixes.
2024-05-03 11:17:05 +02:00
f76bb7794a
Bumping the version number to 0.5.0. ( #2009 )
2024-04-04 17:48:45 +02:00
e7fc1daa21
Bump the crate versions to 0.4.2. ( #1821 )
2024-03-08 22:01:51 +01:00
5e526abc8c
Bump the version number to 0.4.1. ( #1768 )
...
* Fix the block size for some cuda kernels.
* Bump the version number to 0.4.1.
2024-02-27 14:19:59 +01:00
a83ca2ece0
Bump the crate version to 0.4.0. ( #1658 )
2024-02-04 19:08:01 +01:00
402349d120
feat(bf16): add cast support + tests for cast + bin ops ( #1524 )
2024-01-11 15:49:13 +01:00
d35f0a1376
Bump the crate version to 0.3.3. ( #1490 )
2023-12-28 13:38:30 +01:00
9fc210fae8
Merge pull request #1318 from huggingface/metal4
...
Starting to fix some tests.
2023-12-20 15:37:31 +01:00
94817dac56
Bump the crate version to 0.3.2. ( #1452 )
2023-12-17 05:34:53 -06:00
931432ed55
Fixing tests + matmul from MFA
2023-12-13 16:58:36 +01:00
c66e5d4716
Fix comments.
2023-11-20 14:13:44 +01:00
bd3b243725
Update candle-metal-kernels/Cargo.toml
2023-11-20 14:12:57 +01:00
7cfffcac10
Debugging rope.
2023-11-20 14:12:57 +01:00
38de52bc4b
Fixed matmul (display still broken without casting back to CPU first? )
2023-11-20 14:12:57 +01:00
d46670f7c0
Tmp state.
2023-11-20 14:12:57 +01:00
39406a6721
Adding the actual backend
2023-11-20 14:12:56 +01:00