064ba17bd7
Remove print.
2023-12-18 11:04:16 +01:00
e8ee253ee0
Missing cast.
2023-12-18 11:01:18 +01:00
8bd3d6b94b
Index add.
2023-12-18 10:46:01 +01:00
6a3ca7da0c
Scatter add.
2023-12-18 10:32:22 +01:00
586b6f6fff
Adding gather op.
2023-12-17 23:34:12 +01:00
e4b0cc59f5
Adding CMP
2023-12-17 22:32:25 +01:00
0a6e0a8c9a
Implement randn (CPU-> device)
2023-12-17 19:09:08 +01:00
972903021c
Finish reduce kernels.
2023-12-17 19:07:00 +01:00
6bc92e63cb
Addressing a lot of comments.
2023-12-15 13:06:04 +01:00
aa04015098
Remove unwrap().
2023-12-15 12:23:28 +01:00
8b5059e951
Remove test file.
2023-12-15 11:55:30 +01:00
26540641c1
Renamed all kernel names.
2023-12-15 11:24:47 +01:00
34d83377f6
Better error message on older macos
2023-12-15 11:18:54 +01:00
77197379cc
More cleanup.
2023-12-15 11:17:05 +01:00
916a8c5464
Revert candle-transformers.
2023-12-15 11:15:21 +01:00
243e83f2b9
Adding a bunch of docs !
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-12-15 11:03:05 +01:00
cf27868b57
More cleanup.
2023-12-15 01:44:22 +01:00
40c3e1bd5a
cleanup.
2023-12-15 01:41:14 +01:00
ece4c69a68
Fixing softmax.
2023-12-15 01:35:08 +01:00
4eeaf205d6
Fix softmax for long sequences (missing barrier).
2023-12-14 19:37:03 +01:00
f419a38e1a
Fix use resource.
2023-12-14 16:52:37 +01:00
361f2ad2af
Working with merging encoders and using fences.
2023-12-14 16:05:33 +01:00
931432ed55
Fixing tests + matmul from MFA
2023-12-13 16:58:36 +01:00
0404a3eb5b
Removed MPSMatrix entirely (buggy).
2023-12-13 16:21:48 +01:00
a9d0657432
Better version ?
2023-12-13 12:09:20 +01:00
87dc559817
Lots of updates including some stack of command buffers.
2023-12-12 17:41:56 +01:00
da0af3cb3e
Merge pull request #1408 from jbochi/metal_gelu2
Fix NaN errors for Gelu in Metal
2023-12-09 19:46:36 +01:00
803ac8405b
Put back affine strided tests
Co-Authored-By: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-12-06 17:04:15 +01:00
6e25822d4f
Fix gelu for large x
2023-12-06 09:59:44 -05:00
2ca086939f
Put back affine strided tests
2023-11-30 11:40:39 +01:00
4349ff1fc2
Starting to fix some tests.
Few fixes.
Going back on remote metal-rs.
Reusing a single buffer (for now) to speed things up.
Adding some half kernels.
All tests are panicking instead of random failure.
Putting back f16 index select.
Add erf.
Working version for llama2-c.
Fixes + cache compute_pipeline_state.
BF16 metal fix.
Remove some prints.
new_owned -> new()..to_owned().
Better batched matmul.
Metal operational.
Reuse buffers on our own reference counts.
Tmp gemm.
Revert "Tmp gemm."
This reverts commit c65f68e988.
Interleave committing.
Speeding up copies using blit.
Fmt.
Fmt.
Remove the assert!
Fmt all.
Fixes after big rebase.
Add softmax for half and bfloat + tests
Fixing Llama example + accumulate softmax in float.
2023-11-30 11:30:31 +01:00
7c3cfd1086
Use the llama weight names for the Yi example. (#1381)
2023-11-27 20:42:52 +00:00
e2eb6590ed
Merge pull request #1323 from huggingface/metal3
Adding the test scaffolding.
2023-11-27 13:06:01 +01:00
481c45d78d
Add a basic implementation for slice-assign. (#1377)
2023-11-26 17:31:22 +00:00
14a2bdc062
Small tweak: remove the macro usage for the range indexing trait. (#1376)
2023-11-26 16:30:59 +00:00
bfa7c8fc01
Implement the module trait directly for QMatMul. (#1372)
2023-11-25 10:09:45 +00:00
762e996ce6
Distilbert (#1366)
* add bce with logit loss
* add bce with logit loss
* remove imports
* fix tiny bug
* add test documentation and refactor function
* fix test cases and formatting
* distilbert files
* Apply various cleanups.
* More cleanups.
* More polish.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-24 15:09:14 +00:00
ca19a9af62
Fix linspace implementation (#1358)
* Fix linspace implementation
`steps` should be strictly greater than 1 to make it consistent with the context.
* Handle steps == 0 and steps == 1.
* Fix rustfmt.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-23 07:35:13 +00:00
ec23427d60
Ensure to copy data to cpu before iterating. (#1360)
2023-11-23 07:24:25 +00:00
f83e14f68d
Add candle-lora transformers to readme? (#1356)
* Demonstrate lora transformers in readme
* Shorten readme
2023-11-21 17:54:24 +00:00
c7e613ab5e
Update the readme. (#1354)
2023-11-21 09:38:27 +00:00
8f63f68289
Fix the kalosm link (#1353)
2023-11-21 06:18:14 +01:00
1edc3ddf24
Allowing feature metal to compile.
2023-11-20 20:17:16 +01:00
b380657bfe
Merge pull request #1309 from huggingface/metal2
Adding the actual backend
2023-11-20 17:24:01 +01:00
60f624a902
Moving tests around.
2023-11-20 16:17:19 +01:00
8d6c6de8e0
Missing new test.
2023-11-20 14:38:35 +01:00
7ec345c2eb
Adding the test scaffolding.
2023-11-20 14:38:35 +01:00
671fc29b36
Fmt.
2023-11-20 14:38:20 +01:00
dc64adb8e4
Fixing cos_f16 test.
2023-11-20 14:17:07 +01:00
c66e5d4716
Fix comments.
2023-11-20 14:13:44 +01:00