5aa1a65dab
Add quantized Starling, fix open-chat prompt ( #1393 )
...
* Add quantized Starling, fix open-chat prompt
* Fix open-chat and starling prompts
2023-12-02 16:47:19 +00:00
2ca086939f
Put back affine strided tests
2023-11-30 11:40:39 +01:00
4349ff1fc2
Starting to fix some tests.
...
Few fixes.
Going back on remote metal-rs.
Reusing a single buffer (for now) to speed things up.
Adding some half kernels.
All tests are panicking instead of random failure.
Putting back f16 index select.
Add erf.
Working version for llama2-c.
Fixes + cache compute_pipeline_state.
BF16 metal fix.
Remove some prints.
new_owned -> new()..to_owned().
Better batched matmul.
Metal operational.
Reuse buffers on our own reference counts.
Tmp gemm.
Revert "Tmp gemm."
This reverts commit c65f68e988.
Interleave committing.
Speeding up copies using blit.
Fmt.
Fmt.
Remove the assert!
Fmt all.
Fixes after big rebase.
Add softmax for half and bfloat + tests
Fixing Llama example + accumulate softmax in float.
2023-11-30 11:30:31 +01:00
7c3cfd1086
Use the llama weight names for the Yi example. ( #1381 )
2023-11-27 20:42:52 +00:00
e2eb6590ed
Merge pull request #1323 from huggingface/metal3
...
Adding the test scaffolding.
2023-11-27 13:06:01 +01:00
481c45d78d
Add a basic implementation for slice-assign. ( #1377 )
2023-11-26 17:31:22 +00:00
14a2bdc062
Small tweak: remove the macro usage for the range indexing trait. ( #1376 )
2023-11-26 16:30:59 +00:00
bfa7c8fc01
Implement the module trait directly for QMatMul. ( #1372 )
2023-11-25 10:09:45 +00:00
762e996ce6
DistilBERT ( #1366 )
...
* add bce with logit loss
* add bce with logit loss
* remove imports
* fix tiny bug
* add test documentation and refactor function
* fix test cases and formatting
* distilbert files
* Apply various cleanups.
* More cleanups.
* More polish.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-24 15:09:14 +00:00
ca19a9af62
Fix linspace implementation ( #1358 )
...
* Fix linspace implementation
`steps` should be strictly greater than 1 to make it consistent with the context.
* Handle steps == 0 and steps == 1.
* Fix rustfmt.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-23 07:35:13 +00:00
ec23427d60
Ensure data is copied to the CPU before iterating. ( #1360 )
2023-11-23 07:24:25 +00:00
f83e14f68d
Add candle-lora transformers to readme? ( #1356 )
...
* Demonstrate lora transformers in readme
* Shorten readme
2023-11-21 17:54:24 +00:00
c7e613ab5e
Update the readme. ( #1354 )
2023-11-21 09:38:27 +00:00
8f63f68289
Fix the kalosm link ( #1353 )
2023-11-21 06:18:14 +01:00
1edc3ddf24
Allowing feature metal to compile.
2023-11-20 20:17:16 +01:00
b380657bfe
Merge pull request #1309 from huggingface/metal2
...
Adding the actual backend
2023-11-20 17:24:01 +01:00
60f624a902
Moving tests around.
2023-11-20 16:17:19 +01:00
8d6c6de8e0
Missing new test.
2023-11-20 14:38:35 +01:00
7ec345c2eb
Adding the test scaffolding.
2023-11-20 14:38:35 +01:00
671fc29b36
Fmt.
2023-11-20 14:38:20 +01:00
dc64adb8e4
Fixing cos_f16 test.
2023-11-20 14:17:07 +01:00
c66e5d4716
Fix comments.
2023-11-20 14:13:44 +01:00
bd3b243725
Update candle-metal-kernels/Cargo.toml
2023-11-20 14:12:57 +01:00
2813fb5dbc
Cleanup: fixed a few ops, removed debugging scaffolding.
2023-11-20 14:12:57 +01:00
7cfffcac10
Debugging rope.
2023-11-20 14:12:57 +01:00
38de52bc4b
Fixed matmul (display still broken without casting back to CPU first?)
2023-11-20 14:12:57 +01:00
d46670f7c0
Tmp state.
2023-11-20 14:12:57 +01:00
f710fab02e
Fixing the kernels + launches to make them faster.
...
Cool work by @ivarflakstad
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-11-20 14:12:57 +01:00
f82bf2d915
Adding indexing.
...
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-11-20 14:12:57 +01:00
df6814f34e
Refactor to simplify our lives for setting the params in the encoder.
2023-11-20 14:12:57 +01:00
39406a6721
Adding the actual backend
2023-11-20 14:12:56 +01:00
976ad9f9c2
Remove tracing.
2023-11-20 14:12:29 +01:00
a4c4a56429
Metal part 1 - Scaffolding for metal.
2023-11-20 14:12:05 +01:00
f49bf6a81d
Fix OpenChat 3.5 tokenizer ( #1347 )
2023-11-19 18:48:04 +00:00
992a788da1
Add OpenChat 3.5 to quantized examples ( #1346 )
...
* Add OpenChat to quantized examples
* Add chat prompt
* Make the openchat example more in line with the other models.
* Fix a typo.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-19 18:28:52 +00:00
8d8f48c60c
feat: add test for individual onnx ops ( #1332 )
...
* feat: add test for individual onnx ops
* fix: prefer consts when possible
* feat: add move op tests
2023-11-19 08:17:09 +01:00
d31f11035f
Support for CumSum in ONNX models. ( #1340 )
2023-11-17 22:03:40 +00:00
9ab3f9729f
Use the whisper-v3 tokenizer now that it has been added. ( #1337 )
...
* Use the whisper-v3 tokenizer now that it has been added.
* Use the appropriate nospeech token.
2023-11-16 22:10:31 +00:00
a1f41ab37b
feat: adds reset_kv_cache ( #1335 )
2023-11-16 21:17:42 +00:00
92a05b51cf
fix: address clippy 0.1.74 issues ( #1336 )
...
- clippy::needless-borrows-for-generic-args
- clippy::reserve-after-initialization
2023-11-16 21:15:22 +00:00
c6763e3b41
Add a simple implementation of cumsum. ( #1334 )
...
* Add a simple implementation of cumsum.
* Add another test.
2023-11-15 21:11:15 +00:00
347e31c9ff
Add the tril/triu/eye ops. ( #1333 )
...
* Add tril/triu/eye.
* Revert the metal crate tweak.
2023-11-15 20:34:37 +00:00
f4fcf60900
Update readme.md ( #1322 )
...
Updating the readme to coincide with other examples. If you try to run it as previously written, you will get a "cannot find the path specified" error.
2023-11-12 09:46:19 +00:00
12561b31d3
Fix pose estimation image path ( #1326 )
2023-11-12 09:45:26 +00:00
a209ce8ceb
Update for 0.3.1. ( #1324 )
2023-11-11 18:48:52 +00:00
f1e678b39c
Mention the Yi-6b/Yi-34b models in the readme. ( #1321 )
2023-11-11 12:39:11 +01:00
a007f8fdb4
Add the Yi-6b and Yi-34b models. ( #1320 )
...
* Add the Yi-6b model.
* Add the 34b model.
* Add the yi example.
* Fix the weight file names.
2023-11-11 12:00:48 +01:00
2341aa079e
Fix quantized zephyr chat prompt ( #1314 ) ( #1317 )
...
* Fix quantized zephyr chat prompt (#1314 )
* Avoid using a mutable variable.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-11-11 09:14:12 +01:00
9e666d4229
Add the var method. ( #1315 )
...
* Add the var method.
* Add a test.
2023-11-10 22:47:57 +01:00
1b12142a02
Add min to buckets in relative_position_bucket ( #1312 )
2023-11-10 11:57:25 +01:00