3071ea6c3e
Use the new hub helper function. (#1484)
2023-12-26 09:44:30 +01:00
37c539f2b7
Helper function to load sharded safetensors files. (#1481)
* Fix the quantized mistral example.
* Add a helper function to load sharded safetensors weights.
* Use the sharded loader.
2023-12-25 21:49:21 +01:00
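Note: a minimal sketch of the sharded loading flow from #1481, assuming the shard paths were already resolved from the hub's model.safetensors.index.json; it goes through candle-nn's VarBuilder rather than the exact helper name added in the PR.

```rust
use candle_core::{DType, Device, Result};
use candle_nn::VarBuilder;
use std::path::PathBuf;

// Build a single VarBuilder over several safetensors shards so the model
// code can look up weights without caring which file they live in.
fn load_sharded(paths: &[PathBuf]) -> Result<VarBuilder<'static>> {
    // Memory-mapping is unsafe because the files must not change underneath us.
    unsafe { VarBuilder::from_mmaped_safetensors(paths, DType::F32, &Device::Cpu) }
}
```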
eae3a20d43
Merge pull request #1479 from huggingface/upsample_metal
Adding upsample_nearest_2d.
2023-12-25 14:25:53 +01:00
13a5d15ebc
Adding upsample_nearest_2d.
2023-12-25 14:25:19 +01:00
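Note: a quick sketch of the op these two commits bring to Metal; the NCHW layout and sizes below are illustrative.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // (batch, channels, height, width) input, upsampled 2x with nearest-neighbor.
    let t = Tensor::arange(0f32, 16., &Device::Cpu)?.reshape((1, 1, 4, 4))?;
    let up = t.upsample_nearest2d(8, 8)?;
    assert_eq!(up.dims(), &[1, 1, 8, 8]);
    Ok(())
}
```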
1505d85276
Merge pull request #1461 from huggingface/metal-conv
Adding the convolutions (1d + 2d) to candle on Metal.
2023-12-25 12:48:09 +01:00
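Note: the newly ported convolutions keep candle's existing CPU/CUDA signatures; a minimal sketch (shapes and the Metal device index are illustrative):

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Swap in Device::new_metal(0)? on macOS to exercise the new Metal kernels.
    let dev = Device::Cpu;
    let x = Tensor::randn(0f32, 1., (1, 3, 16, 16), &dev)?; // NCHW input
    let w = Tensor::randn(0f32, 1., (8, 3, 3, 3), &dev)?; // OIHW kernel
    // padding 1, stride 1, dilation 1, groups 1 keeps the spatial size.
    let y = x.conv2d(&w, 1, 1, 1, 1)?;
    assert_eq!(y.dims(), &[1, 8, 16, 16]);
    Ok(())
}
```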
95e18ef675
Fixing matmul for convolutions.
2023-12-25 12:29:34 +01:00
7135791dd5
Fix the quantized mistral example. (#1478)
2023-12-25 09:31:24 +01:00
88589d8815
Support mistral instruct v0.2. (#1475)
* Support mistral instruct v0.2.
* Use the safetensors model now that they are available.
2023-12-23 16:18:49 +01:00
5b35fd0fcf
MMLU evaluation for Phi. (#1474)
* MMLU evaluation for Phi.
* Improve the evaluation.
2023-12-23 15:28:36 +01:00
ba1fae590e
Validate the kernel size in pooling ops. (#1473)
* Validate the kernel size in pooling ops.
* Revert the changes to basics.
2023-12-23 11:19:22 +01:00
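Note: with the validation from #1473, pooling calls like the sketch below fail early with a clear error when the kernel does not fit, rather than producing a bogus shape; sizes here are illustrative.

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let t = Tensor::zeros((1, 1, 4, 4), DType::F32, &Device::Cpu)?;
    // 2x2 kernel; the stride defaults to the kernel size.
    let pooled = t.max_pool2d(2)?;
    assert_eq!(pooled.dims(), &[1, 1, 2, 2]);
    Ok(())
}
```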
78d982e1bd
Fix for mamba 2.8b. (#1472)
2023-12-23 11:01:39 +01:00
d8b9a727fc
Support different mamba models. (#1471)
2023-12-23 10:46:02 +01:00
ceb78d3e28
Sketch the minimal mamba example. (#1465)
* Sketch the minimal mamba example.
* Fix rustfmt.
* Forward pass for mamba.
* Finish the forward pass.
* Inference fixes.
* Bugfixes.
* More fixes.
* Add a readme.
2023-12-22 00:28:50 +01:00
f6408a3779
feat: add clear_kv_cache to mistral and qmistral models (#1464)
2023-12-21 21:19:19 +01:00
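Note: a compile-only sketch of where the new entry point fits; model construction and the generation loop are elided, and the simple `&mut self` reset signature is an assumption based on the commit title.

```rust
use candle_transformers::models::mistral::Model;

// Reset the attention KV cache before feeding an unrelated prompt; without
// this, cached keys/values from the previous generation leak into the next.
fn reset_between_prompts(model: &mut Model) {
    model.clear_kv_cache();
}
```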
10d94659c3
Adding the convolutions (1d + 2d) to candle on Metal.
2023-12-21 10:39:24 +01:00
563a79afa1
Make fn name generic. (#1459)
Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
2023-12-21 02:16:31 +01:00
8ede5f4210
Add fn config_chat_ml. (#1458)
* Add fn config_chat_ml.
* Add a link to the original config.
---------
Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-12-20 21:03:24 +01:00
9fc210fae8
Merge pull request #1318 from huggingface/metal4
Starting to fix some tests.
2023-12-20 15:37:31 +01:00
9b5e4843a6
Optimizing decode matmul (Phi at 28 tok/s on M3).
Adding some benchmarks to help check matmul performance.
2023-12-20 09:54:19 +01:00
03641293ee
Clippy pass.
2023-12-18 15:22:43 +01:00
064ba17bd7
Remove print.
2023-12-18 11:04:16 +01:00
e8ee253ee0
Missing cast.
2023-12-18 11:01:18 +01:00
8bd3d6b94b
Index add.
2023-12-18 10:46:01 +01:00
6a3ca7da0c
Scatter add.
2023-12-18 10:32:22 +01:00
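Note: for reference, the CPU semantics these Metal kernels mirror; a small scatter_add sketch with made-up values.

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let acc = Tensor::zeros((3,), DType::F32, &dev)?;
    let ids = Tensor::new(&[0u32, 2, 0], &dev)?;
    let src = Tensor::new(&[1f32, 2., 3.], &dev)?;
    // out[ids[i]] += src[i] along dim 0: indices 0, 2, 0 -> [4., 0., 2.].
    let out = acc.scatter_add(&ids, &src, 0)?;
    assert_eq!(out.to_vec1::<f32>()?, vec![4., 0., 2.]);
    Ok(())
}
```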
96f1a28e39
Add a simple full method. (#1455)
* Add a simple implementation of the full method.
* Add the docstring.
2023-12-17 20:15:57 -05:00
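Note: the full method from #1455 is the usual constant-tensor constructor; a minimal sketch.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // full(value, shape, device): every element set to the given scalar.
    let t = Tensor::full(3.5f32, (2, 3), &Device::Cpu)?;
    assert_eq!(t.to_vec2::<f32>()?, vec![vec![3.5; 3]; 2]);
    Ok(())
}
```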
586b6f6fff
Adding gather op.
2023-12-17 23:34:12 +01:00
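Note: the gather semantics the new Metal kernel implements, shown on CPU with illustrative values.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let t = Tensor::new(&[[1f32, 2.], [3., 4.]], &dev)?;
    let ids = Tensor::new(&[[0u32], [1]], &dev)?;
    // Along dim 1: row 0 picks column 0, row 1 picks column 1.
    let picked = t.gather(&ids, 1)?;
    assert_eq!(picked.to_vec2::<f32>()?, vec![vec![1.], vec![4.]]);
    Ok(())
}
```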
e4b0cc59f5
Adding CMP.
2023-12-17 22:32:25 +01:00
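Note: the comparison ops being ported; on any backend they produce a u8 mask tensor, as in this CPU sketch.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let a = Tensor::new(&[1f32, 2., 3.], &Device::Cpu)?;
    let b = Tensor::new(&[3f32, 2., 1.], &Device::Cpu)?;
    // Element-wise comparison: 1 where a <= b, 0 elsewhere.
    let mask = a.le(&b)?;
    assert_eq!(mask.to_vec1::<u8>()?, vec![1, 1, 0]);
    Ok(())
}
```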
0a6e0a8c9a
Implement randn (CPU -> device).
2023-12-17 19:09:08 +01:00
972903021c
Finish reduce kernels.
2023-12-17 19:07:00 +01:00
94817dac56
Bump the crate version to 0.3.2. (#1452)
2023-12-17 05:34:53 -06:00
1e86717bf2
Fix a couple of typos. (#1451)
* Mixtral quantized instruct.
* Fix a couple typos.
2023-12-17 05:20:05 -06:00
c630622a07
Expose AdamW parameters. (#1449)
* Expose AdamW parameters
* Use reference
2023-12-16 18:41:56 -06:00
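Note: the exposed knobs correspond to candle-nn's ParamsAdamW struct; a minimal sketch with illustrative values on a toy loss.

```rust
use candle_core::{DType, Device, Var};
use candle_nn::{AdamW, Optimizer, ParamsAdamW};

fn main() -> candle_core::Result<()> {
    let w = Var::zeros((2, 2), DType::F32, &Device::Cpu)?;
    // Override a couple of hyper-parameters, keep the rest at their defaults.
    let params = ParamsAdamW { lr: 1e-3, weight_decay: 0.05, ..Default::default() };
    let mut opt = AdamW::new(vec![w.clone()], params)?;
    let loss = w.as_tensor().sqr()?.sum_all()?;
    opt.backward_step(&loss)?; // one optimization step on the toy loss
    Ok(())
}
```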
c4cfcf1539
Tweak the readme for Phi and the default sample length. (#1450)
2023-12-16 18:11:36 -06:00
1782e93de6
Mixtral quantized instruct. (#1447)
2023-12-16 16:16:39 -06:00
cfdf9640a3
Readme tweaks. (#1446)
2023-12-16 06:23:12 -06:00
e12cbfd73b
Update the readme to mention Mixtral. (#1443)
2023-12-15 19:29:03 -06:00
30a958e5dd
Quantized Mixtral model. (#1442)
* Add the Mixtral model.
* Add more of the mixtral layers.
* Add the final layers for mixtral.
* Sketch the expert selection.
* Add some expert routing logic.
* Hopefully finish the routing logic for mixtral.
* Add the mixtral example.
* Fix the weight filenames.
* Bugfix.
* Another fix.
* Yet another fix + remove the unused pragma.
* Shape fix.
* Support for quantized mixtral.
* Support mixtral in the quantized example.
* MLP or MoE type.
* Fix the expert field naming.
* Refactor the mlp bit.
* More MoE logic.
* Add the MoE quantized logic.
* Fix the experts length.
2023-12-15 19:16:06 -06:00
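Note: the routing steps named in the bullets above (softmax over router logits, top-k expert selection, weight renormalization) boil down to something like this dependency-free sketch; the names and shapes are illustrative, not candle-transformers' actual code.

```rust
// Pick the top_k experts for one token from its router logits and return
// (expert index, renormalized weight) pairs.
fn route(router_logits: &[f32], top_k: usize) -> Vec<(usize, f32)> {
    // Numerically stable softmax over the per-expert logits.
    let max = router_logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = router_logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    let mut probs: Vec<(usize, f32)> =
        exps.iter().map(|e| e / sum).enumerate().collect();
    // Keep the k most probable experts and renormalize their weights.
    probs.sort_by(|a, b| b.1.total_cmp(&a.1));
    probs.truncate(top_k);
    let norm: f32 = probs.iter().map(|p| p.1).sum();
    probs.into_iter().map(|(i, p)| (i, p / norm)).collect()
}

fn main() {
    // Mixtral uses 8 experts with top-2 routing; these logits are made up.
    let picks = route(&[0.1, 2.0, -1.0, 1.5, 0.3, -0.2, 0.9, 0.0], 2);
    println!("{picks:?}");
}
```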
614842b311
Add the Mixtral model. (#1437)
* Add the Mixtral model.
* Add more of the mixtral layers.
* Add the final layers for mixtral.
* Sketch the expert selection.
* Add some expert routing logic.
* Hopefully finish the routing logic for mixtral.
* Add the mixtral example.
* Fix the weight filenames.
* Bugfix.
* Another fix.
* Yet another fix + remove the unused pragma.
* Shape fix.
* Add a readme.
2023-12-15 14:19:56 -06:00
79eab519fd
Fix the Phi example. (#1436)
* Fix the Phi example.
* Remove the cuda mention.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-15 07:01:10 -06:00
6bc92e63cb
Addressing a lot of comments.
2023-12-15 13:06:04 +01:00
aa04015098
Remove unwrap().
2023-12-15 12:23:28 +01:00
8b5059e951
Remove test file.
2023-12-15 11:55:30 +01:00
26540641c1
Renamed all kernel names.
2023-12-15 11:24:47 +01:00
34d83377f6
Better error message on older macOS.
2023-12-15 11:18:54 +01:00
77197379cc
More cleanup.
2023-12-15 11:17:05 +01:00
916a8c5464
Revert candle-transformers.
2023-12-15 11:15:21 +01:00
243e83f2b9
Adding a bunch of docs!
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-12-15 11:03:05 +01:00
cf27868b57
More cleanup.
2023-12-15 01:44:22 +01:00
40c3e1bd5a
Cleanup.
2023-12-15 01:41:14 +01:00
ece4c69a68
Fixing softmax.
2023-12-15 01:35:08 +01:00
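Note: for context on the op being fixed, a minimal CPU sketch using candle-nn's softmax helper.

```rust
use candle_core::{Device, Tensor};
use candle_nn::ops::softmax;

fn main() -> candle_core::Result<()> {
    let t = Tensor::new(&[[1f32, 2., 3.]], &Device::Cpu)?;
    // Softmax over the last dimension; each row sums to 1.
    let s = softmax(&t, 1)?;
    println!("{:?}", s.to_vec2::<f32>()?);
    Ok(())
}
```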