Commit Graph

1634 Commits

SHA1 Message Date
d35f0a1376 Bump the crate version to 0.3.3. (#1490) 2023-12-28 13:38:30 +01:00
65cb90bd40 Add a mention of SOLAR-10.7B in the readme. (#1487) 2023-12-27 15:25:39 +01:00
996a7f2e24 Rework the llama example config, add the solar model. (#1485) 2023-12-26 22:24:04 +01:00
3071ea6c3e Use the new hub helper function. (#1484) 2023-12-26 09:44:30 +01:00
37c539f2b7 Helper function to load sharded safetensors files (#1481)
* Fix the quantized mistral example.

* Add a helper function to load sharded safetensors weights.

* Use the sharded loader.
2023-12-25 21:49:21 +01:00
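For context, #1481 introduced `candle_examples::hub_load_safetensors`, which resolves every shard listed in a model's index file and returns the local paths. A minimal sketch of how it can be used; the model id and index filename here are illustrative assumptions, not taken from the commit:

```rust
use anyhow::Result;
use hf_hub::{api::sync::Api, Repo, RepoType};

fn main() -> Result<()> {
    let api = Api::new()?;
    let repo = api.repo(Repo::with_revision(
        "mistralai/Mistral-7B-v0.1".to_string(), // illustrative model id
        RepoType::Model,
        "main".to_string(),
    ));
    // Downloads each shard referenced by the index and returns the local paths.
    let filenames =
        candle_examples::hub_load_safetensors(&repo, "model.safetensors.index.json")?;
    println!("loaded {} shards: {filenames:?}", filenames.len());
    Ok(())
}
```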
eae3a20d43 Merge pull request #1479 from huggingface/upsample_metal
Adding upsample_nearest_2d.
2023-12-25 14:25:53 +01:00
13a5d15ebc Adding upsample_nearest_2d. 2023-12-25 14:25:19 +01:00
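The op is exposed on `Tensor` as `upsample_nearest2d`; a small CPU sketch with shapes chosen purely for illustration:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    // (N, C, H, W) input: a 2x2 image upsampled to 4x4, nearest-neighbour.
    let t = Tensor::arange(0f32, 4., &Device::Cpu)?.reshape((1, 1, 2, 2))?;
    let up = t.upsample_nearest2d(4, 4)?;
    assert_eq!(up.dims(), &[1, 1, 4, 4]);
    Ok(())
}
```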
1505d85276 Merge pull request #1461 from huggingface/metal-conv
Adding the convolutions (1d + 2d) to candle on Metal.
2023-12-25 12:48:09 +01:00
95e18ef675 Fixing matmul for convolutions. 2023-12-25 12:29:34 +01:00
7135791dd5 Fix the quantized mistral example. (#1478) 2023-12-25 09:31:24 +01:00
88589d8815 Support mistral instruct v0.2. (#1475)
* Support mistral instruct v0.2.

* Use the safetensors model now that they are available.
2023-12-23 16:18:49 +01:00
5b35fd0fcf MMLU evaluation for Phi. (#1474)
* MMLU evaluation for Phi.

* Improve the evaluation.
2023-12-23 15:28:36 +01:00
ba1fae590e Validate the kernel size in pooling ops. (#1473)
* Validate the kernel size in pooling ops.

* Revert the changes to basics.
2023-12-23 11:19:22 +01:00
78d982e1bd Fix for mamba 2.8b. (#1472) 2023-12-23 11:01:39 +01:00
d8b9a727fc Support different mamba models. (#1471) 2023-12-23 10:46:02 +01:00
ceb78d3e28 Sketch the minimal mamba example. (#1465)
* Sketch the minimal mamba example.

* Fix rustfmt.

* Forward pass for mamba.

* Finish the forward pass.

* Inference fixes.

* Bugfixes.

* More fixes.

* Add a readme.
2023-12-22 00:28:50 +01:00
f6408a3779 feat: add clear_kv_cache to mistral and qmistral models (#1464) 2023-12-21 21:19:19 +01:00
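A sketch of why #1464 matters: without clearing the cache, keys and values from one prompt bleed into the next generation. Model construction is elided below; the method name matches the commit, everything else is an illustrative assumption:

```rust
use candle_transformers::models::mistral::Model;

/// Reset the model's KV cache between two independent prompts so that
/// attention over the second prompt does not see the first one's context.
fn reset_between_prompts(model: &mut Model) {
    model.clear_kv_cache();
}
```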
10d94659c3 Adding the convolutions (1d + 2d) to candle on Metal. 2023-12-21 10:39:24 +01:00
563a79afa1 make fn name generic (#1459)
Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
2023-12-21 02:16:31 +01:00
8ede5f4210 add fn config_chat_ml (#1458)
* add fn config_chat_ml

* Add a link to the original config.

---------

Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-12-20 21:03:24 +01:00
9fc210fae8 Merge pull request #1318 from huggingface/metal4
Starting to fix some tests.
2023-12-20 15:37:31 +01:00
9b5e4843a6 Optimizing decode matmul (Phi at 28 tok/s on M3).
Adding some benchmarks to help check matmul performance.
2023-12-20 09:54:19 +01:00
03641293ee Clippy pass. 2023-12-18 15:22:43 +01:00
064ba17bd7 Remove print. 2023-12-18 11:04:16 +01:00
e8ee253ee0 Missing cast. 2023-12-18 11:01:18 +01:00
8bd3d6b94b Index add. 2023-12-18 10:46:01 +01:00
6a3ca7da0c Scatter add. 2023-12-18 10:32:22 +01:00
96f1a28e39 Add a simple full method. (#1455)
* Add a simple implementation of the full method.

* Add the docstring.
2023-12-17 20:15:57 -05:00
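The `full` method from #1455 mirrors `torch.full`: a tensor of a given shape filled with a single value. A minimal sketch:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    // A 2x3 f32 tensor where every element is 3.5.
    let t = Tensor::full(3.5f32, (2, 3), &Device::Cpu)?;
    assert_eq!(t.dims(), &[2, 3]);
    Ok(())
}
```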
586b6f6fff Adding gather op. 2023-12-17 23:34:12 +01:00
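These Metal kernels back ops that already exist on `Tensor`. As a reminder of the gather semantics (values are picked along one dimension using an index tensor of the same rank), a small CPU sketch:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let t = Tensor::new(&[[1f32, 2.], [3., 4.]], &Device::Cpu)?;
    // Along dim 1: row 0 takes column 1, row 1 takes column 0.
    let idx = Tensor::new(&[[1u32], [0u32]], &Device::Cpu)?;
    let picked = t.gather(&idx, 1)?;
    assert_eq!(picked.to_vec2::<f32>()?, vec![vec![2.], vec![3.]]);
    Ok(())
}
```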
e4b0cc59f5 Adding CMP. 2023-12-17 22:32:25 +01:00
0a6e0a8c9a Implement randn (CPU -> device) 2023-12-17 19:09:08 +01:00
972903021c Finish reduce kernels. 2023-12-17 19:07:00 +01:00
94817dac56 Bump the crate version to 0.3.2. (#1452) 2023-12-17 05:34:53 -06:00
1e86717bf2 Fix a couple typos (#1451)
* Mixtral quantized instruct.

* Fix a couple typos.
2023-12-17 05:20:05 -06:00
c630622a07 Expose AdamW parameters (#1449)
* Expose AdamW parameters

* Use reference
2023-12-16 18:41:56 -06:00
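#1449 makes the optimizer hyper-parameters configurable rather than hard-coded. A sketch using `candle_nn::ParamsAdamW`; the values and the toy loss are illustrative:

```rust
use candle_core::{DType, Device, Result, Var};
use candle_nn::{AdamW, Optimizer, ParamsAdamW};

fn main() -> Result<()> {
    let w = Var::ones((2, 2), DType::F32, &Device::Cpu)?;
    // Override only the parameters you care about; keep the rest at defaults.
    let params = ParamsAdamW {
        lr: 1e-4,
        weight_decay: 0.01,
        ..Default::default()
    };
    let mut opt = AdamW::new(vec![w.clone()], params)?;
    // Toy scalar loss so the optimizer step is runnable end to end.
    let loss = w.as_tensor().sqr()?.sum_all()?;
    opt.backward_step(&loss)?;
    Ok(())
}
```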
c4cfcf1539 Tweak the readme for phi and the default sample length. (#1450) 2023-12-16 18:11:36 -06:00
1782e93de6 Mixtral quantized instruct. (#1447) 2023-12-16 16:16:39 -06:00
cfdf9640a3 Readme tweaks. (#1446) 2023-12-16 06:23:12 -06:00
e12cbfd73b Update the readme to mention mixtral. (#1443) 2023-12-15 19:29:03 -06:00
30a958e5dd Quantized mixtral model (#1442)
* Add the Mixtral model.

* Add more of the mixtral layers.

* Add the final layers for mixtral.

* Sketch the expert selection.

* Add some expert routing logic.

* Hopefully finish the routing logic for mixtral.

* Add the mixtral example.

* Fix the weight filenames.

* Bugfix.

* Another fix.

* Yet another fix + remove the unused pragma.

* Shape fix.

* Support for quantized mixtral.

* Support mixtral in the quantized example.

* Mlp or moe type.

* Fix the expert field namings.

* Refactor the mlp bit.

* More MoE logic.

* Add the MoE quantized logic.

* Fix the experts length.
2023-12-15 19:16:06 -06:00
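The heart of these mixtral changes is the expert routing. Below is a deliberately simplified sketch of top-2 selection for a single token; the actual candle-transformers implementation is batched and renormalizes the two expert weights, so treat this as an assumption-laden illustration rather than the real code:

```rust
use candle_core::{Result, Tensor};

/// Pick the two highest-probability experts for one token, given router
/// logits of shape (num_experts,). Simplified: real MoE code is batched.
fn top2_experts(router_logits: &Tensor) -> Result<(usize, usize)> {
    let probs: Vec<f32> = candle_nn::ops::softmax(router_logits, 0)?.to_vec1()?;
    let mut idx: Vec<usize> = (0..probs.len()).collect();
    idx.sort_by(|&a, &b| probs[b].total_cmp(&probs[a]));
    Ok((idx[0], idx[1]))
}
```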
614842b311 Add the Mixtral model. (#1437)
* Add the Mixtral model.

* Add more of the mixtral layers.

* Add the final layers for mixtral.

* Sketch the expert selection.

* Add some expert routing logic.

* Hopefully finish the routing logic for mixtral.

* Add the mixtral example.

* Fix the weight filenames.

* Bugfix.

* Another fix.

* Yet another fix + remove the unused pragma.

* Shape fix.

* Add a readme.
2023-12-15 14:19:56 -06:00
79eab519fd Fix phi example (#1436)
* Fix phi example

* Remove the cuda mention.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-15 07:01:10 -06:00
6bc92e63cb Addressing a lot of comments. 2023-12-15 13:06:04 +01:00
aa04015098 Remove unwrap(). 2023-12-15 12:23:28 +01:00
8b5059e951 Remove test file. 2023-12-15 11:55:30 +01:00
26540641c1 Renamed all kernel names. 2023-12-15 11:24:47 +01:00
34d83377f6 Better error message on older macOS. 2023-12-15 11:18:54 +01:00
77197379cc More cleanup. 2023-12-15 11:17:05 +01:00
916a8c5464 Revert candle-transformers. 2023-12-15 11:15:21 +01:00
243e83f2b9 Adding a bunch of docs!
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-12-15 11:03:05 +01:00