|
6ebe043273
|
Merge branch 'main' into ivarflakstad/metal-prng
|
2024-01-07 11:52:03 +01:00 |
|
|
6bf52b9fdf
|
Gaussian normal distribution of PRNG via Box-Muller transform
|
2024-01-07 11:39:46 +01:00 |
|
|
955e63c803
|
Implement hybrid Tausworthe + LCG pseudo-random number generator in metal
|
2024-01-05 13:27:59 +01:00 |
|
|
fa3ea98ba9
|
Adding bfloat16 support for the cast kernels. (#1520)
|
2024-01-04 12:12:56 +01:00 |
|
|
0a245e6fa4
|
Metal: support unary abs (#1503)
* Metal: support unary abs
* cargo fmt
|
2023-12-30 00:00:12 +01:00 |
|
|
87d7f81b43
|
Metal: more u8/u32 (#1502)
* Adds more metal u8
* Metal: more u32
|
2023-12-29 23:56:21 +01:00 |
|
|
4373534d59
|
Metal: i64 basic support (#1495)
* Adds basic metal i64 support
* metal copy i64
|
2023-12-29 19:42:50 +01:00 |
|
|
488e02a3f6
|
Merge pull request #1496 from bayedieng/unary
Implement urecip op for metal backend
|
2023-12-29 12:20:52 +01:00 |
|
|
f5c98f22c7
|
Merge pull request #1491 from mimiquate/metal-errors
Improves metal's not implemented error messages
|
2023-12-29 12:03:40 +01:00 |
|
|
cc06ba2294
|
fix bad pattern matching and function name
|
2023-12-29 09:46:24 +00:00 |
|
|
3922b42c18
|
add urecip op to metal backend
|
2023-12-28 21:50:12 +00:00 |
|
|
1e442d4bb9
|
Fix lints for clippy 1.75. (#1494)
|
2023-12-28 20:26:20 +01:00 |
|
|
8e93e76a91
|
fixes error message
|
2023-12-28 15:03:05 -03:00 |
|
|
b3e838f3e2
|
cargo fmt
|
2023-12-28 14:07:34 -03:00 |
|
|
8bf892403a
|
Improves metal's not implemented error messages
|
2023-12-28 11:04:06 -03:00 |
|
|
d35f0a1376
|
Bump the crate version to 0.3.3. (#1490)
|
2023-12-28 13:38:30 +01:00 |
|
|
13a5d15ebc
|
Adding upsample_nearest_2d.
|
2023-12-25 14:25:19 +01:00 |
|
|
1505d85276
|
Merge pull request #1461 from huggingface/metal-conv
Adding the convolutions (1d + 2d) to candle on metal.
|
2023-12-25 12:48:09 +01:00 |
|
|
95e18ef675
|
Fixing matmul for convolutions.
|
2023-12-25 12:29:34 +01:00 |
|
|
7135791dd5
|
Fix the quantized mistral example. (#1478)
|
2023-12-25 09:31:24 +01:00 |
|
|
ba1fae590e
|
Validate the kernel size in pooling ops. (#1473)
* Validate the kernel size in pooling ops.
* Revert the changes to basics.
|
2023-12-23 11:19:22 +01:00 |
|
|
ceb78d3e28
|
Sketch the minimal mamba example. (#1465)
* Sketch the minimal mamba example.
* Fix rustfmt.
* Forward pass for mamba.
* Finish the forward pass.
* Inference fixes.
* Bugfixes.
* More fixes.
* Add a readme.
|
2023-12-22 00:28:50 +01:00 |
|
|
10d94659c3
|
Adding the convolutions (1d + 2d) to candle on metal.
|
2023-12-21 10:39:24 +01:00 |
|
|
9fc210fae8
|
Merge pull request #1318 from huggingface/metal4
Starting to fix some tests.
|
2023-12-20 15:37:31 +01:00 |
|
|
9b5e4843a6
|
Optimizing decode matmul (Phi at 28tok/s on M3).
Adding some benchmarks to help check matmul performance.
|
2023-12-20 09:54:19 +01:00 |
|
|
03641293ee
|
Clippy pass.
|
2023-12-18 15:22:43 +01:00 |
|
|
064ba17bd7
|
Remove print.
|
2023-12-18 11:04:16 +01:00 |
|
|
e8ee253ee0
|
Missing cast.
|
2023-12-18 11:01:18 +01:00 |
|
|
8bd3d6b94b
|
Index add.
|
2023-12-18 10:46:01 +01:00 |
|
|
6a3ca7da0c
|
Scatter add.
|
2023-12-18 10:32:22 +01:00 |
|
|
96f1a28e39
|
Add a simple full method. (#1455)
* Add a simple implementation of the full method.
* Add the docstring.
|
2023-12-17 20:15:57 -05:00 |
|
|
586b6f6fff
|
Adding gather op.
|
2023-12-17 23:34:12 +01:00 |
|
|
e4b0cc59f5
|
Adding CMP
|
2023-12-17 22:32:25 +01:00 |
|
|
0a6e0a8c9a
|
Implement randn (CPU -> device)
|
2023-12-17 19:09:08 +01:00 |
|
|
972903021c
|
Finish reduce kernels.
|
2023-12-17 19:07:00 +01:00 |
|
|
94817dac56
|
Bump the crate version to 0.3.2. (#1452)
|
2023-12-17 05:34:53 -06:00 |
|
|
1e86717bf2
|
Fix a couple typos (#1451)
* Mixtral quantized instruct.
* Fix a couple typos.
|
2023-12-17 05:20:05 -06:00 |
|
|
6bc92e63cb
|
Addressing a lot of comments.
|
2023-12-15 13:06:04 +01:00 |
|
|
aa04015098
|
Remove unwrap().
|
2023-12-15 12:23:28 +01:00 |
|
|
26540641c1
|
Renamed all kernel names.
|
2023-12-15 11:24:47 +01:00 |
|
|
77197379cc
|
More cleanup.
|
2023-12-15 11:17:05 +01:00 |
|
|
243e83f2b9
|
Adding a bunch of docs!
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
|
2023-12-15 11:03:05 +01:00 |
|
|
40c3e1bd5a
|
cleanup.
|
2023-12-15 01:41:14 +01:00 |
|
|
ece4c69a68
|
Fixing softmax.
|
2023-12-15 01:35:08 +01:00 |
|
|
4eeaf205d6
|
Fix softmax for long sequences (missing barrier).
|
2023-12-14 19:37:03 +01:00 |
|
|
361f2ad2af
|
Working with merging encoders and using fences.
|
2023-12-14 16:05:33 +01:00 |
|
|
931432ed55
|
Fixing tests + matmul from MFA
|
2023-12-13 16:58:36 +01:00 |
|
|
0404a3eb5b
|
Removed MPSMatrix entirely (buggy).
|
2023-12-13 16:21:48 +01:00 |
|
|
a9d0657432
|
Better version?
|
2023-12-13 12:09:20 +01:00 |
|
|
4cb443d00a
|
Fix the logsumexp test. (#1426)
|
2023-12-12 10:56:11 -06:00 |
|