candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-22 04:22:50 +00:00

Author	SHA1	Message	Date
Nicolas Patry	9130b6c4b6	Removing the fences speeds everything up and is correct this time...	2024-01-05 19:26:30 +01:00
Nicolas Patry	7b4389099a	Fix the rebase.	2024-01-05 14:31:39 +01:00
Nicolas Patry	6f8584091e	Cleanup.	2024-01-05 14:29:43 +01:00
Nicolas Patry	f97fcd4712	Metal quantized modifications proposal. - Add a device param, wherever needed. - Create new QMetal storage thing that implements QuantizedType. - Update everywhere needed. Fix Python. Fixing examples. Fix: fmt + clippy + stub. Moving everything around. Only missing the actual implems. Fixing everything + adding dequantized kernels. More work. Fixing matmul. Fmt + Clippy Some clippy fixes. Working state. Q2K Metal -> Bugged (also present in GGML). Q4K CPU -> Bugged (present previously, new test catch it). Q5K CPU -> Bugged (present previously). Q8_1 Both -> Never really implemented it seems Q8K metal -> Never implemented in metal Fixing Q2K bug (present in ggml).	2024-01-05 14:29:41 +01:00
Nicolas Patry	fa3ea98ba9	Adding bfloat16 support for the cast kernels. (#1520)	2024-01-04 12:12:56 +01:00
Gonzalo	0a245e6fa4	Metal: support unary abs (#1503) * Metal: support unary abs * cargo fmt	2023-12-30 00:00:12 +01:00
Gonzalo	87d7f81b43	Metal: more u8/u32 (#1502) * Adds more metal u8 * Metal: more u32	2023-12-29 23:56:21 +01:00
Gonzalo	4373534d59	Metal: i64 basic support (#1495) * Adds basic metal i64 support * metal copy i64	2023-12-29 19:42:50 +01:00
Nicolas Patry	488e02a3f6	Merge pull request #1496 from bayedieng/unary Implement urecip op for metal backend	2023-12-29 12:20:52 +01:00
Nicolas Patry	f5c98f22c7	Merge pull request #1491 from mimiquate/metal-errors Improves metal's not implemented error messages	2023-12-29 12:03:40 +01:00
Baye Dieng	cc06ba2294	fix bad pattern matching and function name	2023-12-29 09:46:24 +00:00
Baye Dieng	3922b42c18	add urecip op to metal backend	2023-12-28 21:50:12 +00:00
Laurent Mazare	1e442d4bb9	Fix lints for clippy 1.75. (#1494)	2023-12-28 20:26:20 +01:00
Gonzalo	8e93e76a91	fixes error message	2023-12-28 15:03:05 -03:00
Gonzalo	b3e838f3e2	cargo fmt	2023-12-28 14:07:34 -03:00
Gonzalo	8bf892403a	Improves metal's not implemented error messages	2023-12-28 11:04:06 -03:00
Nicolas Patry	13a5d15ebc	Adding upsample_nearest_2d.	2023-12-25 14:25:19 +01:00
Nicolas Patry	1505d85276	Merge pull request #1461 from huggingface/metal-conv Adding the convolutions (1d + 2d) to candle on metal.	2023-12-25 12:48:09 +01:00
Nicolas Patry	95e18ef675	Fixing matmul for convolutions.	2023-12-25 12:29:34 +01:00
Laurent Mazare	7135791dd5	Fix the quantized mistral example. (#1478)	2023-12-25 09:31:24 +01:00
Laurent Mazare	ba1fae590e	Validate the kernel size in pooling ops. (#1473) * Validate the kernel size in pooling ops. * Revert the changes to basics.	2023-12-23 11:19:22 +01:00
Nicolas Patry	10d94659c3	Adding the convolutions (1d + 2d) to candle on metal.	2023-12-21 10:39:24 +01:00
Nicolas Patry	9fc210fae8	Merge pull request #1318 from huggingface/metal4 Starting to fix some tests.	2023-12-20 15:37:31 +01:00
Nicolas Patry	03641293ee	Clippy pass.	2023-12-18 15:22:43 +01:00
Nicolas Patry	e8ee253ee0	Missing cast.	2023-12-18 11:01:18 +01:00
Nicolas Patry	8bd3d6b94b	Index add.	2023-12-18 10:46:01 +01:00
Nicolas Patry	6a3ca7da0c	Scatter add.	2023-12-18 10:32:22 +01:00
Laurent Mazare	96f1a28e39	Add a simple full method. (#1455) * Add a simple implementation of the full method. * Add the docstring.	2023-12-17 20:15:57 -05:00
Nicolas Patry	586b6f6fff	Adding gather op.	2023-12-17 23:34:12 +01:00
Nicolas Patry	e4b0cc59f5	Adding CMP	2023-12-17 22:32:25 +01:00
Nicolas Patry	0a6e0a8c9a	Implement randn (CPU-> device)	2023-12-17 19:09:08 +01:00
Nicolas Patry	972903021c	Finish reduce kernels.	2023-12-17 19:07:00 +01:00
Laurent Mazare	1e86717bf2	Fix a couple typos (#1451) * Mixtral quantized instruct. * Fix a couple typos.	2023-12-17 05:20:05 -06:00
Nicolas Patry	6bc92e63cb	Addressing a lot of comments.	2023-12-15 13:06:04 +01:00
Nicolas Patry	aa04015098	Remove `unwrap()`.	2023-12-15 12:23:28 +01:00
Nicolas Patry	26540641c1	Renamed all kernel names.	2023-12-15 11:24:47 +01:00
Nicolas Patry	77197379cc	More cleanup.	2023-12-15 11:17:05 +01:00
Nicolas Patry	243e83f2b9	Adding a bunch of docs ! Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>	2023-12-15 11:03:05 +01:00
Nicolas Patry	40c3e1bd5a	cleanup.	2023-12-15 01:41:14 +01:00
Nicolas Patry	ece4c69a68	Fixing softmax.	2023-12-15 01:35:08 +01:00
Nicolas Patry	4eeaf205d6	Fix softmax for long sequences (missing barrier).	2023-12-14 19:37:03 +01:00
Nicolas Patry	361f2ad2af	Working with merging encoders and using fences.	2023-12-14 16:05:33 +01:00
Nicolas Patry	931432ed55	Fixing tests + matmul from MFA	2023-12-13 16:58:36 +01:00
Nicolas Patry	0404a3eb5b	Removed MPSMatrix entirely (buggy).	2023-12-13 16:21:48 +01:00
Nicolas Patry	a9d0657432	Better version ?	2023-12-13 12:09:20 +01:00
nicolas	87dc559817	Lots of updates including some stack of command buffers.	2023-12-12 17:41:56 +01:00
Wenqing Zong	77252ffb82	Add logsumexp function (#1424)	2023-12-12 10:32:17 -06:00
KGrewal1	18eb87f25f	Upsample grad (#1420) * encode size of upsample in enum * working convolution method for limited 2d kernels * add test for sf 3 interpolation * add higher dimensional tests, fix to work with multichannel input * Remove commented out line. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2023-12-10 08:43:24 +01:00
Nicolas Patry	4349ff1fc2	Starting to fix some tests. Few fixes. Going back on remote metal-rs. Reusing a single buffer (for now) to speed things up. Adding some half kernels. All tests are panicking instead of random failure. Putting back f16 index select. Add erf. Working version for llama2-c. Fixes + cache compute_pipeline_state. BF16 metal fix. Remove some prints. new_owned -> new()..to_owned(). Better batched matmul. Metal operational. Reuse buffers on our own reference counts. Tmp gemm. Revert "Tmp gemm." This reverts commit `c65f68e988`. Interleave committing. Speeding up copies using blit. Fmt. Fmt. Remove the assert! Fmt all. Fixes after big rebase. Add softmax for half and bfloat + tests Fixing Llama example + accumulate softmax in float.	2023-11-30 11:30:31 +01:00
Nicolas Patry	e2eb6590ed	Merge pull request #1323 from huggingface/metal3 Adding the test scaffolding.	2023-11-27 13:06:01 +01:00

508 Commits