candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Nicolas Patry	13a5d15ebc	Adding upsample_nearest_2d.	2023-12-25 14:25:19 +01:00
Nicolas Patry	1505d85276	Merge pull request #1461 from huggingface/metal-conv Adding the convolutions (1d + 2d) to candle on metal.	2023-12-25 12:48:09 +01:00
Nicolas Patry	95e18ef675	Fixing matmul for convolutions.	2023-12-25 12:29:34 +01:00
Laurent Mazare	7135791dd5	Fix the quantized mistral example. (#1478)	2023-12-25 09:31:24 +01:00
Laurent Mazare	ba1fae590e	Validate the kernel size in pooling ops. (#1473) * Validate the kernel size in pooling ops. * Revert the changes to basics.	2023-12-23 11:19:22 +01:00
Laurent Mazare	ceb78d3e28	Sketch the minimal mamba example. (#1465) * Sketch the minimal mamba example. * Fix rustfmt. * Forward pass for mamba. * Finish the forward pass. * Inference fixes. * Bugfixes. * More fixes. * Add a readme.	2023-12-22 00:28:50 +01:00
Nicolas Patry	10d94659c3	Adding the convolutions (1d + 2d) to candle on metal.	2023-12-21 10:39:24 +01:00
Nicolas Patry	9fc210fae8	Merge pull request #1318 from huggingface/metal4 Starting to fix some tests.	2023-12-20 15:37:31 +01:00
Nicolas Patry	9b5e4843a6	Optimizing decode matmul (Phi at 28tok/s on M3). Adding some benchmark in order to help checking out matmul performance.	2023-12-20 09:54:19 +01:00
Nicolas Patry	03641293ee	Clippy pass.	2023-12-18 15:22:43 +01:00
Nicolas Patry	064ba17bd7	Remove print.	2023-12-18 11:04:16 +01:00
Nicolas Patry	e8ee253ee0	Missing cast.	2023-12-18 11:01:18 +01:00
Nicolas Patry	8bd3d6b94b	Index add.	2023-12-18 10:46:01 +01:00
Nicolas Patry	6a3ca7da0c	Scatter add.	2023-12-18 10:32:22 +01:00
Laurent Mazare	96f1a28e39	Add a simple full method. (#1455) * Add a simple implementation of the full method. * Add the docstring.	2023-12-17 20:15:57 -05:00
Nicolas Patry	586b6f6fff	Adding gather op.	2023-12-17 23:34:12 +01:00
Nicolas Patry	e4b0cc59f5	Adding CMP	2023-12-17 22:32:25 +01:00
Nicolas Patry	0a6e0a8c9a	Implement randn (CPU-> device)	2023-12-17 19:09:08 +01:00
Nicolas Patry	972903021c	Finish reduce kernels.	2023-12-17 19:07:00 +01:00
Laurent Mazare	94817dac56	Bump the crate version to 0.3.2. (#1452)	2023-12-17 05:34:53 -06:00
Laurent Mazare	1e86717bf2	Fix a couple typos (#1451) * Mixtral quantized instruct. * Fix a couple typos.	2023-12-17 05:20:05 -06:00
Nicolas Patry	6bc92e63cb	Addressing a lot of comments.	2023-12-15 13:06:04 +01:00
Nicolas Patry	aa04015098	Remove `unwrap()`.	2023-12-15 12:23:28 +01:00
Nicolas Patry	26540641c1	Renamed all kernel names.	2023-12-15 11:24:47 +01:00
Nicolas Patry	77197379cc	More cleanup.	2023-12-15 11:17:05 +01:00
Nicolas Patry	243e83f2b9	Adding a bunch of docs ! Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>	2023-12-15 11:03:05 +01:00
Nicolas Patry	40c3e1bd5a	cleanup.	2023-12-15 01:41:14 +01:00
Nicolas Patry	ece4c69a68	Fixing softmax.	2023-12-15 01:35:08 +01:00
Nicolas Patry	4eeaf205d6	Fix softmax for long sequences (missing barrier).	2023-12-14 19:37:03 +01:00
Nicolas Patry	361f2ad2af	Working with merging encoders and using fences.	2023-12-14 16:05:33 +01:00
Nicolas Patry	931432ed55	Fixing tests + matmul from MFA	2023-12-13 16:58:36 +01:00
Nicolas Patry	0404a3eb5b	Removed MPSMatrix entirely (buggy).	2023-12-13 16:21:48 +01:00
Nicolas Patry	a9d0657432	Better version ?	2023-12-13 12:09:20 +01:00
Laurent Mazare	4cb443d00a	Fix the logsumexp test. (#1426)	2023-12-12 10:56:11 -06:00
nicolas	87dc559817	Lots of updates including some stack of command buffers.	2023-12-12 17:41:56 +01:00
Wenqing Zong	77252ffb82	Add logsumexp function (#1424)	2023-12-12 10:32:17 -06:00
KGrewal1	18eb87f25f	Upsample grad (#1420) * encode size of upsample in enum * working convolution method for limited 2d kernels * add test for sf 3 interpolation * add higher dimensional tests, fix to work with multichannel input * Remove commented out line. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2023-12-10 08:43:24 +01:00
Nicolas Patry	4349ff1fc2	Starting to fix some tests. Few fixes. Going back on remote metal-rs. Reusing a single buffer (for now) to speed things up. Adding some half kernels. All tests are panicking instead of random failure. Putting back f16 index select. Add erf. Working version for llama2-c. Fixes + cache compute_pipeline_state. BF16 metal fix. Remove some prints. new_owned -> new()..to_owned(). Better batched matmul. Metal operational. Reuse buffers on our own reference counts. Tmp gemm. Revert "Tmp gemm." This reverts commit `c65f68e988`. Interleave committing. Speeding up copies using blit. Fmt. Fmt. Remove the assert! Fmt all. Fixes after big rebase. Add softmax for half and bfloat + tests Fixing Llama example + accumulate softmax in float.	2023-11-30 11:30:31 +01:00
Nicolas Patry	e2eb6590ed	Merge pull request #1323 from huggingface/metal3 Adding the test scaffolding.	2023-11-27 13:06:01 +01:00
Laurent Mazare	481c45d78d	Add a basic implementation for slice-assign. (#1377)	2023-11-26 17:31:22 +00:00
Laurent Mazare	14a2bdc062	Small tweak: remove the macro usage for the range indexing trait. (#1376)	2023-11-26 16:30:59 +00:00
Laurent Mazare	bfa7c8fc01	Implement the module trait directly for QMatMul. (#1372)	2023-11-25 10:09:45 +00:00
Nicolas Patry	1edc3ddf24	Allowing feature metal to compile.	2023-11-20 20:17:16 +01:00
Nicolas Patry	8d6c6de8e0	Missing new test.	2023-11-20 14:38:35 +01:00
Nicolas Patry	7ec345c2eb	Adding the test scaffolding.	2023-11-20 14:38:35 +01:00
Nicolas Patry	671fc29b36	Fmt.	2023-11-20 14:38:20 +01:00
Nicolas Patry	c66e5d4716	Fix comments.	2023-11-20 14:13:44 +01:00
Nicolas Patry	2813fb5dbc	Cleanup fixed a few ops removed debugging scaffolding.	2023-11-20 14:12:57 +01:00
Nicolas Patry	7cfffcac10	Debugging rope.	2023-11-20 14:12:57 +01:00
Nicolas Patry	38de52bc4b	Fixed matmul (display still broken without casting back to CPU first? )	2023-11-20 14:12:57 +01:00

755 Commits