candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-21 04:10:46 +00:00

Author	SHA1	Message	Date
Ivar Flakstad	e8e24f1284	Follow crate conventions	2024-01-01 20:37:56 +01:00
Ivar Flakstad	6eb44d1bce	Added fill bench	2024-01-01 20:22:44 +01:00
Ivar Flakstad	7fc26764b6	Implement generic fill. u8 uses speedy blit encoder	2023-12-29 16:02:29 +01:00
Ivar Flakstad	0a29d2e9b8	Add fill kernel handler	2023-12-29 12:27:12 +01:00
Nicolas Patry	10d94659c3	Adding the convolutions (1d + 2d) to candle on metal.	2023-12-21 10:39:24 +01:00
Nicolas Patry	03641293ee	Clippy pass.	2023-12-18 15:22:43 +01:00
Nicolas Patry	972903021c	Finish reduce kernels.	2023-12-17 19:07:00 +01:00
Nicolas Patry	6bc92e63cb	Addressing a lot of comments.	2023-12-15 13:06:04 +01:00
Nicolas Patry	4eeaf205d6	Fix softmax for long sequences (missing barrier).	2023-12-14 19:37:03 +01:00
Nicolas Patry	931432ed55	Fixing tests + matmul from MFA	2023-12-13 16:58:36 +01:00
Juarez Bochi	6e25822d4f	Fix gelu for large x	2023-12-06 09:59:44 -05:00
Nicolas Patry	2ca086939f	Put back affine strided tests	2023-11-30 11:40:39 +01:00
Nicolas Patry	4349ff1fc2	Starting to fix some tests. Few fixes. Going back on remote metal-rs. Reusing a single buffer (for now) to speed things up. Adding some half kernels. All tests are panicking instead of random failure. Putting back f16 index select. Add erf. Working version for llama2-c. Fixes + cache compute_pipeline_state. BF16 metal fix. Remove some prints. new_owned -> new()..to_owned(). Better batched matmul. Metal operational. Reuse buffers on our own reference counts. Tmp gemm. Revert "Tmp gemm." This reverts commit `c65f68e988`. Interleave committing. Speeding up copies using blit. Fmt. Fmt. Remove the assert! Fmt all. Fixes after big rebase. Add softmax for half and bfloat + tests. Fixing Llama example + accumulate softmax in float.	2023-11-30 11:30:31 +01:00
Nicolas Patry	60f624a902	Moving tests around.	2023-11-20 16:17:19 +01:00

14 Commits