Commit Graph

14 Commits

SHA1 Message Date
e8e24f1284 Follow crate conventions 2024-01-01 20:37:56 +01:00
6eb44d1bce Added fill bench 2024-01-01 20:22:44 +01:00
7fc26764b6 Implement generic fill. u8 uses speedy blit encoder 2023-12-29 16:02:29 +01:00
0a29d2e9b8 Add fill kernel handler 2023-12-29 12:27:12 +01:00
10d94659c3 Adding the convolutions (1d + 2d) to candle on metal. 2023-12-21 10:39:24 +01:00
03641293ee Clippy pass. 2023-12-18 15:22:43 +01:00
972903021c Finish reduce kernels. 2023-12-17 19:07:00 +01:00
6bc92e63cb Addressing a lot of comments. 2023-12-15 13:06:04 +01:00
4eeaf205d6 Fix softmax for long sequences (missing barrier). 2023-12-14 19:37:03 +01:00
931432ed55 Fixing tests + matmul from MFA 2023-12-13 16:58:36 +01:00
6e25822d4f Fix gelu for large x 2023-12-06 09:59:44 -05:00
2ca086939f Put back affine strided tests 2023-11-30 11:40:39 +01:00
4349ff1fc2 Starting to fix some tests. 2023-11-30 11:30:31 +01:00
    Few fixes.

    Going back on remote metal-rs.

    Reusing a single buffer (for now) to speed things up.

    Adding some half kernels.

    All tests are panicking instead of random failure.

    Putting back f16 index select.

    Add erf.

    Working version for llama2-c.

    Fixes + cache compute_pipeline_state.

    BF16 metal fix.

    Remove some prints.

    new_owned -> new()..to_owned().

    Better batched matmul.

    Metal operational.

    Reuse buffers on our own reference counts.

    Tmp gemm.

    Revert "Tmp gemm."

    This reverts commit c65f68e988.

    Interleave committing.

    Speeding up copies using blit.

    Fmt.

    Fmt.

    Remove the assert!

    Fmt all.

    Fixes after big rebase.

    Add softmax for half and bfloat + tests

    Fixing Llama example + accumulate softmax in float.
60f624a902 Moving tests around. 2023-11-20 16:17:19 +01:00