Commit Graph

  • 2feb0b054f Add the mel filters for 128 bins. (#1295) Laurent Mazare 2023-11-08 08:23:53 +01:00
  • 2d28497197 Preliminary support for whisper v3. (#1294) Laurent Mazare 2023-11-08 06:42:52 +01:00
  • f3a4f3db76 PyO3: Add optional candle.onnx module (#1282) Lukas Kreussel 2023-11-08 06:37:50 +01:00
  • eb24875856 Reworked affine and it works ? No idea how it's different. tmp_broken_metal Nicolas Patry 2023-11-08 02:34:08 +01:00
  • 3f662e54cd Reworked affine and it works ? No idea how it's different. Nicolas Patry 2023-11-08 02:34:08 +01:00
  • 480a3e22e6 Adding cast + binary kernels. Nicolas Patry 2023-11-07 23:45:53 +01:00
  • 7920b45c8a Support for timegroupnorm in encodec. (#1291) Laurent Mazare 2023-11-07 22:39:59 +01:00
  • d4a45c936a Quantized model small tweaks (#1290) Laurent Mazare 2023-11-07 21:21:37 +01:00
  • 0c24a885a6 Updated everything and output a trace. Nicolas Patry 2023-11-07 21:06:20 +01:00
  • c912d24570 Update README: Move T5 to Text to Text section (#1288) Juarez Bochi 2023-11-07 10:14:04 -05:00
  • d5c2a7b64b Add info about MADLAD-400 in readme files (#1287) Juarez Bochi 2023-11-07 09:21:59 -05:00
  • 76d3116f5d Broken metal ? Nicolas Patry 2023-11-07 14:20:13 +01:00
  • 1367e0278b pesky bfloat type Ivar Flakstad 2023-11-07 10:26:59 +01:00
  • 508f811b93 Add support for MADLAD400 (#1285) Juarez Bochi 2023-11-06 23:35:37 -05:00
  • 7ff17d92b3 Finished the unary Nicolas Patry 2023-11-06 23:12:12 +01:00
  • a773a4b22b [ONNX] Support a couple more ops. (#1284) Laurent Mazare 2023-11-06 22:44:58 +01:00
  • 5a363dbc26 Adds check for 7b-zephyr and uses correct template (#1283) DTJ11235 2023-11-06 20:05:39 +00:00
  • cd68c96803 Going overbounds will break other kernels running from other threads. Nicolas Patry 2023-11-06 17:29:58 +01:00
  • 4d87305c48 Float -> half / bfloat conversion in unary Ivar Flakstad 2023-11-06 17:09:39 +01:00
  • 677495f9b8 Working but failing tests because of threadgroup. Nicolas Patry 2023-11-06 17:04:47 +01:00
  • dedc8c3656 Writing unary as macro instead, protecting bfloat type with proper metal version. Nicolas Patry 2023-11-06 15:36:48 +01:00
  • abc4f698c5 Add candle-sampling (#1278) Eric Buehler 2023-11-06 06:53:29 -05:00
  • a923e8b53a Add a link to candle-ext to README.md (#1277) YiiSh 2023-11-06 19:44:39 +08:00
  • 63cce76b84 Improve metal kernel loading and associated errors Ivar Flakstad 2023-11-06 09:48:18 +01:00
  • 634a4e7168 BlitEncoder added to affine for copying buffer contents quickly. Ivar Flakstad 2023-11-06 08:23:36 +01:00
  • 2a45bcf943 Put the onnx example behind a feature flag. (#1276) Laurent Mazare 2023-11-06 07:45:07 +01:00
  • 47f4ddb011 Added info about missing protoc (#1275) figgefigge 2023-11-06 06:47:32 +01:00
  • 8124d1003f Affine metal kernel works. Need to extract buffer contents based on layout offset (like CudaSlice.slice) for candle integration Ivar Flakstad 2023-11-06 04:46:53 +01:00
  • 6d4c8c0707 Use metal encode_gemm Ivar Flakstad 2023-11-06 03:27:22 +01:00
  • e6d33a8efb Remove unused utils.metal Ivar Flakstad 2023-11-06 03:26:21 +01:00
  • f365a075e5 Add more models to the onnx example. (#1273) Laurent Mazare 2023-11-05 16:57:26 +01:00
  • 60fdab4e17 Detach all grads during backprop. (#1243) Laurent Mazare 2023-11-05 14:07:41 +01:00
  • 928a9d906e [ONNX] Do not generate values for constants. (#1272) Laurent Mazare 2023-11-05 11:23:14 +01:00
  • d1d89bac1f feat: download cifar dataset parquet files (#1259) drbh 2023-11-05 04:55:49 -05:00
  • 39ad840a90 Better tensor initialization in ONNX. (#1270) Laurent Mazare 2023-11-04 22:17:45 +01:00
  • b5e4f84bed Refactor the onnx attribute getters. (#1268) Laurent Mazare 2023-11-04 21:31:48 +01:00
  • 7051fb8098 feat: add backprop for elu (#1269) drbh 2023-11-04 16:26:41 -04:00
  • dc68c130e4 Support more ONNX ops. (#1267) Laurent Mazare 2023-11-04 15:10:14 +01:00
  • bc9a1bf239 Improve the ONNX basic example + bugfixes (#1266) Laurent Mazare 2023-11-04 10:02:47 +01:00
  • c921cc3784 Add Arc to metalstorage buffer for quick cloning Ivar Flakstad 2023-11-04 09:03:23 +01:00
  • d4d6850c78 Impl index_add via template for all types Ivar Flakstad 2023-11-04 08:46:08 +01:00
  • f7c957d64f ONNX casting support. (#1265) Laurent Mazare 2023-11-04 08:34:24 +01:00
  • 8cbb9d0e6c Add some preliminary ONNX support (#1260) Laurent Mazare 2023-11-04 06:36:05 +01:00
  • bfe95115c6 Update README.md (#1264) Yuchao Zhang 2023-11-04 12:32:32 +08:00
  • 6fa3151820 Allow using gguf-v3 files. (#1262) Laurent Mazare 2023-11-03 23:07:53 +01:00
  • 0a58886ccb add distil-whisper link (#1261) Radamés Ajna 2023-11-03 13:34:42 -07:00
  • 3173b1ce3b feat: impl backprop for erf and gelu-erf (#1258) drbh 2023-11-03 16:32:30 -04:00
  • e708d35e7f index_add works Ivar Flakstad 2023-11-03 21:12:52 +01:00
  • ad63f20781 add Kalosm to the list of external resources (#1257) ealmloff 2023-11-03 13:16:46 -05:00
  • 1cfc5d6d0c Backprop support for conv1d (cpu only for now). (#1255) Laurent Mazare 2023-11-03 14:23:53 +01:00
  • b07b2350b6 Test for the transposed conv1d. (#1254) Laurent Mazare 2023-11-03 13:10:28 +01:00
  • 1b5063f3ca Add vllm external resource (#1253) Eric Buehler 2023-11-03 07:40:31 -04:00
  • 0794e70a19 Debugging index_add. Ivar Flakstad 2023-11-03 12:08:58 +01:00
  • 3b0d1e7d03 Transposed conv1d in candle-nn. (#1252) Laurent Mazare 2023-11-03 11:18:25 +01:00
  • be4555c5a5 Add the conv-transpose1d op. (#1251) Laurent Mazare 2023-11-03 09:44:46 +01:00
  • 6975c65112 Share the layer-norm implementation. (#1248) Laurent Mazare 2023-11-03 06:30:05 +01:00
  • f57e3164ae Implemented cos for now. Nicolas Patry 2023-11-03 01:24:51 +01:00
  • a2a20aeecc Add the swiglu activation from the chatglm PR. (#1246) Laurent Mazare 2023-11-02 20:01:34 +01:00
  • e08fbb6543 Add support for distil whisper (#1245) Laurent Mazare 2023-11-02 19:32:35 +01:00
  • d39d0c40fd Add hard-sigmoid and hard-swish activations (#1244) jamjamjon 2023-11-03 01:20:27 +08:00
  • 9a27f11c3f Adding tons of profiling and removing the metal allocation (still slow). tmp-metal-span Nicolas Patry 2023-11-02 17:48:07 +01:00
  • 7161002a34 Finished scaffolding, lots of TODOs Nicolas Patry 2023-11-02 15:32:28 +01:00
  • b97463098c llama2-c wasm fix. llama2-wasm-fix Laurent 2023-11-02 10:31:47 +01:00
  • 82cce52e73 Rename candle-metal -> candle-metal-kernels Ivar Flakstad 2023-11-02 09:53:29 +01:00
  • fbd69f952c Lazy detach. (#1242) Laurent Mazare 2023-11-02 08:33:48 +01:00
  • 6c990a33ea Remove the unused pragma for marian. (#1236) Laurent Mazare 2023-11-01 21:04:52 +01:00
  • 1704f1b3ae Consolidate the with-tracing usage. (#1234) Laurent Mazare 2023-11-01 19:21:36 +01:00
  • 71fcb31873 Owned command buffer now. Nicolas Patry 2023-11-01 18:03:53 +01:00
  • 198009453a Matmul (no batch, no strided, f32, f32 only) sort of done. Nicolas Patry 2023-11-01 17:36:51 +01:00
  • 492d164235 More scaffolding, now need to implement matmul (for precompute_cos_sin to work). Nicolas Patry 2023-11-01 16:54:09 +01:00
  • 693fad511c Preliminary support for ssd1b. (#1233) Laurent Mazare 2023-11-01 15:37:52 +01:00
  • 2d84c16fed First pass (Quantized scaffolding work done + quantized example scaffolding). Nicolas Patry 2023-11-01 15:10:11 +01:00
  • 36fb84f038 Add a hack for generating random uniform/normal for f16/bf16. (#1228) Laurent Mazare 2023-10-31 21:27:59 +01:00
  • 4525b7b52a Initial setup Ivar Flakstad 2023-10-31 18:09:10 +01:00
  • c12ad45562 Add a KV cache to marian decoding. (#1226) Laurent Mazare 2023-10-31 09:47:44 +01:00
  • 7d0202710b Instructions for generating the tokenizer configs for marian-mt. (#1225) Laurent Mazare 2023-10-31 07:56:26 +01:00
  • 392a00a147 Add support for the marian base model. (#1221) Laurent Mazare 2023-10-30 20:20:36 +01:00
  • 4c967b9184 Use the hub files for the marian example. (#1220) Laurent Mazare 2023-10-30 18:29:36 +01:00
  • c05c0a8213 PyO3: Add equal and __richcmp__ to candle.Tensor (#1099) Lukas Kreussel 2023-10-30 16:17:28 +01:00
  • 969960847a Bugfixes for marian-mt. (#1219) Laurent Mazare 2023-10-30 12:44:19 +01:00
  • 5fc66bd4ba Support negative steps in arange. (#1218) Laurent Mazare 2023-10-30 08:40:54 +01:00
  • 2a890a5e57 Fix the tokenizer initialization for marian. marian-tok Laurent 2023-10-29 21:13:14 +01:00
  • 174b208052 PyO3: Better shape handling (#1143) Lukas Kreussel 2023-10-29 16:41:44 +01:00
  • 154c674a79 Add i64-abs. (#1216) Laurent Mazare 2023-10-29 16:28:53 +01:00
  • 7bbde55c61 Marian MT model (#1210) Laurent Mazare 2023-10-29 16:12:22 +01:00
  • c3f2676d49 PyO3: Add CI to build & upload wheels as artifacts. (#1215) Lukas Kreussel 2023-10-29 14:44:05 +01:00
  • 46d6566c99 Fix the conv2d gradient computation. (#1214) Laurent Mazare 2023-10-29 10:50:04 +01:00
  • 55bc3382cf Allow for different behavior between training and eval (#1213) Laurent Mazare 2023-10-29 07:53:09 +01:00
  • dece37c6f4 feat: implement VGG13, VGG16 and VGG19 (#1211) drbh 2023-10-29 02:10:23 -04:00
  • 498c50348c Add DDPG and fix Gym wrapper (#1207) Travis Hammond 2023-10-28 20:53:34 +02:00
  • 012ae0090e Infer the config for llama2-c. (#1208) Laurent Mazare 2023-10-28 20:00:39 +02:00
  • 95a857cf57 Move the llama2-c model in transformers. (#1205) Laurent Mazare 2023-10-28 17:51:19 +02:00
  • 612f5b8156 Make more models cloneable. (#1203) Laurent Mazare 2023-10-28 08:43:08 +02:00
  • ef33df7ae2 No need for the even constraint on vecdot-q40-q80. (#1202) Laurent Mazare 2023-10-28 08:23:59 +02:00
  • c8face3f95 Add the relu2 and relu6 activations. (#1201) Laurent Mazare 2023-10-27 21:51:16 +02:00
  • 85bea43e5b Make the whisper model cloneable (#1200) Laurent Mazare 2023-10-27 17:59:19 +02:00
  • b3181455d5 Add fuse-conv-bn method for Conv2d (#1196) jamjamjon 2023-10-27 22:56:50 +08:00
  • e2826e70b3 Add a quantized variant of llama2.c (#1197) Laurent Mazare 2023-10-27 15:34:06 +01:00
  • 916619f70b Minor cleanup (#1194) Laurent Mazare 2023-10-27 14:08:29 +01:00
  • 9b1158b315 Add some missing backtraces. (#1193) Laurent Mazare 2023-10-27 06:09:11 +01:00