Commit Graph

  • f3d472952f fix: candle-flash-attn linux and msvc build (#2829) xkeyC 2025-03-25 15:45:12 +08:00
  • 67b85f79f1 Pickle decoder fix and Long1 opcode addition. (#2824) Christian Balcom 2025-03-23 03:10:08 -04:00
  • 0b24f7f0a4 Fix for whisper example. rand::distribution is now rand::distr (#2811) Benjamin Beurdouche 2025-03-16 19:14:55 +01:00
  • 3afb04925a Allow for growing the default KV cache when needed. (#2810) Laurent Mazare 2025-03-16 17:30:25 +01:00
  • cbf5fc80c2 Add Gemma 3 1b IT to Gemma examples (#2809) André Cipriani Bandarra 2025-03-16 16:00:48 +00:00
  • 468d1d525f Bump the crate version to 0.8.4. (#2808) 0.8.4 Laurent Mazare 2025-03-15 07:42:24 +01:00
  • c930ab7e1a upgrade half library to fix rand (#2806) Mike Seddon 2025-03-14 19:01:54 +11:00
  • 111edbc4ea Gemma 3 initial setup (text only). (#2802) Laurent Mazare 2025-03-14 07:56:02 +01:00
  • e286cf7cc9 Parse the json config for siglip models. (#2800) Laurent Mazare 2025-03-09 14:01:09 +01:00
  • e4ffb85228 Add ModernBert sentence classifier (#2796) Mikhail Panfilov 2025-03-08 16:48:22 +03:00
  • 37db86ff79 Allow ModernBert to be used to generate embeddings. (#2791) Andrew Wason 2025-03-03 06:39:04 -05:00
  • add3a714aa phi-4-mini (#2790) Jani Monoses 2025-03-01 11:07:29 +02:00
  • 26c16923b9 Make sorted_nodes pub function (#2780) Liang-Chi Hsieh 2025-02-22 01:23:45 -08:00
  • 9e8bf70333 Avoid some clippy lints on 1.85. (#2778) Laurent Mazare 2025-02-22 09:23:22 +00:00
  • 777ad954eb Avoid some clippy lints on 1.85. clippy-1.85 Laurent 2025-02-21 10:39:55 +01:00
  • ac9cdbd448 Refactor From<Tuple> implementations by using macros, add tests (#2762) Philip Fabianek 2025-02-19 10:58:29 +01:00
  • e6cc76fc37 Implement DeepSeek V2 (#2744) Eric Buehler 2025-02-19 04:51:01 -05:00
  • fd7f7242a1 Bump the crate version to 0.8.3 (#2772) 0.8.3 Laurent Mazare 2025-02-15 15:54:48 +01:00
  • 3ddd20a5aa update to cudarc to v0.13.5 to support cuda 12.8 (#2771) Michael McCulloch 2025-02-15 07:47:23 -07:00
  • 2423d633fc add dynamic position encoding to Siglip (#2770) Amélie Royer 2025-02-14 13:50:50 +01:00
  • 7c2449f623 Metal: Improved reduce and softmax (#1819) ivarflakstad 2025-02-08 07:27:01 +01:00
  • 0af3e428ec fix: place ug dep behind not wasm32 flag (#2760) Doug A 2025-02-01 18:05:52 -04:00
  • 43017539ab Adds DebertaV2/V3 (#2743) Brady Bonnette 2025-01-29 02:59:28 -05:00
  • e142bf9530 use moondream1 model/revision for moondream example (#2748) A.V. 2025-01-28 23:19:54 +02:00
  • d2c53f4f2f Remove the MFA gemm library. (#2755) Laurent Mazare 2025-01-28 21:48:17 +01:00
  • 2a2852d1c1 Fix flash-attn build. (#2754) Laurent Mazare 2025-01-28 18:49:46 +01:00
  • 8f20f2a722 Add the MLX merge sort kernels (#2751) Laurent Mazare 2025-01-28 14:09:43 +01:00
  • ab9019425a Make the metal sdpa tests deterministic. (#2750) Laurent Mazare 2025-01-28 09:05:24 +01:00
  • da02b59516 Allow using composed strings as metal kernel names. (#2747) Laurent Mazare 2025-01-27 22:40:12 +01:00
  • 053f941196 Typos. Laurent 2025-01-27 15:43:42 +01:00
  • c043e1ca10 Fixes all clippy warnings Brady Bonnette 2025-01-26 18:45:02 -05:00
  • cafad0d88d Adds DebertaV2/V3 Brady Bonnette 2024-08-01 15:40:30 -04:00
  • 27996a1a9e Remove the old MFA gemm kernels. (#2742) Laurent Mazare 2025-01-26 20:36:31 +01:00
  • 1a32107fab Add a few metal gather ops. (#2740) Laurent Mazare 2025-01-25 23:31:03 +01:00
  • 333d94a19a fix: fix the codegeex4 model examples and transformers model (#2738) 唐璜 2025-01-26 00:41:12 +08:00
  • 3164a19a5d Add inpainting to the stable diffusion example (#2735) mneilly 2025-01-23 01:08:38 -08:00
  • e6cd499e98 Fix candle-flash-attn build on Windows (msvc) (#2734) Sergei Grebnov 2025-01-22 13:19:48 -08:00
  • 77db8396d0 Explicit error when slice-set is called with the same src and dst. (#2733) Laurent Mazare 2025-01-22 21:31:49 +01:00
  • 85f0aaefe5 Add serde::serialize to activations. (#2732) Laurent Mazare 2025-01-22 10:23:34 +01:00
  • e4c3a71f11 Fix GLM4 alignment issue (#2723) Guoqing Bao 2025-01-21 05:51:46 +08:00
  • 17cbbe4286 Sync upstream MLX sdpa vector kernels with mask (#2718) Eric Buehler 2025-01-16 05:30:10 -05:00
  • 6fd2f63a15 Bump the ug dependency. (#2720) Laurent Mazare 2025-01-16 09:39:16 +01:00
  • efd0e6822f Fix the helium weights download. (#2717) Laurent Mazare 2025-01-13 18:21:37 +01:00
  • 158817f230 Helium repo update. (#2716) Laurent Mazare 2025-01-13 18:04:14 +01:00
  • 309cd0f7c7 Add the helium model. (#2715) Laurent Mazare 2025-01-13 17:39:49 +01:00
  • ab7ff7081e Fixes for running Phi-4 quantized. (#2714) Jani Monoses 2025-01-13 15:35:33 +02:00
  • 461e8c1685 ModernBERT model (#2713) Jani Monoses 2025-01-13 09:39:27 +02:00
  • 2344c4e4b8 Clippy fixes for 1.84. (#2710) Laurent Mazare 2025-01-10 10:15:15 +01:00
  • 32defdb7d5 Update cudarc. (#2708) Laurent Mazare 2025-01-08 15:10:23 +01:00
  • 236c35e578 Bump the crate version to 0.8.2. (#2703) 0.8.2 Laurent Mazare 2025-01-07 15:50:16 +01:00
  • 6f8351dfda add link to README (#2701) Andrei Fajardo 2025-01-04 17:07:30 -05:00
  • 57f41da13b Fix mistral attention on Metal (#2699) Luka Zakrajšek 2025-01-04 16:11:20 +01:00
  • cbaa0ad46f UniPC for diffusion sampling (#2684) Nick Senger 2025-01-01 12:34:17 -08:00
  • b12c7c2888 Update the hf-hub dependency to 0.4.0. (#2691) Laurent Mazare 2024-12-31 19:07:47 +01:00
  • 94ffc2ec6f Actually remove the default hf-hub cache path for glm. (#2696) Laurent Mazare 2024-12-31 11:00:44 +01:00
  • 7354afc673 Use the default hf-hub cache for glm. (#2695) Laurent Mazare 2024-12-31 10:55:45 +01:00
  • 2a705e6f37 Flash-Attn upgrade / SoftCap Candle-FlashAttn [3/n] (#2690) Michael Feil 2024-12-31 10:04:47 +01:00
  • a594ef669c Flash-Attn upgrade / SoftCap Candle-FlashAttn [2/n] (#2689) Michael Feil 2024-12-31 09:41:23 +01:00
  • 71cd6d5533 Flash-Attn upgrade / SoftCap Candle-FlashAttn [1/n] (#2688) Michael Feil 2024-12-31 09:32:22 +01:00
  • d60eba1408 Streamline the glm4 example. (#2694) Laurent Mazare 2024-12-31 09:21:41 +01:00
  • e38e2a85dd Fix a cuda warning. (#2693) Laurent Mazare 2024-12-31 09:06:10 +01:00
  • 460616fc84 Update README.org (#2670) jetsung 2024-12-30 18:32:02 +08:00
  • 91f1f019b1 Added XLMRobertaModel for Reranking (#2686) Akshay Ballal 2024-12-30 11:16:57 +01:00
  • cd639131f0 Fix bug in whisper transformer (#2681) mert-kurttutan 2024-12-24 13:58:21 +01:00
  • 11aa30be10 Fix Batcher iterator break when return_last_incomplete_batch and items.is_empty (#2654) (#2655) hhllhhyyds 2024-12-24 15:41:26 +08:00
  • 1be6b090c7 Fix position encodings for Pixtral (#2678) Amélie Royer 2024-12-23 13:22:35 +01:00
  • 62ced44ea9 Add a Context trait similar to anyhow::Context. (#2676) Laurent Mazare 2024-12-22 09:18:13 +01:00
  • 5c2f893e5a make DepthAnythingV2 more reusable (#2675) Edgar Riba 2024-12-21 12:06:03 +01:00
  • 67cab7d6b8 Bump the crate version to 0.8.1. (#2662) 0.8.1 Laurent Mazare 2024-12-07 17:03:53 +01:00
  • 1807be84f4 Change/bert encoder public (#2658) Justin Sing 2024-12-04 15:22:30 -05:00
  • 145aa7193c Add Nvembed v2 model (#2649) cdoko 2024-12-03 05:56:01 -04:00
  • 6f715f9256 add scatter add (#2656) zachcp 2024-12-01 12:39:38 -05:00
  • dba7a9c93e add u32 - U32 gather (#2653) zachcp 2024-11-30 17:18:07 -05:00
  • b52c2c6050 Clippy fixes for the cuda feature. (#2650) Laurent Mazare 2024-11-29 09:01:34 +01:00
  • 4f59ed38b0 Adds support for stella_en_v5 embedding model -400M variant (#2608) iskng 2024-11-29 00:01:08 -08:00
  • 54e7fc3c97 Lint fixes introduced with Rust 1.83 (#2646) Anubhab Bandyopadhyay 2024-11-29 03:30:21 +05:30
  • 23ed8a9ded Fix for whisper-microphone example failure if audio isn't chunk aligned (#2645) Adam Nelson 2024-11-27 22:35:11 +01:00
  • 21c686387c Onnx Support for Sign operation #2641 (#2642) Ionut Mihalcea 2024-11-26 23:10:09 +01:00
  • b4deb5c5a9 Provide a method to allow PTH files with state maps to be loaded. (#2639) zachcp 2024-11-26 16:52:53 -05:00
  • c12db594e3 fix typo (#2606) Andrei Fajardo 2024-11-23 02:40:00 -05:00
  • f86f4d6224 Tweak the CI to avoid running out of disk space. (#2630) Laurent Mazare 2024-11-19 04:32:36 +01:00
  • 3159f91b90 20241118 docs (#2629) zachcp 2024-11-18 22:07:07 -05:00
  • 1a0f9ccf16 Import the ggml_cuda_dp4a function. (#2628) Laurent Mazare 2024-11-19 03:41:34 +01:00
  • e86565624b Fix for clippy. (#2626) Laurent Mazare 2024-11-18 14:32:38 +01:00
  • 386fd8abb4 Module Docs (#2624) zachcp 2024-11-18 08:19:23 -05:00
  • 12d7e7b145 More Model Module Docs (#2623) zachcp 2024-11-17 14:27:24 -05:00
  • a3f200e369 Module Docs (#2620) zachcp 2024-11-16 03:09:17 -05:00
  • 00d8a0c178 Remove some unused macros. (#2618) Laurent Mazare 2024-11-15 16:46:55 +01:00
  • f689ce5d39 Documentation Pass for Models (#2617) zachcp 2024-11-15 02:30:15 -05:00
  • 0ed24b9852 Add max-all/min-all. (#2616) Laurent Mazare 2024-11-14 21:08:04 +01:00
  • 06350c31c7 Add some missing index-select metal kernels. (#2613) Laurent Mazare 2024-11-12 17:10:12 +01:00
  • 9453cc3095 Bump the crate version to 0.8.0. (#2612) 0.8.0 Laurent Mazare 2024-11-12 14:11:46 +01:00
  • 3769206583 Update docs (#2553) zachcp 2024-11-11 16:13:52 -05:00
  • e2b6b367fa Add some fast Metal MLX SDPA kernels (#2584) Eric Buehler 2024-11-05 03:28:00 -05:00
  • 6454597943 Improved launch config for layer-norm/rms-norm. (#2591) Laurent Mazare 2024-11-04 10:42:18 +01:00
  • 3fba2b5fc4 Add the SmolLM2 models. (#2595) Laurent Mazare 2024-11-03 17:11:12 +01:00
  • 10b2e693ff Add the SmolLM2 models. smollm Laurent 2024-11-03 16:42:02 +01:00
  • ee8beb7cba Add more testing for the fused layer/rms norm kernels. Laurent 2024-11-01 18:35:47 +01:00
  • 463ddac329 Merge remote-tracking branch 'origin/main' into faster-layer-norm Laurent 2024-11-01 18:11:07 +01:00
  • 530ab96036 Support Skip Layer Guidance (SLG) for Stable Diffusion 3.5 Medium (#2590) Czxck001 2024-11-01 10:10:40 -07:00