Commit Graph

  • 17313a4226 Fix cuda memory error for Qwen3 non-quantized (#2987) main Akshay Ballal 2025-06-07 16:02:58 +02:00
  • 0224a749f0 Add Qwen3 MoE (#2934) Kyle Birnbaum 2025-05-31 06:33:28 -07:00
  • cd7b877d6b candle-onnx: Implement Trilu and ScatterND ops (#2952) Kyle Birnbaum 2025-05-29 22:36:09 -07:00
  • 5aed817f1b feat: enhance linear algebra operations (#2972) 飘尘 2025-05-29 15:41:01 +08:00
  • 1a183c988a Add fine-tuned text classifier to xlm roberta example (#2969) Jon Eskin 2025-05-28 00:17:07 -04:00
  • cac51fe16a (hotfix) fix the doc test for indexer (#2970) Congxian Qiu 2025-05-28 12:13:26 +08:00
  • 61ddb9535e Use a tanh activation in the xlm-roberta classification head. (#2968) Laurent Mazare 2025-05-26 08:54:31 +02:00
  • 9a62c91643 Proper support for phi-4 (#2960) Laurent Mazare 2025-05-21 10:18:33 +02:00
  • ed353eb76d revert some changes hf-papers Quentin Gallouédec 2025-05-17 03:46:18 +00:00
  • ffb8d63324 Use HF Papers Quentin Gallouédec 2025-05-17 03:41:24 +00:00
  • 92106c8762 Fixes for clippy 1.87. (#2956) Laurent Mazare 2025-05-15 21:50:27 +02:00
  • 9ce4fe6194 Fix docs quantized qwen3 (#2955) MaCAT 2025-05-15 14:58:03 +09:00
  • 450a49ed1a Olmo 2 model (#2954) Jani Monoses 2025-05-14 20:18:02 +03:00
  • 6bd61727bc Make tensor contiguous before the repeat_kv calls to avoid strided copies (#2953) Borek Požár 2025-05-14 10:47:28 +02:00
  • 485ddf2996 Fixed Quantized Qwen3 Model (#2951) Snake 2025-05-13 11:53:42 +08:00
  • 36508a2c93 Add Resize to onnx ops (#2946) Kyle Birnbaum 2025-05-09 22:05:03 -07:00
  • 3d05f5cf3d Qwen3 quantized implementation (#2939) Lucien Thomas 2025-05-08 08:06:10 -05:00
  • 5ed764213d Add dtype size to benchmark throughput calculation metal-fp8-fun Ivar Flakstad 2025-05-06 10:39:02 +02:00
  • 816aeeb7b6 Simplify casting via "mlx_cast" Ivar Flakstad 2025-05-06 10:23:24 +02:00
  • 6210fbe9d8 mlx fp8 gemm Ivar Flakstad 2025-05-06 09:55:45 +02:00
  • 637473cb5e Bump cudarc to 0.16.3. (#2942) Laurent Mazare 2025-05-04 09:14:28 +02:00
  • e27b4700ad Indexing with max-value results in zero/no-op. (#2940) Laurent Mazare 2025-05-03 11:36:31 +02:00
  • 1fdfb58de5 Updating Add qwen3 (PR 2903) to use HF weights (#2930) Kyle Birnbaum 2025-05-01 21:05:53 -07:00
  • cd96fa80da Add a scattered kv cache. (#2936) 0.9.1 Laurent Mazare 2025-05-01 10:20:48 +02:00
  • 8a19bb7df2 Bump the candle version to 0.9.1. (#2935) Laurent Mazare 2025-05-01 10:08:16 +02:00
  • 38fc86621c Add support for Helium-v1. (#2932) Laurent Mazare 2025-04-30 19:38:44 +02:00
  • 5029ac52bb Added tracing page to the candle book. (#2922) Kyle Birnbaum 2025-04-29 12:35:36 -07:00
  • de23d34a28 Switch Tensor::full to return a contiguous tensor. (#2929) Laurent Mazare 2025-04-28 21:36:39 +02:00
  • d4bac37a61 Fix the gumbel softmax by casting to f32. (#2928) Laurent Mazare 2025-04-28 19:48:51 +02:00
  • e98754fc5a Optimize Tensor::new when called on nested Vec<..>. (#2927) Laurent Mazare 2025-04-28 09:19:45 +02:00
  • e3db30021f Support for "unbatched" rope. (#2926) Laurent Mazare 2025-04-27 15:12:02 +02:00
  • 6e0646c208 Remove redundant mlx gemm dtype check (#2925) ivarflakstad 2025-04-27 06:14:57 +02:00
  • fbaf0b0e32 Bump the crate version to 0.9.0. (#2924) 0.9.0 Laurent Mazare 2025-04-26 11:01:21 +02:00
  • a2e925462c Add the scatter in place ops. (#2923) Laurent Mazare 2025-04-26 07:36:49 +02:00
  • 3827685524 Add the scatter op. (#2921) Laurent Mazare 2025-04-25 21:46:58 +02:00
  • 3aeb9575c7 Fixed Quantized Gemma3 Model and example (#2918) Kyle Birnbaum 2025-04-24 20:47:48 -07:00
  • 6ff0a6999c Fixed Gemma3 model and example (#2917) Kyle Birnbaum 2025-04-24 20:35:08 -07:00
  • 82def7ae38 Cudarc update. (#2915) Laurent Mazare 2025-04-23 07:03:26 +02:00
  • 99bd69f383 fixed quantized-gemma example (#2914) Kyle Birnbaum 2025-04-22 20:39:03 -07:00
  • a4c56a958e Add the const-set op. (#2910) Laurent Mazare 2025-04-19 10:07:02 +02:00
  • b2904a830b implemented quantized-gemma3 (#2902) Kyle Birnbaum 2025-04-18 22:46:41 -07:00
  • 21055b5697 Add PRelu operation (#2904) A2va 2025-04-19 07:24:10 +02:00
  • 9dbaf958dc Add an enum for scalar values. (#2909) Laurent Mazare 2025-04-18 22:13:38 +02:00
  • ce5f8dd129 Check the bounds in the cuda indexing kernels. (#2908) Laurent Mazare 2025-04-18 20:08:17 +02:00
  • 3b24f8f302 Add metal precompilation via build.rs metal-precompile Ivar Flakstad 2025-04-17 15:56:52 +02:00
  • 9954981327 Allow from_vec/from_slice to use a ShapeWithOneHole as shape. (#2905) Laurent Mazare 2025-04-17 08:59:18 +02:00
  • 7f0f83a7c1 Rotating kv cache positions (#2901) 0.9.0-alpha.4 Laurent Mazare 2025-04-15 23:09:26 +02:00
  • 76e565c4ab Updated candle-book: Introduction, Installation, MNIST guide, and added CONTRIBUTING.md (#2897) Kyle Birnbaum 2025-04-15 12:41:10 -07:00
  • e4e7b0b2da Use cudarc 0.16. (#2900) Laurent Mazare 2025-04-15 21:40:18 +02:00
  • 6381023982 Adding cuda feature for easier integration with extensions. tei_cudarc_freedom Nicolas Patry 2025-04-15 16:28:51 +02:00
  • b01ebbad8a Use cudarc 0.15.2. (#2896) Laurent Mazare 2025-04-14 20:47:52 +02:00
  • 1d1d6d4fe6 Bump the crate version. (#2895) 0.9.0-alpha.3 Laurent Mazare 2025-04-14 15:52:11 +02:00
  • 2653002f29 Gumbel-Softmax sampling. (#2894) Laurent Mazare 2025-04-14 15:42:42 +02:00
  • a52b76ae82 Expose the cudnn algo in the conv ops. (#2892) Laurent Mazare 2025-04-14 08:25:32 +02:00
  • 8e62723b2d Set the algo. conv1d-algo laurent 2025-04-13 20:58:18 +02:00
  • fb660b8d43 Add a cudnn feature to candle-nn/candle-transformers. (#2890) 0.9.0-alpha.2 Laurent Mazare 2025-04-13 17:43:41 +02:00
  • 2f9606b187 Exclude candle-book to avoid some CI failures. (#2889) Laurent Mazare 2025-04-13 17:11:41 +02:00
  • 83bbbc6265 Deploy f3a73f80d1 to gh-pages gh-pages Deploy from CI 2025-04-13 14:47:49 +00:00
  • f3a73f80d1 Support for cudnn conv1d. (#2888) Laurent Mazare 2025-04-13 16:47:37 +02:00
  • b44d38de0e Add the Orpheus TTS. (#2886) Laurent Mazare 2025-04-13 12:02:17 +02:00
  • d9198deb37 Im2col cuda optimization. (#2885) Laurent Mazare 2025-04-13 10:07:53 +02:00
  • 15ed0b11ce Optimize the batched matmul for the cpu backend. (#2884) Laurent Mazare 2025-04-12 21:40:40 +02:00
  • 34505fdf3a Avoid using batched-matmul in nn::Linear. (#2883) Laurent Mazare 2025-04-12 19:53:58 +02:00
  • d7b7ce16e4 Upgrade ug. (#2882) Laurent Mazare 2025-04-12 13:19:32 +02:00
  • 19fb6dac1f Bump the crate version. (#2881) Laurent Mazare 2025-04-11 22:28:21 +02:00
  • acc5bd335f Cuda cleanup. (#2880) Laurent Mazare 2025-04-11 21:43:35 +02:00
  • 543b5b5898 Update for the latest cudarc. cuda-graph-exp laurent 2025-04-11 14:02:41 +02:00
  • c87f0fa5d6 Merge remote-tracking branch 'origin/main' into cuda-graph-exp laurent 2025-04-11 13:47:35 +02:00
  • eb478ece92 Implementing DistilBertForMaskedLM. (#2866) Kyle Birnbaum 2025-04-11 04:25:39 -07:00
  • d339b01726 Fix hardcoded f32 dtype for attention_mask. Use the model dtype for compatibility. (#2872) Manpreet Singh 2025-04-08 00:12:14 -04:00
  • 2f3bf42bcb Support more snac variants. (#2871) Laurent Mazare 2025-04-07 08:23:47 +02:00
  • e3370c6316 Add the SNAC audio tokenizer. (#2869) Laurent Mazare 2025-04-06 22:15:36 +02:00
  • 338f6a102e Clippy 1.86 fixes for cuda. (#2868) Laurent Mazare 2025-04-05 15:45:35 +02:00
  • bc33df77e1 Add the missing voices for CSM. (#2867) Laurent Mazare 2025-04-05 06:52:36 +02:00
  • cf9d7bf24c Add the CSM model. (#2862) 0.9.0-alpha.1 Laurent Mazare 2025-04-04 06:48:03 +02:00
  • 9d31361c4f Fix for clippy 1.86. (#2864) Laurent Mazare 2025-04-03 19:38:27 +02:00
  • 5341bf4cd5 Fixes for clippy 1.86. fix-1.86 laurent 2025-04-03 19:30:20 +02:00
  • 8977c31b6d Generate some audio file. laurent 2025-04-03 19:16:49 +02:00
  • 3be12b8b50 Autoregressive generation. laurent 2025-04-03 18:38:00 +02:00
  • 825119ac4b Rope fix. laurent 2025-04-03 18:01:25 +02:00
  • e319cd78d9 Get the sampling to work. laurent 2025-04-03 14:58:44 +02:00
  • 3fb67e0c2c Add frame generation. laurent 2025-04-03 13:41:16 +02:00
  • d72c44705c Load the text tokenizer. laurent 2025-04-03 12:25:41 +02:00
  • 2203f0e3c9 Add some code to load the model. laurent 2025-04-03 12:20:21 +02:00
  • 01e895c1aa Add the CSM model. laurent 2025-04-03 12:04:44 +02:00
  • 648596c073 Added readmes to examples (#2835) Kyle Birnbaum 2025-04-03 00:18:29 -07:00
  • d9904a3baf Update to cudarc 0.14 (breaking change). (#2858) Laurent Mazare 2025-04-03 09:12:19 +02:00
  • d6db305829 Added new language pairs to marian-mt example. (#2860) Kyle Birnbaum 2025-04-02 14:50:14 -07:00
  • b4daa03e59 add as_cuda_slice_mut to CudaStorage and CudaDType (#2859) Zack Angelo 2025-04-01 12:34:52 -05:00
  • 9541467d6b Add flip to tensor (#2855) Bryan Lee 2025-04-01 03:07:16 -04:00
  • 6429609090 Added Deepseekr1 Llama8b variant to quantized example (#2842) Kyle Birnbaum 2025-03-30 01:55:21 -07:00
  • ba473290da Added DeepseekR1 Qwen7B variant to quantized-qwen2-instruct example (#2843) Kyle Birnbaum 2025-03-30 01:54:22 -07:00
  • 59c26195db Fix CIFAR10 dataset types and dimension ordering (#2845) Bryan Lee 2025-03-30 04:53:25 -04:00
  • ec6d7ca773 Cudarc static-linking enabled. cudarc_freedom Nicolas Patry 2025-03-29 09:27:53 +01:00
  • 2c0f6b008e Fixing order. mkl_link_freedom Nicolas Patry 2025-03-28 11:43:33 +01:00
  • 9862cd3ba2 Splitting the features to enable different mkl linking. Nicolas Patry 2025-03-28 10:13:13 +01:00
  • 2e273ddf31 Fixing the mkl dependency hell. fix_mkl_feature Nicolas Patry 2025-03-27 18:01:21 +01:00
  • cb02b389d5 Fix reinforcement learning example (#2837) LongYinan 2025-03-26 08:27:45 -07:00
  • 0d4097031c fixed rand import for mnist-training (#2833) Kyle Birnbaum 2025-03-26 00:10:03 -07:00
  • 10853b803c fixed rand imports for whisper-microphone example (#2834) Kyle Birnbaum 2025-03-26 00:09:27 -07:00