Commit Graph

  • 29e25c458d FastViT fixes. (#2452) Jani Monoses 2024-08-28 12:20:09 +03:00
  • aafa24ed93 Update cudarc to 0.12. (#2451) Laurent Mazare 2024-08-27 09:10:30 +01:00
  • c3b0757995 Testing ushort intermediate in case combo of async and f16/bf16 is the issue Ivar Flakstad 2024-08-24 16:59:33 +02:00
  • fdc2622686 fix: qwen2 lm_head loading #2443 (#2445) ilookee 2024-08-23 22:50:02 +08:00
  • ccdbe87639 Add FastViT model. (#2444) Jani Monoses 2024-08-23 17:06:54 +03:00
  • 2ec8729d51 Fix for parler-tts, do not add the last slice of padding tokens. (#2442) Laurent Mazare 2024-08-22 22:22:03 +01:00
  • e3c146ada6 silero-vad v5 example (#2321) shua 2024-08-22 22:50:42 +02:00
  • 1e96b8b695 onnx: support negative index in Gather (#2440) shua 2024-08-22 15:28:25 +02:00
  • a8288b7a72 onnx: workaround pow with negative base (#2439) shua 2024-08-22 13:34:53 +02:00
  • 6070278a31 Bump the version to 0.6.1. (#2438) Laurent Mazare 2024-08-22 08:23:52 +01:00
  • 560f666d29 Alter metal simdgroup matrix load/store ops Ivar Flakstad 2024-08-21 08:57:50 +02:00
  • b47c0bc475 Update README.md (#2435) Laurent Mazare 2024-08-19 08:34:24 +01:00
  • 14fd2d97e0 Add a readme for the parler-tts example. (#2434) Laurent Mazare 2024-08-19 08:30:12 +01:00
  • 31a1075f4b onnx: implement LSTM op (#2268) shua 2024-08-19 09:06:17 +02:00
  • 236b29ff15 Add the DAC model. (#2433) Laurent Mazare 2024-08-19 07:59:51 +01:00
  • 58197e1896 parler-tts support (#2431) Laurent Mazare 2024-08-18 19:42:08 +01:00
  • 736d8eb752 Stream tensor (#2429) Laurent Mazare 2024-08-17 20:54:28 +01:00
  • 7cff5898ec Support Minus(u) for arbitrary values of u, e.g. Minus(3). (#2428) Laurent Mazare 2024-08-17 20:29:01 +01:00
  • b75ef051cf Fix the marian tokenizer importer. (#2426) Laurent Mazare 2024-08-17 19:58:40 +01:00
  • c1b9e07e35 Add support for gemma-2. (#2425) Laurent Mazare 2024-08-17 19:31:23 +01:00
  • 69fdcfe96a Apply rustfmt. (#2421) Laurent Mazare 2024-08-16 17:57:14 +01:00
  • 2b75dd9551 Fix build issue in EOS Token in llama-multiprocess (#2420) Hadi 2024-08-16 12:46:31 -04:00
  • 53ce65f706 Clippy fixes. (#2415) Laurent Mazare 2024-08-14 09:13:53 +01:00
  • 68aa9c7320 Fix the device for the bert attention mask. (#2414) Laurent Mazare 2024-08-14 09:01:12 +01:00
  • 35e5f31397 Add Based LLM from Hazy Research. (#2411) Jani Monoses 2024-08-12 22:21:19 +03:00
  • d3fe989d08 Add documentation examples for Tensor::i and Tensor::narrow methods (#2308) Carsten Csiky 2024-08-10 08:11:09 +02:00
  • 14db029494 Soft Non-Maximum Suppression (#2400) Matthew O'Malley-Nichols 2024-08-09 22:57:52 -07:00
  • 6e6c1c99b0 Fix issues in the encodec example README.md (#2407) Joel Nises 2024-08-10 07:49:05 +02:00
  • 16b49dfd89 slight changes to async ops Ivar Flakstad 2024-08-08 12:30:22 +02:00
  • fc0deede31 See if M1/M2 likes metal sources better than metallibs compiled on M3 Ivar Flakstad 2024-08-08 09:32:13 +02:00
  • e178aacead Re-revert the reverted revision (bf16 gemm metal) Ivar Flakstad 2024-08-06 12:37:36 +02:00
  • b7d9af00cc fix: usage of actions/checkout@v2 (#2403) Hamir Mahal 2024-08-06 01:59:34 -07:00
  • 59bbc0d287 Add the import script for the T5 tokenizer. (#2399) Laurent Mazare 2024-08-05 20:03:31 +01:00
  • dfdce2b602 Add the MMDiT model of Stable Diffusion 3 (#2397) Czxck001 2024-08-05 10:26:15 -07:00
  • 500c9f2882 add models support and example for THUDM/glm-4 (#2362) 唐璜 2024-08-05 11:48:09 -04:00
  • bb191c25d5 Update metallib with device matrix offsets Ivar Flakstad 2024-08-05 15:05:00 +02:00
  • 2be9bd211e Support for mistral-nemo. (#2396) Laurent Mazare 2024-08-04 18:52:40 +01:00
  • 89eae41efd Support the flux-dev model too. (#2395) Laurent Mazare 2024-08-04 11:16:24 +01:00
  • c0a559d427 optimize gradient for silu a bit (#2393) MilkFather 2024-08-04 17:24:17 +08:00
  • aa7ac1832d Simplify handling of flux modulations. (#2394) Laurent Mazare 2024-08-04 10:09:54 +01:00
  • 19db6b9723 Add the flux model for image generation. (#2390) Laurent Mazare 2024-08-04 07:14:33 +01:00
  • 0fcb40b229 Revert the bf16 gemm metal changes for now. (#2386) Laurent Mazare 2024-08-01 22:08:47 +01:00
  • 6991a37b94 update: LSTMState and GRUState fields to be public (#2384) Justin Sing 2024-08-01 10:30:32 -04:00
  • 9ca277a9d7 Fix cargo fmt. (#2383) Laurent Mazare 2024-08-01 13:19:41 +01:00
  • 2e9c010609 Jina Bert Example fix and more configuration (#2191) Joan Fontanals 2024-08-01 06:59:20 -05:00
  • ac51f477eb Add Hiera vision model. (#2382) Jani Monoses 2024-08-01 12:59:22 +03:00
  • d4b6f6eef6 Add a minimal test for the metal bf16 matmul. (#2381) Laurent Mazare 2024-08-01 10:22:46 +01:00
  • 957d604a78 Enable BF16 on metal. (#2380) Laurent Mazare 2024-08-01 10:05:07 +01:00
  • ce90287f45 Add get_ids to GradStore (#2379) Takanori MAEHARA 2024-08-01 09:56:13 +01:00
  • 1ba87a9450 Use BF16 on metal when possible. (#2378) Laurent Mazare 2024-08-01 09:48:58 +01:00
  • bd80078acf Fix log_sum_exp to handle large positive/negative inputs (#2367) Yun-Jhong Wu 2024-08-01 03:37:02 -05:00
  • fea46cb719 Metal bgemm min changes (#2364) ivarflakstad 2024-08-01 16:06:04 +08:00
  • 8696cf6494 Enable the affine kernel for u8/u32. (#2376) Laurent Mazare 2024-08-01 09:03:11 +01:00
  • 4a52aeb437 bert attention mask (#1934) Zheng Li 2024-08-01 14:26:19 +08:00
  • 24d54d0ff9 Bump image crate version so ImageReader is available without aliasing (#2365) ivarflakstad 2024-07-29 23:41:33 +08:00
  • 636eff652a change DTypes (fixes #2355) (#2363) Jacob Marshall 2024-07-28 13:36:05 +01:00
  • 0f5cbb08b3 Add support for Llama 3.1 (#2359) Eric Buehler 2024-07-26 15:32:26 -04:00
  • 9105aa4390 batched gemm work [metal-gemm] Ivar Flakstad 2024-07-26 18:53:58 +02:00
  • ddafc61055 Use RAII for terminating the encoding. (#2353) Laurent Mazare 2024-07-24 15:29:56 +01:00
  • a925ae6bc6 Use a trait for the encoder provider (so that encoder can ultimately be reused). (#2352) Laurent Mazare 2024-07-24 08:27:30 +01:00
  • 6056fd5c90 onnx: fix pad, unsqueeze (#2317) shua 2024-07-23 23:10:57 +02:00
  • ebc9aa60bc fix clip example title (#2345) Caio Petrucci Rosa 2024-07-23 17:55:18 -03:00
  • 2489a606fe feat(candle-transformers/models/codegeex4-9b): add codegeex4-9 (#2334) donjuanplatinum 2024-07-21 19:00:41 +08:00
  • 3c815b1dca Pin the revision used by moondream. (#2340) Laurent Mazare 2024-07-18 09:49:46 +01:00
  • 42891cc613 Add mathstral in the examples. (#2339) Laurent Mazare 2024-07-18 07:24:49 +01:00
  • f25173d68b Fix for backprop in ConvTranspose2D with stride of 2 (#2337) Ivor Wanders 2024-07-17 13:22:23 -04:00
  • 2a2a349fd4 Rustier ffi impls Ivar Flakstad 2024-07-17 13:41:26 +08:00
  • 6a4741bbf9 Fix Elu gradient NaN on large input (#2328) Alexey Gerasev 2024-07-16 19:41:16 +07:00
  • c87dd386a9 Get mac gpu core count via ffi Ivar Flakstad 2024-07-16 20:25:34 +08:00
  • 30cdd769f9 Update the flash attn kernels. (#2333) Laurent Mazare 2024-07-15 20:37:36 +02:00
  • d74fbed334 Pinning cudarc to 0.11.6 (#2332) Josh Collyer 2024-07-15 14:29:08 +01:00
  • f4b1597b5d gemm impl translation Ivar Flakstad 2024-07-15 19:18:21 +08:00
  • c63048d374 add quantized qwen2 (#2329) Zhuo Jinggang 2024-07-12 16:00:03 +08:00
  • ea578478d4 Initial generic metallib build.rs script Ivar Flakstad 2024-07-11 18:00:03 +08:00
  • a226a9736b Add Mobilenet v4 (#2325) Jani Monoses 2024-07-09 14:52:20 +03:00
  • 25960676ca Add a basic metal example with capture (#2324) Laurent Mazare 2024-07-09 12:38:11 +02:00
  • 9cd54aa5d4 Add EVA-02 model ( https://arxiv.org/abs/2303.11331 ) (#2311) v-espitalier 2024-07-07 20:09:31 +02:00
  • eec11ce2ce onnx: implement Size op (#2316) shua 2024-07-07 19:56:36 +02:00
  • 9182f9f5c2 ignore editor config folders (#2315) shua 2024-07-07 19:43:48 +02:00
  • ecff05d72b Beit: Add the gen_relative_position_index() function (#2306) v-espitalier 2024-07-04 09:45:26 +02:00
  • 7f1ba8038c Add Beit model ( https://arxiv.org/abs/2106.08254 ) (#2305) v-espitalier 2024-07-01 22:11:48 +02:00
  • 74e9e41911 make up for the missing last token output of phi2 example (#2299) Czxck001 2024-06-29 12:34:42 -07:00
  • e27aac0a06 Add DINOv2Reg4 + PlantCLEF2024 (#2293) v-espitalier 2024-06-29 11:49:15 +02:00
  • a3dd87f15e Adding Gemm and ArgMax operators to candle-onnx (#2231) [0.6.0] drCathieSo.eth 2024-06-29 03:40:31 +08:00
  • 242e006bbb Depth Anything v2 (#2279) Jeroen Vlek 2024-06-24 19:12:52 +02:00
  • 6baa1d486b Fix a bug in the metal implemtation of col2im1d. (#2284) Laurent Mazare 2024-06-22 23:21:20 +02:00
  • 36cf54525d Fix the fast bf16 gemm cublas kernels. (#2274) Laurent Mazare 2024-06-18 23:46:58 +02:00
  • 2b10aaa05d implement Slice op (#2260) shua 2024-06-12 08:15:32 +02:00
  • 9f804af29d feat(ci): add trufflehog secrets detection (#2262) Luc Georges 2024-06-10 22:03:54 +02:00
  • 54ff971e35 Support for the new Qwen2 models. (#2257) Laurent Mazare 2024-06-07 10:51:50 +01:00
  • b9fac7ec00 implement if, and pad reflect mode (#2251) shua 2024-06-06 22:36:23 +02:00
  • f65e90e7ef Bump the crate version. (#2248) Laurent Mazare 2024-06-05 15:49:15 +02:00
  • d39462856b Apply rustfmt. (#2247) Laurent Mazare 2024-06-04 22:54:09 +02:00
  • cb180eb23a ONNX: add ArgMin, ArgMax and LeakyRelu (#2246) B1rtek 2024-06-04 22:49:02 +02:00
  • 56a1b7d97e Apply rustfmt. [operators-argmin-argmax-leakyrelu] Laurent 2024-06-04 22:47:20 +02:00
  • 47c7ecc948 Merge branch 'refs/heads/leaky_relu' into operators-argmin-argmax-leakyrelu b1rtek 2024-06-04 21:13:38 +02:00
  • c441716bd2 Fix a weird automatic RustRover change b1rtek 2024-06-04 21:13:30 +02:00
  • a5b81e2c02 Merge branch 'refs/heads/argmin-argmax' into operators-argmin-argmax-leakyrelu b1rtek 2024-06-04 21:09:59 +02:00
  • 9182c828e6 Automatically upcast for to_u64 (#2244) Eric Buehler 2024-06-04 05:32:36 -04:00
  • 3f13ad3d79 Fix dataset id for MNIST (#2238) Taylor Ninesling 2024-06-03 23:27:24 -05:00