Commit Graph

  • 9694671bbf Not implementing quantized. Nicolas Patry 2024-01-15 18:00:43 +01:00
  • 3dbf65ef20 Rebase after phi2 merge + fix replit default to CPU. Nicolas Patry 2024-01-15 17:52:49 +01:00
  • b2db5adf82 Bad code removal. Nicolas Patry 2024-01-15 17:25:06 +01:00
  • 9ef040338d After rebase. Nicolas Patry 2024-01-11 17:17:55 +01:00
  • 3aefc709c7 Cleanup the fence. Nicolas Patry 2024-01-05 21:57:07 +01:00
  • c8c603ce96 Removing the fences speeds everything up and *is* correct this time... Nicolas Patry 2024-01-05 19:26:30 +01:00
  • 61ad8d91cc Fix the rebase. Nicolas Patry 2024-01-05 14:31:39 +01:00
  • 2cd1e59c9e Cleanup. Nicolas Patry 2024-01-05 14:28:08 +01:00
  • 9c4b4f0da0 Metal quantized modifications proposal. Nicolas Patry 2023-11-18 14:49:30 +01:00
  • 5637f86040 Update yew requirement from 0.20.0 to 0.21.0 (dependabot/cargo/yew-0.21.0) dependabot[bot] 2024-01-15 12:25:36 +00:00
  • ea36f3b11f Use the new phi model by default. (#1589) Laurent Mazare 2024-01-15 12:30:27 +01:00
  • 79478ff5a1 Seed should be updated by random kernel result. Ivar Flakstad 2024-01-14 18:10:54 +01:00
  • 86b7c01b30 Update gemm to the latest version. (#1587) Laurent Mazare 2024-01-15 09:44:51 +01:00
  • bdd8107fda Expose the ndarray trait. (#1586) Laurent Mazare 2024-01-14 20:09:49 +01:00
  • ecf88a6d38 Merge branch 'main' into ivarflakstad/metal-prng Ivar Flakstad 2024-01-14 17:10:54 +01:00
  • e6d86b0819 Add the pow operator. (#1583) Laurent Mazare 2024-01-13 20:24:06 +01:00
  • 88618255cb Fix the rotary embeddings for the new phi implementation. (#1582) Laurent Mazare 2024-01-13 19:44:41 +01:00
  • 539ead927a Update the Phi model to use the updated architecture. (#1580) Laurent Mazare 2024-01-13 17:38:27 +01:00
  • a46864bd56 Fix "Minimal Mamba" link in README. (#1577) SebastianRueClausen 2024-01-12 17:47:07 +01:00
  • bafe95b660 Fix format. (#1576) Nicolas Patry 2024-01-12 14:23:17 +01:00
  • a3d92ab226 Metal: Activate bfloat affine and add benchmark (#1543) ivarflakstad 2024-01-12 11:19:49 +01:00
  • e90bcdcc7c Metal: f16 and bf16 where_cond + benchmark (#1545) ivarflakstad 2024-01-12 11:18:11 +01:00
  • 8e06bfb4fd Mention VGG in the readme. (#1573) Laurent Mazare 2024-01-12 09:59:29 +01:00
  • 6242276c09 Pin the revision used for phi-v2 + make it the default. (#1572) Laurent Mazare 2024-01-12 09:19:30 +01:00
  • e06e8d0dbe fmt Ivar Flakstad 2024-01-12 07:26:42 +01:00
  • 36ce0988c0 Merge branch 'main' into ivarflakstad/metal-fill Ivar Flakstad 2024-01-12 07:25:32 +01:00
  • e63bb8661b Merge branch 'main' into ivarflakstad/metal-prng Ivar Flakstad 2024-01-12 07:19:58 +01:00
  • 41915184bb Bugfix for dequantizing q5k layers. (#1569) Laurent Mazare 2024-01-11 23:15:11 +01:00
  • c1876b8041 Merge pull request #1567 from bayedieng/close-ifdef ivarflakstad 2024-01-11 22:14:38 +01:00
  • 85e5680277 remove metal version check Baye Dieng 2024-01-11 21:02:03 +00:00
  • 1327419776 close ifdef Baye Dieng 2024-01-11 17:14:12 +00:00
  • 402349d120 feat(bf16): add cast support + tests for cast + bin ops (#1524) Kyle McCarthy 2024-01-11 08:49:13 -06:00
  • 9f0c99f0c1 Separate benchmarks by enabled features (#1538) ivarflakstad 2024-01-11 15:35:38 +01:00
  • 0fc95c9f0c Add a dequantize command to tensor-tools. (#1565) Laurent Mazare 2024-01-11 11:21:01 +01:00
  • 2480c5dbdd Add RepVGG model. (#1561) Jani Monoses 2024-01-11 08:07:40 +02:00
  • 63944714f2 Use candle_nn::embedding instead of local copies in a few models. (#1562) Jani Monoses 2024-01-10 22:36:27 +02:00
  • d3bdd788cf Use __HAVE_BFLOAT__ to check for bfloat support instead of metal version check (#1540) ivarflakstad 2024-01-10 18:50:30 +01:00
  • ae06cb74bb Add relu kernel for metal (#1488) Juarez Bochi 2024-01-10 12:27:17 -05:00
  • a897fda74e Update memmap2 requirement from 0.7.1 to 0.9.3 (#1556) dependabot[bot] 2024-01-10 16:27:59 +01:00
  • 1f1179913a Update gloo requirement from 0.8 to 0.11 (#1558) dependabot[bot] 2024-01-10 16:27:20 +01:00
  • 6e98cf2a92 Update cudarc requirement from 0.9.14 to 0.10.0 (#1559) dependabot[bot] 2024-01-10 16:27:05 +01:00
  • 2cc1247999 Update tokenizers requirement from 0.13.4 to 0.15.0 (#1555) dependabot[bot] 2024-01-10 16:26:53 +01:00
  • cdbdb4af9c Update yew-agent requirement from 0.2.0 to 0.3.0 (dependabot/cargo/yew-agent-0.3.0) dependabot[bot] 2024-01-10 14:14:03 +00:00
  • edf3fcd1c4 fix: deprecated option field (open-pull-requests-limit-per-dependency) (#1554) darker 2024-01-10 15:12:46 +01:00
  • 53e4755015 feat: add dependabot to the project (#1553) darker 2024-01-10 14:57:20 +01:00
  • 87efb5d8eb Updated feature separated benchmarks Ivar Flakstad 2024-01-09 19:04:31 +01:00
  • ad181f9cdc Merge branch 'ivarflakstad/seperate-benchmarks-by-feature' into ivarflakstad/metal-prng Ivar Flakstad 2024-01-09 18:55:40 +01:00
  • 45936a18f8 Update with feature separated benchmarks Ivar Flakstad 2024-01-09 18:54:48 +01:00
  • 4462198bc1 Merge branch 'ivarflakstad/seperate-benchmarks-by-feature' into ivarflakstad/metal-fill Ivar Flakstad 2024-01-09 18:48:51 +01:00
  • 88945f2c22 Improve benchmarks layout Ivar Flakstad 2024-01-09 18:31:28 +01:00
  • 12b2a337f3 Handle start-offset when loading a tensor from a pickle file. (#1546) Laurent Mazare 2024-01-08 09:20:48 +01:00
  • fb05af4c42 Avoid some unnecessary returns. Laurent 2024-01-08 07:19:59 +01:00
  • ad075a5f7e Remove allow pragma Ivar Flakstad 2024-01-08 06:48:33 +01:00
  • c2261d0222 Merge. (bug_q4k) Laurent 2024-01-07 20:27:33 +01:00
  • 0eb90ed783 Simpler repro for the neon optimization issue + bugfix (#1544) Laurent Mazare 2024-01-07 20:21:49 +01:00
  • 89b5a06858 Use bindgen-cuda for the custom-kernel example. (#1536) Laurent Mazare 2024-01-07 17:18:46 +01:00
  • 3f04a79ada Use cfg to separate benchmark results based on features Ivar Flakstad 2024-01-07 14:40:15 +01:00
  • 30313c3081 Moving to a proper build crate bindgen_cuda. (#1531) Nicolas Patry 2024-01-07 12:29:24 +01:00
  • e72d52b1a2 Unpin more of the workplace relative dependencies. (#1535) Laurent Mazare 2024-01-07 12:26:20 +01:00
  • b4cb982e49 Simplifying our internal cargo dependencies. (#1529) Nicolas Patry 2024-01-07 12:04:14 +01:00
  • 6ebe043273 Merge branch 'main' into ivarflakstad/metal-prng Ivar Flakstad 2024-01-07 11:52:03 +01:00
  • 6bf52b9fdf Gaussian normal distribution of PRNG via Box-Muller transform Ivar Flakstad 2024-01-05 21:18:12 +01:00
  • 06d186355b Change the test more consistently. Nicolas Patry 2024-01-06 15:20:55 +01:00
  • 2bbd544832 Non random for better quantization quality Nicolas Patry 2024-01-06 15:16:01 +01:00
  • 84250bf52f fix index_pos bug when kv cache is disabled. (#1517) optman 2024-01-06 18:43:01 +08:00
  • 9cd0cc1f65 Ignore rotary for mistral. (tmp_no_rotary) Nicolas Patry 2024-01-05 21:55:13 +01:00
  • 12fc4af8f2 Ignore rotary. Nicolas Patry 2024-01-05 21:38:39 +01:00
  • 9130b6c4b6 Removing the fences speeds everything up and *is* correct this time... Nicolas Patry 2024-01-05 19:26:30 +01:00
  • 8d1a57c9a0 chore: update flash attention kernels (#1518) OlivierDehaene 2024-01-05 18:28:55 +01:00
  • 7b4389099a Fix the rebase. Nicolas Patry 2024-01-05 14:31:39 +01:00
  • 6f8584091e Cleanup. Nicolas Patry 2024-01-05 14:28:08 +01:00
  • f97fcd4712 Metal quantized modifications proposal. Nicolas Patry 2023-11-18 14:49:30 +01:00
  • 504d0b9ac7 Potential bug on q4k. Nicolas Patry 2024-01-05 14:13:22 +01:00
  • 955e63c803 Implement hybrid Tausworthe + LCG pseudo random number generator in metal Ivar Flakstad 2024-01-05 13:27:59 +01:00
  • 3a7304cb0d add link to gpt-from-scratch-rs (#1525) Jeroen Vlek 2024-01-05 11:59:46 +01:00
  • fa3ea98ba9 Adding bfloat16 support for the cast kernels. (#1520) Nicolas Patry 2024-01-04 12:12:56 +01:00
  • e8e24f1284 Follow crate conventions Ivar Flakstad 2024-01-01 20:37:56 +01:00
  • 6eb44d1bce Added fill bench Ivar Flakstad 2024-01-01 20:22:44 +01:00
  • 135ae5f3eb Simplify the one-hot implementation, support arbitrary rank. (#1514) Laurent Mazare 2024-01-01 11:40:17 +01:00
  • 41614b4a9b Add one-hot/cold encoding (#1489) Ryan Tate 2024-01-01 02:18:40 -08:00
  • 03ce8caf40 Format properly the Stable Diffusion example run with params (#1511) stano 2024-01-01 12:13:35 +02:00
  • b0fe5e4453 Do not implement Module for BatchNorm. (#1513) Laurent Mazare 2024-01-01 10:13:13 +01:00
  • 1fb2dd905c Add support for tiny-llama-1.1b. (#1512) Laurent Mazare 2023-12-31 12:18:25 +01:00
  • a0facd0e67 Small tweaks to batch-norm. (#1505) Laurent Mazare 2023-12-30 17:06:07 +01:00
  • 4290b81244 [Breaking] Add training to batchnorm with exponential moving average (#1504) nkoppel 2023-12-30 15:42:08 +00:00
  • 51e577a682 Add Policy Gradient to Reinforcement Learning examples (#1500) s-casci 2023-12-30 09:01:29 +01:00
  • 0a245e6fa4 Metal: support unary abs (#1503) Gonzalo 2023-12-29 20:00:12 -03:00
  • 87d7f81b43 Metal: more u8/u32 (#1502) Gonzalo 2023-12-29 19:56:21 -03:00
  • 4373534d59 Metal: i64 basic support (#1495) Gonzalo 2023-12-29 15:42:50 -03:00
  • 7fc26764b6 Implement generic fill. u8 uses speedy blit encoder Ivar Flakstad 2023-12-29 16:02:29 +01:00
  • f4a2787217 Merge pull request #1498 from huggingface/debugging_windows_ci Nicolas Patry 2023-12-29 12:33:50 +01:00
  • 0a29d2e9b8 Add fill kernel handler Ivar Flakstad 2023-12-29 12:27:12 +01:00
  • 488e02a3f6 Merge pull request #1496 from bayedieng/unary Nicolas Patry 2023-12-29 12:20:52 +01:00
  • adc95ca2bf Ignore skipped. Nicolas Patry 2023-12-29 12:15:57 +01:00
  • 4907c63ea1 Ignore stop on remote forks. Nicolas Patry 2023-12-29 12:12:10 +01:00
  • d76ac20e0e Fix. Nicolas Patry 2023-12-29 12:06:38 +01:00
  • f5c98f22c7 Merge pull request #1491 from mimiquate/metal-errors Nicolas Patry 2023-12-29 12:03:40 +01:00
  • fd9bf3bcdd remove stray # Ivar Flakstad 2023-12-29 11:39:49 +01:00
  • 90c74e199c Add metal fill kernel Ivar Flakstad 2023-12-29 11:38:13 +01:00
  • 5b12fbb143 Trying to fix flakiness by making hub_2 and hub_3 serial tests (potential issue on mingw with mmap). Nicolas Patry 2023-12-29 11:13:33 +01:00