Commit Graph

  • 918136ba46 add quantized rwkv v5 model (#1743) Jack Shih 2024-02-26 04:43:40 +08:00
  • 1a6043af51 Tweak the VarMap set type. (#1758) Laurent Mazare 2024-02-25 20:50:08 +01:00
  • 2f22afd80e Cuda acceleration for quantized model. (#1754) Laurent Mazare 2024-02-25 18:11:47 +01:00
  • 8d04f70f4d Fix the eos token for gemma. (#1753) Laurent Mazare 2024-02-24 11:07:02 +01:00
  • eeb7e2b683 Apply rustfmt to the newly added tests. (#1749) Laurent Mazare 2024-02-23 06:48:28 +01:00
  • 11ea7aac4d tests (#1724) Sacha Arbonel 2024-02-23 11:05:46 +05:30
  • 32eb56d6b3 Fix typo in README (#1740) Daniel Varga 2024-02-22 12:35:26 +01:00
  • 28057781aa Make the cache for the llama model explicit too. (#1745) Laurent Mazare 2024-02-22 12:04:33 +01:00
  • 544018b6d0 Explicit caching in llama2.c. laurent 2024-02-22 10:22:03 +01:00
  • c753f72c85 Support for attention bias in gemma + refactor things a bit. (#1744) Laurent Mazare 2024-02-22 09:35:28 +01:00
  • 8013b50829 Add grads for interpolate1d (#1742) Kirpal Grewal 2024-02-22 07:44:01 +00:00
  • 45d5322d62 Add the Gemma models. (#1741) Laurent Mazare 2024-02-21 22:02:50 +01:00
  • a2cb2edead Add a couple backtraces on cpu errors. (#1738) Laurent Mazare 2024-02-20 19:54:13 +01:00
  • fc67d878bb Bugfix for conv-transpose1d (#1734) Laurent Mazare 2024-02-19 09:04:49 +01:00
  • 3ba37443e5 Bugfix for applying the bias in conv1d-transpose. (#1732) Laurent Mazare 2024-02-18 22:51:20 +01:00
  • 1fb728772d Support for groups in conv-transpose1d. (#1731) Laurent Mazare 2024-02-18 21:28:07 +01:00
  • cb86b0c82c Fix float unpickling. (#1730) Laurent Mazare 2024-02-18 19:33:55 +01:00
  • 6284ad784c Module implementation for options. (#1728) Laurent Mazare 2024-02-18 14:12:55 +01:00
  • 678d44a7f6 Expose the weights and biases in transposed convolutions. (#1727) Laurent Mazare 2024-02-18 10:35:01 +01:00
  • 41416d2376 Expose more conv1d functions/structs. (#1726) Laurent Mazare 2024-02-17 18:50:55 +01:00
  • 5ebcfeaf0f Make the r, k, v tensors contiguous. (#1719) Laurent Mazare 2024-02-16 09:17:35 +01:00
  • 7c7400fb63 Use the tokenizer-output-stream in the llama example. (#1715) Laurent Mazare 2024-02-15 16:47:33 +01:00
  • 3f3730b657 Preliminary implementation for the vocos model. (branch: vocos) laurent 2024-02-14 22:16:09 +01:00
  • 058a910d0e Add a readme for rwkv. (#1712) Laurent Mazare 2024-02-14 15:31:33 +01:00
  • 26fe162ab5 Custom tokenizer for rwkv. (#1711) Laurent Mazare 2024-02-14 15:11:38 +01:00
  • 121a71e01f Fix the silu cuda kernel. (#1710) Laurent Mazare 2024-02-14 11:08:18 +01:00
  • 2d5f2a728d Add the RWKV model (v5). (#1707) Laurent Mazare 2024-02-14 10:58:32 +01:00
  • 68f7655895 Add ConvNeXt-V2 and smaller model variants. (#1709) Jani Monoses 2024-02-14 11:53:07 +02:00
  • b60064780d feat: add silu activation function (#1706) OlivierDehaene 2024-02-14 10:27:22 +01:00
  • e2bf0adc2a [WIP] Bf16 support. (branch: bf16_metal) Nicolas Patry 2024-02-13 22:44:11 +01:00
  • 14010a8498 Update our cuda runner. (#1705) Nicolas Patry 2024-02-13 19:06:15 +01:00
  • 0de0795220 Qmetal tweaks (#1704) Laurent Mazare 2024-02-13 18:11:17 +01:00
  • c1b418586c Fixing quantized llama demo on metal. (#1703) Nicolas Patry 2024-02-13 16:28:56 +01:00
  • ad73e93da2 Detach the tensors on batch-norm eval. (#1702) Laurent Mazare 2024-02-13 14:26:32 +01:00
  • 13c67226e6 feat: support microphone whisper streaming (#1678) drbh 2024-02-12 12:01:21 -05:00
  • d0aa197b07 ConvTranspose1d cuda support. (#1697) Laurent Mazare 2024-02-12 15:03:18 +01:00
  • 274bf11633 Support defaultdict in PyTorch checkpoints. (#1696) Laurent Mazare 2024-02-12 10:26:56 +01:00
  • 1e26d539d9 Improved mamba model optimized for inference (#1694) Laurent Mazare 2024-02-11 17:04:57 +01:00
  • 74497e6bf7 Fixing the qwen tokenizer location. (#1693) Nicolas Patry 2024-02-11 08:52:36 +01:00
  • 8ab384e63d docs: add trocr examples (#1692) Todsaporn Banjerdkit 2024-02-10 22:14:50 +07:00
  • 27ffd644a9 Mention TrOCR in the readmes. (#1691) Laurent Mazare 2024-02-10 15:49:38 +01:00
  • bf20cc854c Support sinusoidal embeddings in trocr. (#1690) Laurent Mazare 2024-02-10 15:17:51 +01:00
  • 42ce593ec6 Use the repo config for trocr rather than hardcoding it + small tweaks. (#1689) Laurent Mazare 2024-02-10 13:15:03 +01:00
  • 67589791d2 Remove the unused pragma in vit + handle the final layernorm. (#1688) Laurent Mazare 2024-02-10 11:08:50 +01:00
  • 1c8d61f051 ChatGLM custom tokenizer. (#1687) Laurent Mazare 2024-02-10 10:47:04 +01:00
  • 90447bc993 Add the custom tokenizer. (#1686) Laurent Mazare 2024-02-09 17:36:50 +01:00
  • 40ce16001b Use the proper endoftext token for gwen. (#1685) Laurent Mazare 2024-02-09 17:02:03 +01:00
  • 5657e596cd Add the Qwen2 model (#1684) Laurent Mazare 2024-02-09 15:02:49 +01:00
  • 0dee8ea19b Add the ChatGLM model. (#1237) Laurent Mazare 2024-02-09 11:51:38 +01:00
  • 9cadd4e644 feat: support multithread spectrogram and small perf tweaks (#1674) drbh 2024-02-08 15:54:12 -05:00
  • 020a979de2 Fix clippy lints for 1.76. (#1682) Laurent Mazare 2024-02-08 16:48:47 +01:00
  • cdc3823d8f Pickle support: dig within the _rebuild_parameter calls. (#1681) Laurent Mazare 2024-02-08 13:09:49 +01:00
  • e5eb9602d0 Add support for loading Fortran contiguous tensors (#1672) Dilshod Tadjibaev 2024-02-07 14:49:59 -06:00
  • b75e8945bc Enhance pickle to retrieve state_dict with a given key (#1671) Dilshod Tadjibaev 2024-02-06 14:17:33 -06:00
  • a90fc5ca5a Add VarBuilder::from_backend (#1670) Daniël de Kok 2024-02-06 15:26:11 +01:00
  • adfae2460a Fix rustfmt. (#1669) Laurent Mazare 2024-02-06 12:06:06 +01:00
  • 678f64dd27 Fix token generation in bilingual models (non-English outputs) (#1668) Guoqing Bao 2024-02-06 19:03:53 +08:00
  • b545f54a19 Fix clippy lints. (#1667) Laurent Mazare 2024-02-06 09:03:36 +01:00
  • 1ba11f22d6 Fix: pth files don't load on Windows (#1661) Roma Klapaukh 2024-02-06 18:50:55 +11:00
  • 982722019b add roll function to tensor (#1666) Jiayu Liu 2024-02-06 15:49:45 +08:00
  • a83ca2ece0 Bump the crate version to 0.4.0. (#1658) Laurent Mazare 2024-02-04 19:08:01 +01:00
  • 153c940a9c Update docs to reflect current usage of example (#1610) Tarek 2024-02-04 10:59:47 +00:00
  • 50be8a98ba Quantized support for stable-lm2. (#1654) Laurent Mazare 2024-02-04 11:57:05 +01:00
  • 58cc896e69 make llama derive clone (#1648) Daniel Clough 2024-02-04 02:56:03 -08:00
  • 5cdd84e0f6 onnx: add the Flatten operator. (#1638) wanglong001 2024-02-03 23:28:47 +08:00
  • a510ddec4e Mention the new models in the readme. (#1651) Laurent Mazare 2024-02-03 15:19:57 +01:00
  • d32abbce53 Add StableLM-2, StableLM Code and Zephyr variants (#1650) Jani Monoses 2024-02-03 15:58:41 +02:00
  • dfab45e1c8 Supports more audio formats (#1628) Hubert Shelley 2024-02-03 21:26:04 +08:00
  • 96bc704d17 Update mixformer.rs (#1601) Bayang 2024-02-03 13:42:16 +01:00
  • a52d407ae6 Add ConvNeXt model. (#1604) Jani Monoses 2024-02-03 14:34:28 +02:00
  • 9e824ec810 Explicit version for packages that are not in the workspace. (#1642) Laurent Mazare 2024-01-31 18:57:38 +01:00
  • beadb1b434 Explicit candle version so that cargo publish can be used easily. (#1641) Laurent Mazare 2024-01-31 18:42:22 +01:00
  • 8babfe0411 Fixed all bugs. Improved code quality. Added tests. (branch: ivarflakstad/metal-reduce-2) Ivar Flakstad 2024-01-30 14:12:57 +01:00
  • 6d83d42efb Merge pull request #1606 from FL33TW00D/feature/larger-batches Christopher Fleetwood 2024-01-29 15:31:10 +00:00
  • 933716b374 Where cond get_strided_index conditionally based on function constants (branch: ivarflakstad/metal-where-cond-2) Ivar Flakstad 2024-01-22 20:59:02 +01:00
  • 077e781f53 fmt Ivar Flakstad 2024-01-22 21:23:44 +01:00
  • ceaf7f1e2d More concise macros (branch: ivarflakstad/metal-fill) Ivar Flakstad 2024-01-22 21:17:20 +01:00
  • f8abfee854 Merge branch 'main' into ivarflakstad/metal-fill Ivar Flakstad 2024-01-22 20:54:11 +01:00
  • 086b6ef6b6 Merge branch 'main' into ivarflakstad/metal-reduce-2 Ivar Flakstad 2024-01-22 18:41:46 +01:00
  • 2056866c25 Improve softmax kernel. 33%-39% higher thrpt Ivar Flakstad 2024-01-22 18:25:52 +01:00
  • b6afb46601 chore: final FL33TW00D 2024-01-22 15:15:19 +00:00
  • 1f4c54493e Improve arg reduce and add contiguous impl Ivar Flakstad 2024-01-21 18:12:49 +01:00
  • d5902840e0 Improve reduce perf and add contiguous impl Ivar Flakstad 2024-01-21 17:32:21 +01:00
  • fd7c856564 Merge pull request #1533 from huggingface/ivarflakstad/metal-prng ivarflakstad 2024-01-22 07:30:20 +01:00
  • 73d79e6092 chore: actual fix FL33TW00D 2024-01-19 09:35:42 +00:00
  • b1879f17f6 chore: switch to buffer FL33TW00D 2024-01-19 08:57:49 +00:00
  • 4f79f5df8a fix: larger batches FL33TW00D 2024-01-18 14:30:14 +00:00
  • b9ce263e4d Metal version check for fill_i64 Ivar Flakstad 2024-01-18 12:07:49 +01:00
  • 5c6d5c3d0e Merge branch 'main' into ivarflakstad/metal-fill Ivar Flakstad 2024-01-18 11:16:25 +01:00
  • 1cf34368b7 Merge pull request #1602 from mimiquate/fix-metal-kernel-type ivarflakstad 2024-01-18 08:40:34 +01:00
  • 17e6e2d7ee Fixes metal kernel u8 type Gonzalo 2024-01-17 15:47:08 -03:00
  • 80b1c689f9 Revert public EncoderParam Ivar Flakstad 2024-01-17 18:09:28 +01:00
  • db923517b3 Merge branch 'main' into ivarflakstad/metal-prng Ivar Flakstad 2024-01-17 18:02:01 +01:00
  • 403680f17d Quantized GGUF style (#1523) Nicolas Patry 2024-01-17 10:27:58 +01:00
  • 86a8e58897 Update metal random kernel and set_seed method Ivar Flakstad 2024-01-16 19:11:31 +01:00
  • 5270224f40 Add MobileOne model. (#1595) Jani Monoses 2024-01-16 07:34:16 +02:00
  • 7e3349d7c3 Update parquet requirement from 45.0.0 to 50.0.0 (#1592) dependabot[bot] 2024-01-15 22:35:01 +01:00
  • 1257fc6719 Update safetensors requirement from 0.3.1 to 0.4.1 (#1591) dependabot[bot] 2024-01-15 22:34:40 +01:00
  • 67d93b4f42 More happy tests. (branch: metal5) Nicolas Patry 2024-01-15 18:46:18 +01:00
  • c35d7d50db Making the CI happy. Nicolas Patry 2024-01-15 18:30:42 +01:00