candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Jani Monoses	2480c5dbdd	Add RepVGG model. (#1561 ) * Add RepVGG model. * Add RepVGG README * Extract var to top level * Replace hashmap with a match * Add a variant for the model kind + avoid some unnecessary config cloning. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-01-11 07:07:40 +01:00
Jani Monoses	63944714f2	Use candle_nn::embedding instead of local copies in a few models. (#1562 )	2024-01-10 21:36:27 +01:00
ivarflakstad	d3bdd788cf	Use __HAVE_BFLOAT__ to check for bfloat support instead of metal version check (#1540 )	2024-01-10 18:50:30 +01:00
Juarez Bochi	ae06cb74bb	Add relu kernel for metal (#1488 ) * Add relu kernel for metal * Copy error messages proposed in #1491 * Revert non relu changes * Fix name changes * Fix the last of us (: * Fix copy and paste mistakes * Fix typo * Revert order changes * Revert order change * Add deleted functions back * Run rustfmt	2024-01-10 18:27:17 +01:00
dependabot[bot]	a897fda74e	Update memmap2 requirement from 0.7.1 to 0.9.3 (#1556 ) Updates the requirements on [memmap2](https://github.com/RazrFalcon/memmap2-rs) to permit the latest version. - [Changelog](https://github.com/RazrFalcon/memmap2-rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/RazrFalcon/memmap2-rs/compare/v0.7.1...v0.7.1) --- updated-dependencies: - dependency-name: memmap2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:27:59 +01:00
dependabot[bot]	1f1179913a	Update gloo requirement from 0.8 to 0.11 (#1558 ) Updates the requirements on [gloo](https://github.com/rustwasm/gloo) to permit the latest version. - [Release notes](https://github.com/rustwasm/gloo/releases) - [Changelog](https://github.com/rustwasm/gloo/blob/master/CHANGELOG.md) - [Commits](https://github.com/rustwasm/gloo/commits) --- updated-dependencies: - dependency-name: gloo dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:27:20 +01:00
dependabot[bot]	6e98cf2a92	Update cudarc requirement from 0.9.14 to 0.10.0 (#1559 ) Updates the requirements on [cudarc](https://github.com/coreylowman/cudarc) to permit the latest version. - [Release notes](https://github.com/coreylowman/cudarc/releases) - [Commits](https://github.com/coreylowman/cudarc/compare/v0.9.14...v0.9.15) --- updated-dependencies: - dependency-name: cudarc dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:27:05 +01:00
dependabot[bot]	2cc1247999	Update tokenizers requirement from 0.13.4 to 0.15.0 (#1555 ) Updates the requirements on [tokenizers](https://github.com/huggingface/tokenizers) to permit the latest version. - [Release notes](https://github.com/huggingface/tokenizers/releases) - [Changelog](https://github.com/huggingface/tokenizers/blob/main/RELEASE.md) - [Commits](https://github.com/huggingface/tokenizers/commits) --- updated-dependencies: - dependency-name: tokenizers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:26:53 +01:00
darker	edf3fcd1c4	fix: deprecated option field (open-pull-requests-limit-per-dependency) (#1554 )	2024-01-10 15:12:46 +01:00
darker	53e4755015	feat: add dependabot to the project (#1553 ) * feat: add dependabot to the project * feat: add let's accept patches/fix from other libs * Revert "feat: add let's accept patches/fix from other libs" This reverts commit `d31a956f81`.	2024-01-10 14:57:20 +01:00
Ivar Flakstad	87efb5d8eb	Updated feature separated benchmarks	2024-01-09 19:04:31 +01:00
Ivar Flakstad	ad181f9cdc	Merge branch 'ivarflakstad/seperate-benchmarks-by-feature' into ivarflakstad/metal-prng	2024-01-09 18:55:40 +01:00
Ivar Flakstad	88945f2c22	Improve benchmarks layout	2024-01-09 18:31:28 +01:00
Laurent Mazare	12b2a337f3	Handle start-offset when loading a tensor from a pickle file. (#1546 )	2024-01-08 09:20:48 +01:00
Laurent	fb05af4c42	Avoid some unnecessary returns.	2024-01-08 07:19:59 +01:00
Ivar Flakstad	ad075a5f7e	Remove allow pragma	2024-01-08 06:48:33 +01:00
Laurent Mazare	0eb90ed783	Simpler repro for the neon optimization issue + bugfix (#1544 ) * Simpler repro for the neon optimization issue. * Bugfix for q4k. * Improve the fix, share the dot-prod bit. * Clippy fixes. * Fix for q6k. * Also fix for q2k. * Use the new shared dotprod. * Add more testing.	2024-01-07 20:21:49 +01:00
Laurent Mazare	89b5a06858	Use bindgen-cuda for the custom-kernel example. (#1536 ) * Use bindgen-cuda for the custom-kernel example. * Only depend on the kernels when cuda is enabled. * Skip rustfmt.	2024-01-07 17:18:46 +01:00
Ivar Flakstad	3f04a79ada	Use cfg to separate benchmark results based on features	2024-01-07 14:40:15 +01:00
Nicolas Patry	30313c3081	Moving to a proper build crate `bindgen_cuda`. (#1531 ) * Moving to a proper build crate `bindgen_cuda`. * Fmt.	2024-01-07 12:29:24 +01:00
Laurent Mazare	e72d52b1a2	Unpin more of the workplace relative dependencies. (#1535 )	2024-01-07 12:26:20 +01:00
Nicolas Patry	b4cb982e49	Simplifying our internal cargo dependencies. (#1529 )	2024-01-07 12:04:14 +01:00
Ivar Flakstad	6ebe043273	Merge branch 'main' into ivarflakstad/metal-prng	2024-01-07 11:52:03 +01:00
Ivar Flakstad	6bf52b9fdf	Gaussian normal distribution of PRNG via Box-Muller transform	2024-01-07 11:39:46 +01:00
optman	84250bf52f	fix index_pos bug when kv cache is disabled. (#1517 ) * fix index_pos bug when kv cache is disabled * Tweak the fix. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-01-06 11:43:01 +01:00
OlivierDehaene	8d1a57c9a0	chore: update flash attention kernels (#1518 ) * chore: update flash attention kernels * fmt * remove unused kernels * force f32 * correct stride	2024-01-05 18:28:55 +01:00
Ivar Flakstad	955e63c803	Implement hybrid Tausworthe + LCG pseudo random number generator in metal	2024-01-05 13:27:59 +01:00
Jeroen Vlek	3a7304cb0d	add link to gpt-from-scratch-rs (#1525 )	2024-01-05 11:59:46 +01:00
Nicolas Patry	fa3ea98ba9	Adding bfloat16 support for the cast kernels. (#1520 )	2024-01-04 12:12:56 +01:00
Laurent Mazare	135ae5f3eb	Simplify the one-hot implementation, support arbitrary rank. (#1514 ) * Simplify the one-hot implementation, support arbitrary rank. * More cleanup.	2024-01-01 11:40:17 +01:00
Ryan Tate	41614b4a9b	Add one-hot/cold encoding (#1489 ) * add one-hot encoding * one_hot: improve error handling, use generic to_vecN::<D> Bails if the index value is equal to or greater than the depth value, which would result in an out-of-bounds error. A redundant check is added to ensure the index value does not exceed the length of the one-hot matrix size, which would also result in an out-of-bounds error. Bails if the index value is less than -1. If the index value is -1, then it ignores the setting of the on_value for the index value. Only values that are less than -1 are considered errors. * one-hot: use two generics, one_hot::<I, O>, for input and output data types Separating the input and output data types allows the input tensor indices to be a different data type than the output encoded tensor data type. For example, one_hot::<i64, u8>(...) will take an input tensor of i64 values and encode the output tensor using u8 values. The generic I::DTYPE must match the data type of the input indices, otherwise the method will bail. Additionally, this method adds an `allow_f64` option to enable the input indices data type to be f64 values. f64 values are disabled by default. TODO: indices data type and the generic I data type are currently not compile-time checked. * one_hot: remove input generic, use indices dtype matching This commit removes the to_f64() type cast and explicitly matches the DType from the input tensor. Currently, only U8, U32 and I64 are supported for input tensors. The match arms on the dtype are verbose. It would be nice to use a generic type with the WithDtype trait bound to pass to the to_vecN method and then return an inner value. Open to suggestions for better approaches here to reduce the match arm verbosity. * one_hot: use flat_map iterator over dims instead of nested for loop This commit replaces the nested for loops with a flat map iter over the dimensions of the input tensor. This commit also adds a test for a rank 3 input tensor. * one_hot: use mandatory on/off-values, remove const msgs This commit also updates doc tests, comments and test cases. * Small cleanups. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-01-01 11:18:40 +01:00
stano	03ce8caf40	Format properly the Stable Diffusion example run with params (#1511 ) Move out the --sd-version flag out of the prompt.	2024-01-01 11:13:35 +01:00
Laurent Mazare	b0fe5e4453	Do not implement Module for BatchNorm. (#1513 )	2024-01-01 10:13:13 +01:00
Laurent Mazare	1fb2dd905c	Add support for tiny-llama-1.1b. (#1512 )	2023-12-31 12:18:25 +01:00
Laurent Mazare	a0facd0e67	Small tweaks to batch-norm. (#1505 )	2023-12-30 17:06:07 +01:00
nkoppel	4290b81244	[Breaking] Add training to batchnorm with exponential moving average (#1504 ) * Add training to batchnorm with exponential moving average * Add more checks to batch norm * Resolve some review comments * Add with_momentum variants of `new` methods * Add check for range of momentum variable; update batch norm test * Run cargo fmt * Add back num_features parameter * Format; tiny simplification	2023-12-30 16:42:08 +01:00
s-casci	51e577a682	Add Policy Gradient to Reinforcement Learning examples (#1500 ) * added policy_gradient, modified main, ddpg and README * fixed typo in README * removed unnecessary imports * small refactor * Use clap for picking up the subcommand to run. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2023-12-30 09:01:29 +01:00
Gonzalo	0a245e6fa4	Metal: support unary abs (#1503 ) * Metal: support unary abs * cargo fmt	2023-12-30 00:00:12 +01:00
Gonzalo	87d7f81b43	Metal: more u8/u32 (#1502 ) * Adds more metal u8 * Metal: more u32	2023-12-29 23:56:21 +01:00
Gonzalo	4373534d59	Metal: i64 basic support (#1495 ) * Adds basic metal i64 support * metal copy i64	2023-12-29 19:42:50 +01:00
Nicolas Patry	f4a2787217	Merge pull request #1498 from huggingface/debugging_windows_ci Fix CI	2023-12-29 12:33:50 +01:00
Nicolas Patry	488e02a3f6	Merge pull request #1496 from bayedieng/unary Implement urecip op for metal backend	2023-12-29 12:20:52 +01:00
Nicolas Patry	adc95ca2bf	Ignore skipped.	2023-12-29 12:15:57 +01:00
Nicolas Patry	4907c63ea1	Ignore stop on remote forks.	2023-12-29 12:12:10 +01:00
Nicolas Patry	d76ac20e0e	Fix.	2023-12-29 12:06:38 +01:00
Nicolas Patry	f5c98f22c7	Merge pull request #1491 from mimiquate/metal-errors Improves metal's not implemented error messages	2023-12-29 12:03:40 +01:00
Nicolas Patry	5b12fbb143	Trying to fix flakiness by making hub_2 and hub_3 serial tests (potential issue on mingw with mmap).	2023-12-29 11:13:33 +01:00
Baye Dieng	cc06ba2294	fix bad pattern matching and function name	2023-12-29 09:46:24 +00:00
Nicolas Patry	a6bd0b47a5	Fix the CI.	2023-12-29 10:17:52 +01:00
Baye Dieng	b59b1b2bb6	remove generated png	2023-12-28 21:50:58 +00:00
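
The index rules described in the one-hot commit message (41614b4a9b) can be illustrated with a minimal, dependency-free sketch. This is not candle's implementation or API: the `one_hot` function below, its signature, and its `String` error type are hypothetical, and it works on a flat slice with `u8` output rather than on tensors of arbitrary rank. It only mirrors the behaviour the message states: an index of -1 leaves the row at the off value, an index below -1 is an error, and an index equal to or greater than the depth is an error.

```rust
/// Hypothetical helper (not candle's API): one-hot encode a flat slice of
/// indices into rows of length `depth`.
fn one_hot(
    indices: &[i64],
    depth: usize,
    on_value: u8,
    off_value: u8,
) -> Result<Vec<Vec<u8>>, String> {
    let mut out = Vec::with_capacity(indices.len());
    for &idx in indices {
        // Every row starts filled with the off value.
        let mut row = vec![off_value; depth];
        match idx {
            // -1 means "no class": the on value is not set for this row.
            -1 => {}
            // Anything below -1 is rejected.
            i if i < -1 => return Err(format!("invalid negative index {i}")),
            // An index equal to or greater than the depth would be out of bounds.
            i if (i as usize) >= depth => {
                return Err(format!("index {i} out of range for depth {depth}"))
            }
            // Valid index: set the on value at that position.
            i => row[i as usize] = on_value,
        }
        out.push(row);
    }
    Ok(out)
}

fn main() {
    let encoded = one_hot(&[0, 2, -1], 3, 1, 0).unwrap();
    println!("{encoded:?}"); // prints [[1, 0, 0], [0, 0, 1], [0, 0, 0]]
}
```

Passing the on and off values explicitly follows the "mandatory on/off-values" wording of the last bullet in that commit; the choice of `i64` input here is only for illustration.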


1740 Commits