candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	86b7c01b30	Update gemm to the latest version. (#1587 )	2024-01-15 09:44:51 +01:00
Laurent Mazare	bdd8107fda	Expose the ndarray trait. (#1586 )	2024-01-14 20:09:49 +01:00
Ivar Flakstad	ecf88a6d38	Merge branch 'main' into ivarflakstad/metal-prng	2024-01-14 17:10:54 +01:00
Laurent Mazare	e6d86b0819	Add the pow operator. (#1583 ) * Add the pow operator. * Support the pow operation in onnx.	2024-01-13 20:24:06 +01:00
Laurent Mazare	88618255cb	Fix the rotary embeddings for the new phi implementation. (#1582 ) * Fix the rotary embeddings for the new phi implementation. * Match the activation. * KV cache fix. * Use the config activation function.	2024-01-13 19:44:41 +01:00
Laurent Mazare	539ead927a	Update the Phi model to use the updated architecture. (#1580 ) * Update the Phi model to use the updated architecture. * Add more of the phi model. * Repeat KV + caching. * Apply the rotary embeddings. * Add support for the new phi model in the phi example. * Fix a couple glitches. * Fix a couple more glitches.	2024-01-13 17:38:27 +01:00
SebastianRueClausen	a46864bd56	Fix "Minimal Mamba" link in README. (#1577 )	2024-01-12 17:47:07 +01:00
Nicolas Patry	bafe95b660	Fix format. (#1576 )	2024-01-12 14:23:17 +01:00
ivarflakstad	a3d92ab226	Metal: Activate bfloat affine and add benchmark (#1543 ) * Use cfg to seperate benchmark results based on features * Add bfloat affine and benchmarks * Fix flops calculation * Remove allow pragma * Avoid some unnecessary returns. * Improve benchmarks layout --------- Co-authored-by: Laurent <laurent.mazare@gmail.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2024-01-12 11:19:49 +01:00
ivarflakstad	e90bcdcc7c	Metal: f16 and bf16 where_cond + benchmark (#1545 ) * Use cfg to seperate benchmark results based on features * Add metal where_cond for f16 and bf16. Add benchmark * Remove allow pragma * Avoid some unnecessary returns. * Improve benchmarks layout * Updated feature separated benchmarks --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-01-12 11:18:11 +01:00
Laurent Mazare	8e06bfb4fd	Mention VGG in the readme. (#1573 )	2024-01-12 09:59:29 +01:00
Laurent Mazare	6242276c09	Pin the revision used for phi-v2 + make it the default. (#1572 ) * Pin the revision used for phi-v2 + make it the default. * Tweak the custom-ops build.	2024-01-12 09:19:30 +01:00
Ivar Flakstad	e06e8d0dbe	fmt	2024-01-12 07:26:42 +01:00
Ivar Flakstad	e63bb8661b	Merge branch 'main' into ivarflakstad/metal-prng	2024-01-12 07:19:58 +01:00
Laurent Mazare	41915184bb	Bugfix for dequantizing q5k layers. (#1569 )	2024-01-11 23:15:11 +01:00
ivarflakstad	c1876b8041	Merge pull request #1567 from bayedieng/close-ifdef	2024-01-11 22:14:38 +01:00
Baye Dieng	85e5680277	remove metal version check	2024-01-11 21:02:03 +00:00
Baye Dieng	1327419776	close ifdef	2024-01-11 17:14:12 +00:00
Kyle McCarthy	402349d120	feat(bf16): add cast support + tests for cast + bin ops (#1524 )	2024-01-11 15:49:13 +01:00
ivarflakstad	9f0c99f0c1	Seperate benchmarks by enabled features (#1538 ) * Use cfg to seperate benchmark results based on features * Remove allow pragma * Avoid some unnecessary returns. * Improve benchmarks layout * Derive bench_name from actual device * Run CPU benchmarks even when GPU feature is enabled --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-01-11 15:35:38 +01:00
Laurent Mazare	0fc95c9f0c	Add a dequantize command to tensor-tools. (#1565 ) * Add a dequantize command to tensor-tools. * Clippy fixes.	2024-01-11 11:21:01 +01:00
Jani Monoses	2480c5dbdd	Add RepVGG model. (#1561 ) * Add RepVGG model. * Add RepVGG README * Extract var to top level * Replace hashmap with a match * Add a variant for the model kind + avoid some unnecessary config cloning. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-01-11 07:07:40 +01:00
Jani Monoses	63944714f2	Use candle_nn::embedding instead of local copies in a few models. (#1562 )	2024-01-10 21:36:27 +01:00
ivarflakstad	d3bdd788cf	Use __HAVE_BFLOAT__ to check for bfloat support instead of metal version check (#1540 )	2024-01-10 18:50:30 +01:00
Juarez Bochi	ae06cb74bb	Add relu kernel for metal (#1488 ) * Add relu kernel for metal * Copy error messages proposed in #1491 * Revert non relu changes * Fix name changes * Fix the last of us (: * Fix copy and paste mistakes * Fix typo * Revert order changes * Revert order change * Add deleted functions back * Run rustfmt	2024-01-10 18:27:17 +01:00
dependabot[bot]	a897fda74e	Update memmap2 requirement from 0.7.1 to 0.9.3 (#1556 ) Updates the requirements on [memmap2](https://github.com/RazrFalcon/memmap2-rs) to permit the latest version. - [Changelog](https://github.com/RazrFalcon/memmap2-rs/blob/master/CHANGELOG.md) - [Commits](https://github.com/RazrFalcon/memmap2-rs/compare/v0.7.1...v0.7.1) --- updated-dependencies: - dependency-name: memmap2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:27:59 +01:00
dependabot[bot]	1f1179913a	Update gloo requirement from 0.8 to 0.11 (#1558 ) Updates the requirements on [gloo](https://github.com/rustwasm/gloo) to permit the latest version. - [Release notes](https://github.com/rustwasm/gloo/releases) - [Changelog](https://github.com/rustwasm/gloo/blob/master/CHANGELOG.md) - [Commits](https://github.com/rustwasm/gloo/commits) --- updated-dependencies: - dependency-name: gloo dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:27:20 +01:00
dependabot[bot]	6e98cf2a92	Update cudarc requirement from 0.9.14 to 0.10.0 (#1559 ) Updates the requirements on [cudarc](https://github.com/coreylowman/cudarc) to permit the latest version. - [Release notes](https://github.com/coreylowman/cudarc/releases) - [Commits](https://github.com/coreylowman/cudarc/compare/v0.9.14...v0.9.15) --- updated-dependencies: - dependency-name: cudarc dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:27:05 +01:00
dependabot[bot]	2cc1247999	Update tokenizers requirement from 0.13.4 to 0.15.0 (#1555 ) Updates the requirements on [tokenizers](https://github.com/huggingface/tokenizers) to permit the latest version. - [Release notes](https://github.com/huggingface/tokenizers/releases) - [Changelog](https://github.com/huggingface/tokenizers/blob/main/RELEASE.md) - [Commits](https://github.com/huggingface/tokenizers/commits) --- updated-dependencies: - dependency-name: tokenizers dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-01-10 16:26:53 +01:00
darker	edf3fcd1c4	fix: deprecated option field (open-pull-requests-limit-per-dependency) (#1554 )	2024-01-10 15:12:46 +01:00
darker	53e4755015	feat: add dependabot to the project (#1553 ) * feat: add dependabot to the project * feat: add let's accept patches/fix from other libs * Revert "feat: add let's accept patches/fix from other libs" This reverts commit `d31a956f81`.	2024-01-10 14:57:20 +01:00
Ivar Flakstad	87efb5d8eb	Updated feature separated benchmarks	2024-01-09 19:04:31 +01:00
Ivar Flakstad	ad181f9cdc	Merge branch 'ivarflakstad/seperate-benchmarks-by-feature' into ivarflakstad/metal-prng	2024-01-09 18:55:40 +01:00
Ivar Flakstad	88945f2c22	Improve benchmarks layout	2024-01-09 18:31:28 +01:00
Laurent Mazare	12b2a337f3	Handle start-offset when loading a tensor from a pickle file. (#1546 )	2024-01-08 09:20:48 +01:00
Laurent	fb05af4c42	Avoid some unnecessary returns.	2024-01-08 07:19:59 +01:00
Ivar Flakstad	ad075a5f7e	Remove allow pragma	2024-01-08 06:48:33 +01:00
Laurent Mazare	0eb90ed783	Simpler repro for the neon optimization issue + bugfix (#1544 ) * Simpler repro for the neon optimization issue. * Bugfix for q4k. * Improve the fix, share the dot-prod bit. * Clippy fixes. * Fix for q6k. * Also fix for q2k. * Use the new shared dotprod. * Add more testing.	2024-01-07 20:21:49 +01:00
Laurent Mazare	89b5a06858	Use bindgen-cuda for the custom-kernel example. (#1536 ) * Use bindgen-cuda for the custom-kernel example. * Only depend on the kernels when cuda is enabled. * Skip rustfmt.	2024-01-07 17:18:46 +01:00
Ivar Flakstad	3f04a79ada	Use cfg to seperate benchmark results based on features	2024-01-07 14:40:15 +01:00
Nicolas Patry	30313c3081	Moving to a proper build crate `bindgen_cuda`. (#1531 ) * Moving to a proper build crate `bindgen_cuda`. * Fmt.	2024-01-07 12:29:24 +01:00
Laurent Mazare	e72d52b1a2	Unpin more of the workplace relative dependencies. (#1535 )	2024-01-07 12:26:20 +01:00
Nicolas Patry	b4cb982e49	Simplifying our internal cargo dependencies. (#1529 )	2024-01-07 12:04:14 +01:00
Ivar Flakstad	6ebe043273	Merge branch 'main' into ivarflakstad/metal-prng	2024-01-07 11:52:03 +01:00
Ivar Flakstad	6bf52b9fdf	Gaussian normal distribution of PRNG via Box-Muller transform	2024-01-07 11:39:46 +01:00
optman	84250bf52f	fix index_pos bug when kv cache is disabled. (#1517 ) * fix index_pos bug when kv cache is disabled * Tweak the fix. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-01-06 11:43:01 +01:00
OlivierDehaene	8d1a57c9a0	chore: update flash attention kernels (#1518 ) * chore: update flash attention kernels * fmt * remove unused kernels * force f32 * correct stride	2024-01-05 18:28:55 +01:00
Ivar Flakstad	955e63c803	Implement hybrid Tausworthe + LCG psuedo random number generator in metal	2024-01-05 13:27:59 +01:00
Jeroen Vlek	3a7304cb0d	add link to gpt-from-scratch-rs (#1525 )	2024-01-05 11:59:46 +01:00
Nicolas Patry	fa3ea98ba9	Adding bfloat16 support for the cast kernels. (#1520 )	2024-01-04 12:12:56 +01:00

1761 Commits