candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-15 10:26:33 +00:00

Author	SHA1	Message	Date
Laurent Mazare	27996a1a9e	Remove the old MFA gemm kernels. (#2742 ) * Remove the old MFA gemm kernels. * Use bf16 in helium on metal.	2025-01-26 20:36:31 +01:00
Laurent Mazare	1a32107fab	Add a few metal gather ops. (#2740 ) * Add a few metal gather ops. * Fix some compilation issues. * Adjust the tolerance.	2025-01-25 23:31:03 +01:00
唐璜	333d94a19a	fix: fix the codegeex4 model examples and transformers model (#2738 ) * Update main.rs * Update codegeex4_9b.rs * Get things to compile. * Add some default for when rope_ratio is missing. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2025-01-25 17:41:12 +01:00
mneilly	3164a19a5d	Add inpainting to the stable diffusion example (#2735 ) * Update the stable diffusion example with inpainting support for 1.5, 2 and XL. * Apply cargo fmt. * Clippy fixes. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2025-01-23 10:08:38 +01:00
Sergei Grebnov	e6cd499e98	Fix candle-flash-attn build on Windows (msvc) (#2734 )	2025-01-22 22:19:48 +01:00
Laurent Mazare	77db8396d0	Explicit error when slice-set is called with the same src and dst. (#2733 )	2025-01-22 21:31:49 +01:00
Laurent Mazare	85f0aaefe5	Add serde::serialize to activations. (#2732 )	2025-01-22 10:23:34 +01:00
Guoqing Bao	e4c3a71f11	Fix GLM4 alignment issue (#2723 ) * Fix GLM4 alignment issue * Cleanups. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2025-01-20 22:51:46 +01:00
Eric Buehler	17cbbe4286	Sync upstream MLX sdpa vector kernels with mask (#2718 ) * Sync upstream mlx sdpa vector kernels with mask * Dispatch to the 2pass kernel * Format	2025-01-16 11:30:10 +01:00
Laurent Mazare	6fd2f63a15	Bump the ug dependency. (#2720 ) * Bump the ug dependency. * Fix some test. * Fix the ug test.	2025-01-16 09:39:16 +01:00
Laurent Mazare	efd0e6822f	Fix the helium weights download. (#2717 )	2025-01-13 18:21:37 +01:00
Laurent Mazare	158817f230	Helium repo update. (#2716 )	2025-01-13 18:04:14 +01:00
Laurent Mazare	309cd0f7c7	Add the helium model. (#2715 )	2025-01-13 17:39:49 +01:00
Jani Monoses	ab7ff7081e	Fixes for running Phi-4 quantized. (#2714 )	2025-01-13 14:35:33 +01:00
Jani Monoses	461e8c1685	ModernBERT model (#2713 ) * layer_norm_no_bias * Modernbert model. * Format + cleanup error. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2025-01-13 08:39:27 +01:00
Laurent Mazare	2344c4e4b8	Clippy fixes for 1.84. (#2710 )	2025-01-10 10:15:15 +01:00
Laurent Mazare	32defdb7d5	Update cudarc. (#2708 )	2025-01-08 15:10:23 +01:00
Laurent Mazare	236c35e578	Bump the crate version to 0.8.2. (#2703 ) 0.8.2	2025-01-07 15:50:16 +01:00
Andrei Fajardo	6f8351dfda	add link to README (#2701 )	2025-01-04 23:07:30 +01:00
Luka Zakrajšek	57f41da13b	Fix mistral attention on Metal (#2699 ) Co-authored-by: Luka Zakrajsek <luka.zakrajsek@soniox.com>	2025-01-04 16:11:20 +01:00
Nick Senger	cbaa0ad46f	UniPC for diffusion sampling (#2684 ) * feat: Add unipc multistep scheduler * chore: Clippy and formatting * chore: Update comments * chore: Avoid unsafety in float ordering * refactor: Update Scheduler::step mutability requirements * fix: Corrector img2img * chore: Update unipc ref link to latest diffusers release * chore: Deduplicate float ordering * fix: Panic when running with dev profile	2025-01-01 21:34:17 +01:00
Laurent Mazare	b12c7c2888	Update the hf-hub dependency to 0.4.0. (#2691 ) * Update the hf-hub dependency to 0.4.0. * Fix the book. * Use 0.4.1.	2024-12-31 19:07:47 +01:00
Laurent Mazare	94ffc2ec6f	Actually remove the default hf-hub cache path for glm. (#2696 )	2024-12-31 11:00:44 +01:00
Laurent Mazare	7354afc673	Use the default hf-hub cache for glm. (#2695 )	2024-12-31 10:55:45 +01:00
Michael Feil	2a705e6f37	Flash-Attn upgrade / SoftCap Candle-FlashAttn [3/n] (#2690 ) * update flash-attn v1 * restore: hdim224 * add 224 flash_fwd_template * remove whitespace * softcap is working, including test and api. * make softcap test case better * unpadded lse added	2024-12-31 10:04:47 +01:00
Michael Feil	a594ef669c	Flash-Attn upgrade / SoftCap Candle-FlashAttn [2/n] (#2689 ) * update flash-attn v1 * restore: hdim224 * add 224 flash_fwd_template * remove whitespace * softcap is working, including test and api. * make softcap test case better --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-12-31 09:41:23 +01:00
Michael Feil	71cd6d5533	Flash-Attn upgrade / SoftCap Candle-FlashAttn [1/n] (#2688 ) * update flash-attn v1 * restore: hdim224 * add 224 flash_fwd_template * remove whitespace	2024-12-31 09:32:22 +01:00
Laurent Mazare	d60eba1408	Streamline the glm4 example. (#2694 )	2024-12-31 09:21:41 +01:00
Laurent Mazare	e38e2a85dd	Fix a cuda warning. (#2693 )	2024-12-31 09:06:10 +01:00
jetsung	460616fc84	Update README.org (#2670 ) The command line error in the CPU section of the documentation.	2024-12-30 11:32:02 +01:00
Akshay Ballal	91f1f019b1	Added XLMRobertaModel for Reranking (#2686 ) * add xlm-roberta-base * Add task enum for fill-mask and reranker in xlm-roberta example; update README and fix attention mask dimensions - Introduced a new `Task` enum to replace string task identifiers in the xlm-roberta example. - Updated the logic in `main.rs` to handle tasks using the new enum. - Enhanced README with example output for fill-mask task. - Fixed dimension retrieval in `prepare_4d_attention_mask` function for better clarity and safety. * Clippy fix. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-12-30 11:16:57 +01:00
mert-kurttutan	cd639131f0	Fix bug in whisper transformer (#2681 ) * Fix bug in whisper transformer - due to num_threads going to zero in single threaded case * Apply rustfmt. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-12-24 13:58:21 +01:00
hhllhhyyds	11aa30be10	Fix Batcher iterator break when return_last_incomplete_batch and items.is_empty (#2654 ) (#2655 )	2024-12-24 08:41:26 +01:00
Amélie Royer	1be6b090c7	Fix position encodings for Pixtral (#2678 ) * init commit: add position id in meshgrid * pass in subsampled positions * clippy fix * clippy fix	2024-12-23 13:22:35 +01:00
Laurent Mazare	62ced44ea9	Add a Context trait similar to anyhow::Context. (#2676 ) * Add a Context trait similar to anyhow::Context. * Switch two unwrap to context.	2024-12-22 09:18:13 +01:00
Edgar Riba	5c2f893e5a	make DepthAnythingV2 more reusable (#2675 ) * make DepthAnythingV2 more reusable * Fix clippy lints. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-12-21 12:06:03 +01:00
Laurent Mazare	67cab7d6b8	Bump the crate version to 0.8.1. (#2662 ) 0.8.1	2024-12-07 17:03:53 +01:00
Justin Sing	1807be84f4	Change/bert encoder public (#2658 ) * change: BertEncoder struct to public * change: make certain fields in Config struct public * change: all fields in bert config struct to be public * change: add clone to bert encoder and others * Clippy fix. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-12-04 21:22:30 +01:00
cdoko	145aa7193c	Add Nvembed v2 model (#2649 ) * Update mod.rs * Create mod.rs * Create decoder.rs * Create model.rs * Create main.rs * Create README.md * Update README.md * Update main.rs * Update and rename decoder.rs to embedding.rs * Update mod.rs * Update model.rs	2024-12-03 10:56:01 +01:00
zachcp	6f715f9256	add scatter add (#2656 )	2024-12-01 18:39:38 +01:00
zachcp	dba7a9c93e	add u32 - U32 gather (#2653 )	2024-11-30 23:18:07 +01:00
Laurent Mazare	b52c2c6050	Clippy fixes for the cuda feature. (#2650 )	2024-11-29 09:01:34 +01:00
iskng	4f59ed38b0	Adds support for stella_en_v5 embedding model -400M variant (#2608 ) * Adds support for stella_en_v5 embedding model -400M variant * Unified stella * WIP: Unified Stella * Combined stella for both 1.5B and 400M variants * Cargo fmt for the CI * removed redundant stella-400m model and example after merge into stella-en-v5 * cargo fmt --all --------- Co-authored-by: Anubhab Bandyopadhyay <4890833+AnubhabB@users.noreply.github.com> Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-11-29 09:01:08 +01:00
Anubhab Bandyopadhyay	54e7fc3c97	Lint fixes introduced with Rust 1.83 (#2646 ) * Fixes for lint errors introduced with Rust 1.83 * rustfmt * Fix more lints. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-11-28 23:00:21 +01:00
Adam Nelson	23ed8a9ded	Fix for whisper-microphone example failure if audio isn't chunk aligned (#2645 ) At least on my macOS Sequoia system (MBP 14" 2021, M1 Pro), when I run the `whisper-microphone` example after it has gathered 10 seconds of audio, it fails before the transcription: ``` Error: Insufficient buffer size 384 for input channel 0, expected 1024 ``` At least for the audio device I'm using (Airpods Pro Max), there is no guarantee that each audio buffer is a multiple of 1024 samples. Thus at the end of the 10 seconds, `buffered_pcm` can have some samples at the end that do not form a complete 1024 sample chunk. This fixes that by tracking when there is a partial chunk at the end of the buffer, and leaving it in `buffered_pcm` to be processed on the next loop iteration. Note that, in the interest of keeping this PR as small as possible, I didn't make any other changes to this example.	2024-11-27 22:35:11 +01:00
Ionut Mihalcea	21c686387c	Onnx Support for Sign operation #2641 (#2642 ) * Support for Sign operation #2641 * Apply rustfmt. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-11-26 23:10:09 +01:00
zachcp	b4deb5c5a9	Provide a method to allow PTH files with state maps to be loaded. (#2639 ) * Provide a method to allow PTH files with state maps to be loaded. * add a line to the doc * String-. &str	2024-11-26 22:52:53 +01:00
Andrei Fajardo	c12db594e3	fix typo (#2606 )	2024-11-23 08:40:00 +01:00
Laurent Mazare	f86f4d6224	Tweak the CI to avoid running out of disk space. (#2630 ) * Tweak the CI to avoid running out of disk space. * Linux only.	2024-11-19 04:32:36 +01:00
zachcp	3159f91b90	20241118 docs (#2629 ) * module docs * varbuilder gguf docs * add a link to gguf files * small additional mod doc titles * safetensor docs * more core docs * more module docs in candle_core * 2 more link fixes	2024-11-19 04:07:07 +01:00
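
Note on commit 23ed8a9ded (whisper-microphone, #2645): the message describes keeping a partial chunk at the end of `buffered_pcm` for the next loop iteration instead of passing it on. A minimal sketch of that idea follows. It is not the code from the PR: the helper name `take_full_chunks` and its signature are hypothetical, and the chunk size of 1024 is assumed from the error message quoted in the commit.

```rust
/// Hypothetical helper (not from the candle repository): removes and returns
/// the longest prefix of `buffered` whose length is a multiple of `chunk`.
/// Any partial tail stays in `buffered` so the next iteration can complete it.
fn take_full_chunks(buffered: &mut Vec<f32>, chunk: usize) -> Vec<f32> {
    // Integer division rounds down, so `full_len` is the largest multiple of
    // `chunk` that fits in the samples buffered so far.
    let full_len = (buffered.len() / chunk) * chunk;
    // `drain` removes the prefix in place and preserves sample order.
    buffered.drain(..full_len).collect()
}
```

Called once per loop iteration after new samples have been appended, this would hand only whole 1024-sample chunks to the consumer, while a leftover such as the 384 samples in the quoted error stays buffered until more audio arrives.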

2281 Commits