409 Commits

17313a4226 Fix cuda memory error for Qwen3 non-quantized (#2987)
* Update KvCache initialization in Qwen3 model to use a fixed max position embedding value of 512

* add doc
2025-06-07 16:02:58 +02:00
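A hedged sketch of the idea behind this fix (not the exact patch), assuming candle_nn's `KvCache::new(dim, max_seq_len)` API; the constant name is illustrative:

```rust
use candle_nn::kv_cache::KvCache;

// Sizing the cache from cfg.max_position_embeddings pre-allocates far more
// GPU memory than most prompts need; a small fixed capacity avoids the OOM.
const DEFAULT_MAX_SEQ_LEN: usize = 512; // illustrative constant name

fn make_cache() -> KvCache {
    // dim = 2 is the sequence axis of (batch, heads, seq, head_dim) tensors.
    KvCache::new(2, DEFAULT_MAX_SEQ_LEN)
}
```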
0224a749f0 Add Qwen3 MoE (#2934)
* qwen-moe rebase

* lint

* fixed rebase error

* swapped the normal MoE model for the CausalMoE model in the example, and corrected the tie-word-embeddings if statement

* updated readme
2025-05-31 15:33:28 +02:00
61ddb9535e Use a tanh activation in the xlm-roberta classification head. (#2968) 2025-05-26 08:54:31 +02:00
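A minimal sketch of a roberta-style classification head with the tanh activation between the two projections; struct and weight names are assumptions, not the exact candle code:

```rust
use candle_core::{Result, Tensor};
use candle_nn::{linear, Linear, Module, VarBuilder};

struct ClassificationHead {
    dense: Linear,
    out_proj: Linear,
}

impl ClassificationHead {
    fn new(hidden: usize, n_classes: usize, vb: VarBuilder) -> Result<Self> {
        Ok(Self {
            dense: linear(hidden, hidden, vb.pp("dense"))?,
            out_proj: linear(hidden, n_classes, vb.pp("out_proj"))?,
        })
    }

    fn forward(&self, cls_hidden: &Tensor) -> Result<Tensor> {
        // tanh (rather than e.g. gelu) between the projections is the fix above.
        self.dense.forward(cls_hidden)?.tanh()?.apply(&self.out_proj)
    }
}
```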
9a62c91643 Proper support for phi-4 (#2960)
* Add phi-4 support.

* Long-rope support.

* Get clippy to be happy.
2025-05-21 10:18:33 +02:00
92106c8762 Fixes for clippy 1.87. (#2956) 2025-05-15 21:50:27 +02:00
450a49ed1a Olmo 2 model (#2954)
* OLMo 2 model

* Update the olmo-2 example

* Clippy fix.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-05-14 19:18:02 +02:00
6bd61727bc Make tensor contiguous before the repeat_kv calls to avoid strided copies (#2953) 2025-05-14 10:47:28 +02:00
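A hedged sketch of the pattern, assuming the `repeat_kv` helper from `candle_transformers::utils`: transposed K/V tensors are strided, so making them contiguous once before repeating avoids strided copies inside the repeat.

```rust
use candle_core::{Result, Tensor};
use candle_transformers::utils::repeat_kv;

fn expand_kv(k: Tensor, v: Tensor, n_rep: usize) -> Result<(Tensor, Tensor)> {
    // One contiguous copy up front instead of strided copies in repeat_kv.
    let k = repeat_kv(k.contiguous()?, n_rep)?;
    let v = repeat_kv(v.contiguous()?, n_rep)?;
    Ok((k, v))
}
```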
485ddf2996 Fixed Quantized Qwen3 Model (#2951)
* optimize KV cache to reduce GPU memory usage

* revert to using candle_nn::kv_cache::KvCache with initial capacity of 512
2025-05-13 05:53:42 +02:00
3d05f5cf3d Qwen3 quantized implementation (#2939)
* fixed quantized_phi3 implementation

* quantized_qwen3 implementation

* Update quantized_phi3.rs

* Update quantized_phi3.rs

* add quantized_qwen3 example

* Clippy fixes.

* Cleanup.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2025-05-08 15:06:10 +02:00
1fdfb58de5 Updating Add qwen3 (PR 2903) to use HF weights (#2930)
* add Qwen3.rs

* fixed compile error

* attempting to get PR 2903 working with qwen weights

* different qwen variants working

* added moe model

* clippy

* added additional eos token

* translated Korean comments to English as best I could

* removed specialized Qwen3RmsNorm and replaced with generic Candle RmsNorm

* replaced custom repeat_kv implementation with candle's repeat_kv implementation

* replace linear with linear_b in attention initialization

* replaced the custom kv_cache implementation with candle's kv_cache

* style

* replaced explicit broadcast add with normal add in decoder layer

* removed keeping the Rotary embedding layer in the model struct

* used the tie_word_embeddings bool from the config instead of relying on the existence of lm head weights in CausalLM

* removed duplicate code from qwen3_moe

* removed sliding window from qwen3 attention

* removed MoE code

* removed unused option

* Fixed Typo

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>

* fixed tie word embeddings to use the correct embedding weights instead of the opposite

---------

Co-authored-by: Max <naturale@hufs.ac.kr>
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2025-05-02 06:05:53 +02:00
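A hedged sketch of the tie_word_embeddings logic described above (the helper shape and names are illustrative): when the config sets the flag, the lm head reuses the token-embedding matrix instead of loading a separate weight.

```rust
use candle_core::Result;
use candle_nn::{linear_no_bias, Embedding, Linear, VarBuilder};

fn lm_head(
    tie_word_embeddings: bool,
    hidden_size: usize,
    vocab_size: usize,
    embed_tokens: &Embedding,
    vb: VarBuilder,
) -> Result<Linear> {
    if tie_word_embeddings {
        // Reuse the embedding matrix as the output projection.
        Ok(Linear::new(embed_tokens.embeddings().clone(), None))
    } else {
        linear_no_bias(hidden_size, vocab_size, vb.pp("lm_head"))
    }
}
```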
3aeb9575c7 Fixed Quantized Gemma3 Model and example (#2918)
* removed scale factor from computation and made quantized gemma3 work similarly to non-quantized gemma3

* created default consts, replaced is_sliding with Option holding a window_size
2025-04-25 05:47:48 +02:00
6ff0a6999c Fixed Gemma3 model and example (#2917)
* gemma3: changed RotaryEmbedding base freq based on layer and sliding window

* Changed attention mask per layer, either normal or sliding

* made attention mask creation slightly more efficient by only creating them once per model iteration

* changed is_sliding to an Option

* clippy

* changed generation to stop on either <eos> or <end_of_turn>, treating both as stop tokens
2025-04-25 05:35:08 +02:00
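A hedged sketch of the per-layer choice described above; the every-Nth-layer pattern and parameter names are assumptions modeled on the gemma3 config, not the exact code:

```rust
// Returns (rope base frequency, optional sliding window) for a layer.
// Some(window) selects the sliding attention mask, None the full causal one.
fn layer_setup(
    layer_idx: usize,
    pattern: usize,       // e.g. every `pattern`-th layer is global
    rope_local_base: f64, // base freq for sliding-window layers
    rope_global_base: f64,
    window: usize,
) -> (f64, Option<usize>) {
    let is_sliding = (layer_idx + 1) % pattern != 0;
    if is_sliding {
        (rope_local_base, Some(window))
    } else {
        (rope_global_base, None)
    }
}
```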
99bd69f383 fixed quantized-gemma example (#2914)
* fixed quantized-gemma example

* lint
2025-04-23 05:39:03 +02:00
b2904a830b implemented quantized-gemma3 (#2902)
* implemented quantized-gemma, inference not working

* Fixed a few modeling bugs: the model was outputting correct tokens for a few iterations, then garbage

* lint

* clippy

* quantized-gemma3 example working

* added readme

* clippy
2025-04-19 07:46:41 +02:00
2653002f29 Gumbel-Softmax sampling. (#2894)
* Gumbel-Softmax sampling.

* Add a sampling test.

* Share the gumbel-softmax bits.
2025-04-14 15:42:42 +02:00
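A hedged sketch of Gumbel-max sampling (equivalent to sampling from the softmax of the scaled logits), not necessarily the helper that landed: draw u ~ Uniform(0, 1), add g = -log(-log u) to logits / t, and take the argmax.

```rust
use candle_core::{Result, Tensor, D};

// Assumes 1-D f32 logits of shape (vocab_size,).
fn gumbel_sample(logits: &Tensor, temperature: f64) -> Result<u32> {
    let u = Tensor::rand(0f32, 1f32, logits.dims(), logits.device())?;
    let g = u.log()?.neg()?.log()?.neg()?; // Gumbel(0, 1) noise: -log(-log u)
    let scored = ((logits / temperature)? + g)?;
    scored.argmax(D::Minus1)?.to_scalar::<u32>()
}
```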
a52b76ae82 Expose the cudnn algo in the conv ops. (#2892)
* Set the algo.

* Expose the cudnn preferred algo for conv ops.
2025-04-14 08:25:32 +02:00
fb660b8d43 Add a cudnn feature to candle-nn/candle-transformers. (#2890) 2025-04-13 17:43:41 +02:00
eb478ece92 Implementing DistilBertForMaskedLM. (#2866)
* Initial commit: model weights working, predictions incorrect

* moved distilbertformaskedlm into distilbert modeling file

* made the maskedLM example like the bert example, predictions still incorrect

* finally not getting NaNs, fixed attention mask

* getting correct output sentences

* get top k predictions

* fixed output formatting slightly

* added default arg for model_id

* lint

* moved masked token example code from distilbertformaskedlm example to distilbert example

* lint

* removed distilbertformaskedlm example

* cleanup

* clippy

* removed embedding normalization from example

* made output and model dependent on args instead of prompt

* lint

* replaced ok_or anyhow error with anyhow context

* changed error message for mask token not found
2025-04-11 13:25:39 +02:00
d339b01726 Fix hardcoded f32 dtype for attention_mask. Use the model dtype for compatibility. (#2872) 2025-04-08 06:12:14 +02:00
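A hedged sketch of the fix (the mask-building helper is illustrative): build the additive causal mask in the model's dtype so f16/bf16 models don't hit a dtype mismatch when the mask is added to the attention scores.

```rust
use candle_core::{DType, Device, Result, Tensor};

fn causal_mask(seq_len: usize, dtype: DType, dev: &Device) -> Result<Tensor> {
    let mask: Vec<f32> = (0..seq_len)
        .flat_map(|i| (0..seq_len).map(move |j| if j > i { f32::NEG_INFINITY } else { 0. }))
        .collect();
    // to_dtype(dtype) is the fix: previously the mask stayed hardcoded F32.
    Tensor::from_slice(&mask, (seq_len, seq_len), dev)?.to_dtype(dtype)
}
```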
e3370c6316 Add the SNAC audio tokenizer. (#2869)
* Add the SNAC audio tokenizer.

* More snac.

* Again more snac.

* Add some example code for snac.

* Get the weights to load.

* Add to the snac model.

* Fixes.

* Get round-tripping to work.

* Save/load code files.

* Clippy fix.

* Fmt fix.
2025-04-06 22:15:36 +02:00
cf9d7bf24c Add the CSM model. (#2862)
* Add the CSM model.

* Add some code to load the model.

* Load the text tokenizer.

* Add frame generation.

* Get the sampling to work.

* Rope fix.

* Autoregressive generation.

* Generate some audio file.

* Use the actual prompt.

* Support multiple turns.

* Add a very barebone readme.

* Move some of the shared bits to the model.
2025-04-04 06:48:03 +02:00
9d31361c4f Fix for clippy 1.86. (#2864)
* Fix for clippy 1.86.

* More clippy fixes.

* More fixes.
2025-04-03 19:38:27 +02:00
d6db305829 Added new language pairs to marian-mt example. (#2860)
* added new language pairs to marian-mt

* lint

* separated the python code for converting tokenizers into its own file, added a requirements.txt for dependencies, updated the instructions in the readme, and included the python version

* Cleanup.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2025-04-02 23:50:14 +02:00
c930ab7e1a upgrade half library to fix rand (#2806)
fix lints
2025-03-14 09:01:54 +01:00
111edbc4ea Gemma 3 initial setup (text only). (#2802)
* Gemma 3 initial setup (text only).

* Use the rotating kv cache for the sliding window.
2025-03-14 07:56:02 +01:00
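A hedged sketch of the sliding-window cache choice, assuming candle_nn's `RotatingKvCache::new(dim, max_seq_len)`: sliding layers only attend to the last `sliding_window` positions, so a rotating buffer of that size bounds memory however long generation runs.

```rust
use candle_nn::kv_cache::RotatingKvCache;

fn sliding_cache(sliding_window: usize) -> RotatingKvCache {
    // dim = 2 is the sequence axis of (batch, heads, seq, head_dim) tensors.
    RotatingKvCache::new(2, sliding_window)
}
```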
e286cf7cc9 Parse the json config for siglip models. (#2800)
* Parse the json config for siglip models.

* Bump the tokenizers dependency.

* Add a v2 model.

* Support more v2 models.
2025-03-09 14:01:09 +01:00
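A minimal sketch of parsing a model's config.json with serde; the fields shown are illustrative, not the actual siglip schema:

```rust
use serde::Deserialize;

#[derive(Debug, Clone, Deserialize)]
struct VisionConfig {
    hidden_size: usize,
    num_hidden_layers: usize,
    image_size: usize,
    patch_size: usize,
}

fn load_config(path: &std::path::Path) -> anyhow::Result<VisionConfig> {
    let bytes = std::fs::read(path)?;
    Ok(serde_json::from_slice(&bytes)?)
}
```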
e4ffb85228 Add ModernBert sentence classifier (#2796) 2025-03-08 14:48:22 +01:00
37db86ff79 Allow ModernBert to be used to generate embeddings. (#2791) 2025-03-03 12:39:04 +01:00
e6cc76fc37 Implement DeepSeek V2 (#2744)
* Add deepseek v2

* Fix

* Remove unused

* Add kv cache

* Remove from cargo.toml

* Fix dtype selection logic

* Fix unnecessary u32->f32->gather->u32

* Remove fromstr impl

* Use local scopes for some clarity

* Typo

* Repeat k_pe

* Chain calls to remove mut

* Actually, remove all muts

* Update readme
2025-02-19 10:51:01 +01:00
2423d633fc add dynamic position encoding to Siglip (#2770)
* add dynamic position encoding

* remove debug messages
2025-02-14 13:50:50 +01:00
43017539ab Adds DebertaV2/V3 (#2743)
* Adds DebertaV2/V3

* Fixes all clippy warnings

* Typos.

* Addresses PR review findings. Some refactorings

* Avoid some unwrap/unwrap_or.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2025-01-29 08:59:28 +01:00
333d94a19a fix: fix the codegeex4 model examples and transformers model (#2738)
* Update main.rs

* Update codegeex4_9b.rs

* Get things to compile.

* Add some default for when rope_ratio is missing.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2025-01-25 17:41:12 +01:00
e4c3a71f11 Fix GLM4 alignment issue (#2723)
* Fix GLM4 alignment issue

* Cleanups.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2025-01-20 22:51:46 +01:00
309cd0f7c7 Add the helium model. (#2715) 2025-01-13 17:39:49 +01:00
ab7ff7081e Fixes for running Phi-4 quantized. (#2714) 2025-01-13 14:35:33 +01:00
461e8c1685 ModernBERT model (#2713)
* layer_norm_no_bias

* Modernbert model.

* Format + cleanup error.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-13 08:39:27 +01:00
57f41da13b Fix mistral attention on Metal (#2699)
Co-authored-by: Luka Zakrajsek <luka.zakrajsek@soniox.com>
2025-01-04 16:11:20 +01:00
cbaa0ad46f UniPC for diffusion sampling (#2684)
* feat: Add unipc multistep scheduler

* chore: Clippy and formatting

* chore: Update comments

* chore: Avoid unsafety in float ordering

* refactor: Update Scheduler::step mutability requirements

* fix: Corrector img2img

* chore: Update unipc ref link to latest diffusers release

* chore: Deduplicate float ordering

* fix: Panic when running with dev profile
2025-01-01 21:34:17 +01:00
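A hedged sketch of the float-ordering cleanup mentioned in the bullets: `f64::total_cmp` gives a total order over floats (NaN included), so sorting needs neither `partial_cmp(..).unwrap()` nor any unsafe workaround.

```rust
// Indices of xs sorted by descending value, with no unsafe and no panics.
fn argsort_descending(xs: &[f64]) -> Vec<usize> {
    let mut idxs: Vec<usize> = (0..xs.len()).collect();
    idxs.sort_by(|&a, &b| xs[b].total_cmp(&xs[a]));
    idxs
}
```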
91f1f019b1 Added XLMRobertaModel for Reranking (#2686)
* add xlm-roberta-base

* Add task enum for fill-mask and reranker in xlm-roberta example; update README and fix attention mask dimensions

- Introduced a new `Task` enum to replace string task identifiers in the xlm-roberta example.
- Updated the logic in `main.rs` to handle tasks using the new enum.
- Enhanced README with example output for fill-mask task.
- Fixed dimension retrieval in `prepare_4d_attention_mask` function for better clarity and safety.

* Clippy fix.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2024-12-30 11:16:57 +01:00
cd639131f0 Fix bug in whisper transformer (#2681)
* Fix bug in whisper transformer due to num_threads going to zero in the single-threaded case

* Apply rustfmt.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-12-24 13:58:21 +01:00
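A hedged sketch of the failure mode and fix (the halving heuristic is an assumption): deriving a worker count from the available parallelism truncates to zero on a single-core machine, so it is clamped to at least one.

```rust
fn num_threads() -> usize {
    let cores = std::thread::available_parallelism().map_or(1, |n| n.get());
    // Without .max(1), cores == 1 would yield zero worker threads.
    (cores / 2).max(1)
}
```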
1be6b090c7 Fix position encodings for Pixtral (#2678)
* init commit: add position id in meshgrid

* pass in subsampled positions

* clippy fix

* clippy fix
2024-12-23 13:22:35 +01:00
62ced44ea9 Add a Context trait similar to anyhow::Context. (#2676)
* Add a Context trait similar to anyhow::Context.

* Switch two unwrap to context.
2024-12-22 09:18:13 +01:00
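A standalone sketch of the shape of such a trait (the real one returns candle's error type; this version uses String to stay self-contained):

```rust
trait Context<T> {
    fn context(self, msg: &'static str) -> Result<T, String>;
}

impl<T> Context<T> for Option<T> {
    fn context(self, msg: &'static str) -> Result<T, String> {
        self.ok_or_else(|| msg.to_string())
    }
}

fn main() {
    let xs = [1, 2, 3];
    // `.context(..)` replaces `.unwrap()`/`.ok_or_else(|| ...)` call sites.
    let first = xs.first().context("empty slice").unwrap();
    assert_eq!(*first, 1);
}
```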
5c2f893e5a make DepthAnythingV2 more reusable (#2675)
* make DepthAnythingV2 more reusable

* Fix clippy lints.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2024-12-21 12:06:03 +01:00
1807be84f4 Change/bert encoder public (#2658)
* change: make the BertEncoder struct public

* change: make certain fields in Config struct public

* change: make all fields in the bert config struct public

* change: add clone to bert encoder and others

* Clippy fix.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-12-04 21:22:30 +01:00
145aa7193c Add Nvembed v2 model (#2649)
* Update mod.rs

* Create mod.rs

* Create decoder.rs

* Create model.rs

* Create main.rs

* Create README.md

* Update README.md

* Update main.rs

* Update and rename decoder.rs to embedding.rs

* Update mod.rs

* Update model.rs
2024-12-03 10:56:01 +01:00
4f59ed38b0 Adds support for stella_en_v5 embedding model -400M variant (#2608)
* Adds support for the stella_en_v5 embedding model, 400M variant

* Unified stella

* WIP: Unified Stella

* Combined stella for both 1.5B and 400M variants

* Cargo fmt for the CI

* removed redundant stella-400m model and example after merge into stella-en-v5

* cargo fmt --all

---------

Co-authored-by: Anubhab Bandyopadhyay <4890833+AnubhabB@users.noreply.github.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2024-11-29 09:01:08 +01:00
54e7fc3c97 Lint fixes introduced with Rust 1.83 (#2646)
* Fixes for lint errors introduced with Rust 1.83

* rustfmt

* Fix more lints.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-11-28 23:00:21 +01:00
3159f91b90 20241118 docs (#2629)
* module docs

* varbuilder gguf docs

* add a link to gguf files

* small additional mod doc titles

* safetensor docs

* more core docs

* more module docs in candle_core

* 2 more link fixes
2024-11-19 04:07:07 +01:00
e86565624b Fix for clippy. (#2626) 2024-11-18 14:32:38 +01:00
386fd8abb4 Module Docs (#2624)
* update whisper

* update llama2c

* update t5

* update phi and t5

* add a blip model

* quantized llama doc

* add two new docs

* add docs and emoji

* additional models

* openclip

* pixtral

* edits on the model docs

* update yi

* update a few more models

* add persimmon

* add model-level doc

* names

* update module doc

* links in hiera

* remove empty URL

* update more hyperlinks

* updated hyperlinks

* more links

* Update mod.rs

---------

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2024-11-18 14:19:23 +01:00