candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	261ed65f36	Add the SigLIP model. (#2515 ) * Add the SigLIP model. * Add more to the forward pass of the vision model. * Complete the forward pass. * Add the siglip example. * Fix. * Another fix. * Get everything in place. * Add a readme.	2024-09-28 23:48:00 +02:00
Laurent Mazare	62525e8352	Remove some extra whitelines. (#2513 )	2024-09-28 14:41:28 +02:00
Laurent Mazare	ad8a4c5e5a	Add some llama-3.2 examples. (#2508 ) * Add some llama-3.2 examples. * Support tie-word-embeddings for llama.	2024-09-26 21:00:18 +02:00
Laurent Mazare	10d47183c0	Quantized version of flux. (#2500 ) * Quantized version of flux. * More generic sampling. * Hook the quantized model. * Use the newly minted gguf file. * Fix for the quantized model. * Default to avoid the faster cuda kernels.	2024-09-26 10:23:43 +02:00
Laurent Mazare	d01207dbf3	Add a RotatingKVCache. (#2493 ) * Add a RotatingKVCache. * Add some KvCache tests. * Test the reset too. * More kv-cache testing. * More tests for the rotating kv-cache. * Improve the api for the rotating cache so that the whole src tensor gets returned when it's overlarge. * Handle contiguity + bugfix + use in mimi. * Add a way to test the mimi streaming mode. * Mimi streaming fixes. * More rotating kv-cache. * Fix the attn mask generation. * Handle the abs case. * Add some tests for the generated mask.	2024-09-23 13:14:32 +02:00
Juan Gomez	5fc4f17727	Adding Granite 7b Instruct model example (#2487 ) * Adding Granite 7b Instruct model example * Minor refactoring to make it a little more idiomatic * Clippy fixes. * Adding a README with some information about supported Granite models * Changing the default prompt to better accommodate the Language modality of the Granite 7b Instruct model --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-09-21 11:52:01 +02:00
Laurent Mazare	c58c5d5b01	Add the mimi audio-tokenizer. (#2488 ) * Add the mimi audio-tokenizer. * Formatting tweaks. * Add a full example. * Use the transformers names. * More renamings. * Get encoding and decoding to work. * Clippy fixes.	2024-09-20 14:31:20 -06:00
Laurent Mazare	e3261216b1	Clippy fixes for 1.81.0. (#2461 ) * Clippy fixes for 1.81.0. * Another fix.	2024-09-05 23:46:55 +02:00
Jani Monoses	86613c00e2	MobileCLIP models S1 and S2 (#2454 ) * Allow loading images with given std and mean * OpenCLIP text encoder component * Two MobileCLIP models * Clippy fixes. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-08-29 15:38:58 +02:00
Jani Monoses	29e25c458d	FastViT fixes. (#2452 ) * correct optional SE layer dimensions. * head_dim instead of num_heads is 32. * update test example output.	2024-08-28 11:20:09 +02:00
ilookee	fdc2622686	fix: qwen2 lm_head loading #2443 (#2445 ) Co-authored-by: Yi Xu <xuyi@me.com>	2024-08-23 16:50:02 +02:00
Jani Monoses	ccdbe87639	Add FastViT model. (#2444 )	2024-08-23 16:06:54 +02:00
Laurent Mazare	2ec8729d51	Fix for parler-tts, do not add the last slice of padding tokens. (#2442 ) * Fix for parler-tts, do not add the last slice of padding tokens. * Support for the mini model.	2024-08-22 23:22:03 +02:00
Laurent Mazare	236b29ff15	Add the DAC model. (#2433 ) * Add the DAC model. * More quantization support. * Handle DAC decoding. * Plug the DAC decoding in parler-tts.	2024-08-19 08:59:51 +02:00
Laurent Mazare	58197e1896	parler-tts support (#2431 ) * Start sketching parler-tts support. * Implement the attention. * Add the example code. * Fix the example. * Add the description + t5 encode it. * More of the parler forward pass. * Fix the positional embeddings. * Support random sampling in generation. * Handle EOS. * Add the python decoder. * Proper causality mask.	2024-08-18 20:42:08 +02:00
Laurent Mazare	c1b9e07e35	Add support for gemma-2. (#2425 ) * Add gemma-2. * Support a couple more models. * Sliding window support. * Example + readme updates. * Update the main readme.	2024-08-17 20:31:23 +02:00
Laurent Mazare	68aa9c7320	Fix the device for the bert attention mask. (#2414 )	2024-08-14 10:01:12 +02:00
Jani Monoses	35e5f31397	Add Based LLM from Hazy Research. (#2411 )	2024-08-12 21:21:19 +02:00
Matthew O'Malley-Nichols	14db029494	Soft Non-Maximum Suppression (#2400 ) * Soft NMS with thresholds * NMS Test * Soft nms w/ boxes removed below threshold * Soft nms test * No longer removing bounding boxes to fit Soft-NMS focus * Initialize confidence * Added comments * Refactored out updating based on IOU/sigma * Score_threshold -> confidence_threshold for clarity * Remove bboxes below confidence threshold * Softnms basic functionality test * Softnms confidence decay test * Softnms confidence threshold test * Softnms no overlapping bbox test * Testing confidence after no overlap test * Single bbox and no bbox tests * Signify test completion * Handling result of test functions * Checking all pairs of bboxes instead of a forward pass * Equal confidence overlap test * Clarified tests for implementation * No longer dropping boxes, just setting to 0.0 * Formatted w/ cargo	2024-08-10 07:57:52 +02:00
Czxck001	dfdce2b602	Add the MMDiT model of Stable Diffusion 3 (#2397 ) * add mmdit of stable diffusion 3 lint add comments * correct a misplaced comment * fix cargo fmt * fix clippy error * use bail! instead of assert! * use get_on_dim in splitting qkv	2024-08-05 19:26:15 +02:00
唐璜	500c9f2882	add models support and example for THUDM/glm-4 (#2362 ) * add models support and example for THUDM/glm-4 * fix the ci report * fmt * fix * Update README.org * Update README.org * fmt * Update README.org * README.md add codegeex4 * README.md add glm4 * Typo. * change expect into ? --------- Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>	2024-08-05 17:48:09 +02:00
Laurent Mazare	2be9bd211e	Support for mistral-nemo. (#2396 )	2024-08-04 19:52:40 +02:00
Laurent Mazare	aa7ac1832d	Simplify handling of flux modulations. (#2394 )	2024-08-04 11:09:54 +02:00
Laurent Mazare	19db6b9723	Add the flux model for image generation. (#2390 ) * Add the flux autoencoder. * Add the encoder down-blocks. * Upsampling in the decoder. * Sketch the flow matching model. * More flux model. * Add some of the positional embeddings. * Add the rope embeddings. * Add the sampling functions. * Add the flux example. * Fix the T5 bits. * Proper T5 tokenizer. * Clip encoder path fix. * Get the clip embeddings. * No configurable weights in layer norm. * More weights related fixes. * Yet another shape fix. * DType fix. * Fix a couple more shape issues. * DType fixes. * Fix the latent dims. * Fix more shape issues. * Autoencoder fixes. * Get some generations out. * Bugfix. * T5 padding. * Clippy fix. * Add the decode only mode. * Fix. * More fixes. * Finally get some generations to work. * Add readme.	2024-08-04 08:14:33 +02:00
Laurent Mazare	9ca277a9d7	Fix cargo fmt. (#2383 ) * Fix cargo fmt. * Clippy fix. * Cosmetic tweaks.	2024-08-01 14:19:41 +02:00
Joan Fontanals	2e9c010609	Jina Bert Example fix and more configuration (#2191 ) * fix: fix jina bert example logic * feat: enable jina embeddings de * feat: allow more flexibility on Jina Bert	2024-08-01 13:59:20 +02:00
Jani Monoses	ac51f477eb	Add Hiera vision model. (#2382 )	2024-08-01 11:59:22 +02:00
Zheng Li	4a52aeb437	bert attention mask (#1934 ) * bert attention mask * Allow for using None as a mask. * Revert part of the changes so that the proper default mask applies. * Cosmetic change. * Another cosmetic tweak. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-08-01 08:26:19 +02:00
Eric Buehler	0f5cbb08b3	Add support for Llama 3.1 (#2359 ) * Add Llama 3.1 rope * Clippy * Format * Clippy * Add support for multiple eos tokens: * Untagged either * Remove either dep and fix settings.json * Make the max positional embeddings configurable	2024-07-26 21:32:26 +02:00
donjuanplatinum	2489a606fe	feat(candle-transformers/models/codegeex4-9b): add codegeex4-9 (#2334 ) * feat(candle-transformers/models/codegeex4-9b): add codegeex4-9b transformers * change mod.rs * feat(candle-examples/codegeex4-9b) * Update codegeex4_9b.rs * Update main.rs * Update codegeex4_9b.rs * Update main.rs * fmt * fix * fmt * Clippy fix. * Remove some print statements. * Avoid using unwrap. * 1. add README 2. change the print fmt * Another clippy fix. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-07-21 13:00:41 +02:00
Zhuo Jinggang	c63048d374	add quantized qwen2 (#2329 ) * add quantized version of qwen2 and corresponding example for qwen2-instruct * fix quantized qwen2 clippy error	2024-07-12 10:00:03 +02:00
Jani Monoses	a226a9736b	Add Mobilenet v4 (#2325 ) * Support different resolutions in load_image() * Added MobilenetV4 model. * Add MobileNetv4 to README	2024-07-09 13:52:20 +02:00
v-espitalier	9cd54aa5d4	Add EVA-02 model ( https://arxiv.org/abs/2303.11331 ) (#2311 ) * Add EVA-02 model ( https://arxiv.org/abs/2303.11331 ) * Clippy fix. * And apply fmt. --------- Co-authored-by: v-espitalier <> Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-07-07 20:09:31 +02:00
v-espitalier	ecff05d72b	Beit: Add the gen_relative_position_index() function (#2306 ) Co-authored-by: v-espitalier <>	2024-07-04 09:45:26 +02:00
v-espitalier	7f1ba8038c	Add Beit model ( https://arxiv.org/abs/2106.08254 ) (#2305 ) Co-authored-by: v-espitalier <>	2024-07-01 22:11:48 +02:00
v-espitalier	e27aac0a06	Add DINOv2Reg4 + PlantCLEF2024 (#2293 ) * Add: DINOv2Reg4 with PlantCLEF2024 weights and example ( See https://arxiv.org/abs/2309.16588 and https://zenodo.org/records/10848263 ) * Remove extra files + update README to download them + remove extra lines * minor fix (README remove extra spaces) * minor fix (README: Fix image url) * Modif: Add back interpolate_pos_encoding() + fix when no interpolation + remove extra comments + Update README ( source image changed and so the predictions ) * Fix: Improve code readability with '$ cargo clippy' and '$ cargo fmt' * Another clippy fix. --------- Co-authored-by: x-VEspit <vincent.espitalier@cirad.fr> Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-06-29 11:49:15 +02:00
Jeroen Vlek	242e006bbb	Depth Anything v2 (#2279 ) * define structs * construct ResidualConvUnit * forward() for ResidualConvUnit * implement FeatureFusionBlock * implement Scratch * implement DPTHead * add identity module * implement forward for DPTHead * add get_intermediate_layers to DinoVisionTransformer * implement DepthAnythingV2 * some minor tweaks * fix compile errors * fix var builder prefixes * setup initial example * use fixed patch size of 37 (518 / 14) * debugged until output * print min and max values * add some dynamism to the output location * scale input image * extract prep function * extract output path function * normalize image with magic mean and std * add spectral coloring * squeeze in the right place * make interpolation optional * use bail instead of panic * omit unnecessary Shape call * remove empty curly braces * use bail instead of assert * use vb and pp * remove closures * extract config object * Apply rustfmt. * Fix some clippy lints. * More lints. * Use the array methods. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-06-24 19:12:52 +02:00
Laurent Mazare	36cf54525d	Fix the fast bf16 gemm cublas kernels. (#2274 ) * Use flash-attn in gemma. * Fix for the fast bf16 cublas gemm. * Fix some clippy lints. * Fix another lint. * Proper clippy fix.	2024-06-18 23:46:58 +02:00
Laurent Mazare	54ff971e35	Support for the new Qwen2 models. (#2257 ) * Support for the new Qwen2 models. * Add more models.	2024-06-07 10:51:50 +01:00
chenwanqq	cd4d941ed1	Add LLaVA support (#2234 ) * first commit * llava * clippy and fmt * some fixes * minor fixes * remove useless file * refactor: Remove llava/constants.rs and update llava/mod.rs * modify variable name * modify code after clippy * Minor tweaks. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-06-03 11:54:09 +02:00
Dave Lage	ea260aeffd	Add Debug, Clone, Deserialize to moondream config (#2222 )	2024-05-28 06:08:00 +02:00
Laurent Mazare	3ceca9901a	Enable the new layer-norm. (#2213 ) * Enable the new layer-norm. * Shape fixes.	2024-05-24 16:48:21 +02:00
Laurent Mazare	d54e02d73d	Avoid a contiguous call in the quantized phi 3 model. (#2209 ) * Simplify the KvCache api. * Avoid a contiguous call in the quantized phi3 model.	2024-05-23 21:24:55 +02:00
Laurent Mazare	45e235a747	Simplify the KvCache api. (#2207 )	2024-05-23 17:07:21 +02:00
Laurent Mazare	7ebc3548e1	Use flash-attn in gemma. (#2195 ) * Use flash-attn in gemma. * Fix flash-attn for head dim 256.	2024-05-18 19:18:59 +02:00
Laurent Mazare	eefc1c77ef	Support flash-attn in quantized phi3. (#2194 )	2024-05-18 17:12:56 +02:00
Laurent Mazare	01545f7303	Add a slice_set op. (#2193 ) * Add a slice_set op. * Add some testing. * Add the dedicated kv-cache module. * Derive debug and clone. * Expose more kv-cache functions. * Return the current data when appending. * Use the new cache in the quantized phi3 model.	2024-05-18 15:58:18 +02:00
Yin Guobing	349c3e806a	Support embedding model gte-Qwen1.5-7B-instruct (#2190 ) * Support embedding model gte-Qwen1.5-7B-instruct This is a text embedding model based on Qwen2. They share same model architecture except the last MLP module. This commit brings in minimal modification of the old Qwen2 implementation to support both models. An example is provided, and had been verified according to the official PyTorch implementation. * Avoid doing the 'last-token filtering' based on the absence of attention mask. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-05-16 21:34:10 +02:00
Laurent Mazare	b13a82a438	Separate quantized phi-3 implementation. (#2157 ) * Separate quantized phi-3 implementation. * Integrate the quantized phi3 model. * Small fixes, get the generation to work properly. * Keep the old llama implementation around. * Change the default.	2024-05-04 10:14:57 +02:00
Laurent Mazare	89f53b9d7b	Bump the version number to 0.5.1. (#2155 ) * Bump the version number to 0.5.1. * Fix clippy lints for 1.78. * More clippy fixes.	2024-05-03 11:17:05 +02:00

386 Commits