candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 02:38:10 +00:00

Author	SHA1	Message	Date
Anubhab Bandyopadhyay	a01aa89799	onnx: ReduceMin/Max Ops (#2563 ) * Stella_en_1.5B_v5 * Separated creation. This is a critical step for numerical accuracy and would be documented in the readme * EmbedDim would require clone and copy * WIP: example * Examples added * a little more in README * WIP: ONNX Reduce-max ops * WIP: tests for ReduceMin * Reduce min/max v18+ * Reformatting tests for better review readability * Error on empty set, backward compatibility (13 and below) with 'axes'	2024-10-15 10:34:07 +02:00
Laurent Mazare	3d1dc06cdb	Enable stable-diffusion 3 on metal. (#2560 )	2024-10-14 08:59:12 +02:00
Anubhab Bandyopadhyay	f553ab5eb4	Adds support for Stella_en_v5 embedding model - 1.5B variant (#2551 ) * Stella_en_1.5B_v5 * Separated creation. This is a critical step for numerical accuracy and would be documented in the readme * EmbedDim would require clone and copy * WIP: example * Examples added * a little more in README	2024-10-13 23:09:12 +02:00
Mikarific	41ade774e8	fix: Allow marian configs to deserialize from json. (#2556 )	2024-10-13 23:05:50 +02:00
Czxck001	6eab6b57f5	Fix the guide to gain access to Stable Diffusion 3 Medium (#2559 )	2024-10-13 22:55:26 +02:00
Czxck001	ca7cf5cb3b	Add Stable Diffusion 3 Example (#2558 ) * Add stable diffusion 3 example Add get_qkv_linear to handle different dimensionality in linears Add stable diffusion 3 example Add use_quant_conv and use_post_quant_conv for vae in stable diffusion adapt existing AutoEncoderKLConfig to the change add forward_until_encoder_layer to ClipTextTransformer rename sd3 config to sd3_medium in mmdit; minor clean-up Enable flash-attn for mmdit impl when the feature is enabled. Add sd3 example codebase add document crediting references pass the cargo fmt test pass the clippy test * fix typos * expose cfg_scale and time_shift as options * Replace the sample image with JPG version. Change image output format accordingly. * make meaningful error messages * remove the tail-end assignment in sd3_vae_vb_rename * remove the CUDA requirement * use default_value in clap args * add use_flash_attn to turn on/off flash-attn for MMDiT at runtime * resolve clippy errors and warnings * use default_value_t * Pin the web-sys dependency. * Clippy fix. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-10-13 22:08:40 +02:00
SethWen	0d96ec31e8	feat: integrate chinese clip and add example (#2555 ) * start to impl chinese clip * impl vision model * copy code from bert * refactor use * refactor use again * fix text model * refactor * try to fix text model * tuning * tuning chinese clip * delete useless code * revert code * Clippy fixes. * Also apply cargo fmt. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-10-10 15:18:55 +02:00
Akshay Ballal	937e8eda74	Add BertForMaskedLM to support SPLADE Models (#2550 ) * add bert for masked lm * working example * add example readme * Clippy fix. * And apply rustfmt. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-10-07 23:28:21 +02:00
Jorge António	edf7668291	improve (#2548 )	2024-10-07 17:30:56 +02:00
Laurent Mazare	e4a96f9e7c	Switch to using the MLX matmul by default. (#2547 )	2024-10-06 23:24:55 +02:00
Laurent Mazare	f856b5c3a7	pyo3 update. (#2545 ) * pyo3 update. * Stub fix.	2024-10-06 10:09:38 +02:00
Laurent Mazare	d2e432914e	Tensor tools print all (#2543 ) * Support whisper large-v3 turbo in the whisper-microphone example. * Print all tensors when no argument is provided.	2024-10-05 10:05:14 +02:00
dengelt	410c89f72a	Add required feature for whisper example in Readme (#2539 )	2024-10-04 14:29:55 +02:00
Laurent Mazare	56aacb05da	Make the RNN configs accessible from the models. (#2541 )	2024-10-04 14:22:23 +02:00
Laurent Mazare	6faecaa616	Fix for cudnn bf16 conv2d. (#2535 )	2024-10-02 23:18:55 +02:00
Laurent Mazare	90d04ff622	Support whisper large-v3 turbo in the whisper-microphone example. (#2533 )	2024-10-02 22:09:14 +02:00
Laurent Mazare	7b60bda4ed	Add support for cuda streams. (#2532 )	2024-10-02 21:30:58 +02:00
Laurent Mazare	936300678d	Add whisper large-v3 turbo to the example. (#2531 )	2024-10-02 21:07:08 +02:00
Laurent Mazare	f479840ce6	Add a seed to the flux example. (#2529 )	2024-10-02 10:52:02 +02:00
Laurent Mazare	fd08d3d0a4	Tweak some metal tests. (#2528 )	2024-10-02 10:22:31 +02:00
Anubhab Bandyopadhyay	a2bcc227df	Efficient implementation of `Tensor::ones()` for `metal` (#2512 ) * WIP: hopefully better const impl * with GPU * More tests on * Reverting primitive for * Incorporating review changes - added check elem count check in kernel, using for call strategy * rustfmt ran	2024-10-01 19:11:59 +02:00
Laurent Mazare	def4c6cdee	Cuda quantized mmv bugfix. (#2526 )	2024-10-01 12:57:55 +02:00
Akshay Ballal	888d886dd8	Add ColPali (#2524 ) * add colpali * cleanup * fix clippy	2024-10-01 11:48:39 +02:00
Laurent Mazare	6110ad8d4f	Refactor the whisper microphone example. (#2523 ) * Refactor the whisper microphone example. * Tweak the whisper microphone example more.	2024-10-01 00:24:17 +02:00
Justin Sing	aa35bf2ff5	Add/lstm direction (#2455 ) * add: direction for lstm layer * lint: remove unused Error import * refactor: remove unnecessary int assignment to Direction enum: * refactor: use &'static str type instead of String for direction_str: * Run cargofmt. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-09-30 22:44:07 +02:00
Laurent Mazare	724650446c	Yet another cuda qmm padding fix. (#2509 )	2024-09-30 21:53:30 +02:00
Laurent Mazare	dfe9a00683	Pixtral polishing. (#2522 ) * Pixtral polishing. * Clippy fix.	2024-09-30 21:23:54 +02:00
Laurent Mazare	683ab698de	Add Pixtral. (#2521 ) * Add Pixtral. * More pixtral vision encoder. * Sketch a pixtral example. * Sketch a pixtral example. * Better image loading. * Support loading images embedded in safetensor files. * Clippy fixes. * Add the llava multimodal adapter. * Add more of the llava bits. * Add the pixtral config. * More pixtral inference. * Add the text generation bits. * Get the example to work. * Bugfix. * Run some bits of the model in f32. * Blessed version :) * Better rope frequency computations. * README update.	2024-09-30 19:31:14 +02:00
Laurent Mazare	2f49e1b534	Add PaliGemma. (#2519 ) * Add PaliGemma. * PaliGemma inference loop. * Running PaliGemma example. * Tweak the prompt.	2024-09-29 19:56:56 +02:00
Laurent Mazare	0ebb38813b	Paligemma siglip vision config (#2518 ) * Add the paligemma siglip vision config. * More paligemma configs.	2024-09-29 17:53:52 +02:00
Laurent Mazare	3a3c48b14b	Bump the crate version to 0.7.2. (#2517 ) 0.7.2	2024-09-29 10:56:50 +02:00
Laurent Mazare	261ed65f36	Add the SigLIP model. (#2515 ) * Add the SigLIP model. * Add more to the forward pass of the vision model. * Complete the forward pass. * Add the siglip example. * Fix. * Another fix. * Get everything in place. * Add a readme.	2024-09-28 23:48:00 +02:00
Laurent Mazare	62525e8352	Remove some extra whitelines. (#2513 )	2024-09-28 14:41:28 +02:00
Laurent Mazare	2c25754281	Clippy fixes for onnx + fix a broken test. (#2510 )	2024-09-26 23:37:59 +02:00
Steven Lovegrove	ed48f54b54	Expand split ops (#2505 ) * candle-onnx: Add Split and Expand operators, Fix Where Op Implemented based on https://github.com/onnx/onnx/blob/main/docs/Operators.md Test cases based on those examples. TODO: Should add the remaining Split examples as tests TODO: Add test case that motivates Where fix * candle-onnx: Add ReduceSum operator Implemented based on https://github.com/onnx/onnx/blob/main/docs/Operators.md Test cases based on those examples. TODO: Should add the remaining ReduceSum examples as tests * candle-onnx: Add ReduceL2 operator Implemented based on https://github.com/onnx/onnx/blob/main/docs/Operators.md Test cases based on those examples. TODO: Should add the remaining ReduceSum examples as tests * candle-onnx: Fix Clip operator empty string as default arg issue Optional input args may be signified by an empty string. The length of the input array is not enough because non optional args may follow optional ones. I encountered this when trying to use the ONNX model found at https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 for example. The LSTM op has a utility which I factored to be more generally accessible, and I have used it in the ops I have recently created or debugged. I believe it is likely that this issue may also manifest in other ops, but I didn't want to change anything that I'm not testing. * fix formatting * fix small mistake made during refactor	2024-09-26 22:57:55 +02:00
Laurent Mazare	ad8a4c5e5a	Add some llama-3.2 examples. (#2508 ) * Add some llama-3.2 examples. * Support tie-word-embeddings for llama.	2024-09-26 21:00:18 +02:00
Guillaume LEGENDRE	c3c392f45c	Merge pull request #2507 from huggingface/ci-move move CI/Cuda runner	2024-09-26 18:48:52 +02:00
Guillaume LEGENDRE	a0184a4fe4	move CI/Cuda runner	2024-09-26 17:09:26 +02:00
Laurent Mazare	10d47183c0	Quantized version of flux. (#2500 ) * Quantized version of flux. * More generic sampling. * Hook the quantized model. * Use the newly minted gguf file. * Fix for the quantized model. * Default to avoid the faster cuda kernels.	2024-09-26 10:23:43 +02:00
Laurent Mazare	d01207dbf3	Add a RotatingKVCache. (#2493 ) * Add a RotatingKVCache. * Add some KvCache tests. * Test the reset too. * More kv-cache testing. * More tests for the rotating kv-cache. * Improve the api for the rotating cache so that the whole src tensor gets returned when it's overlarge. * Handle contiguity + bugfix + use in mimi. * Add a way to test the mimi streaming mode. * Mimi streaming fixes. * More rotating kv-cache. * Fix the attn mask generation. * Handle the abs case. * Add some tests for the generated mask. 0.7.1	2024-09-23 13:14:32 +02:00
Laurent Mazare	8097559c1a	Move the candle version to 0.7.1. (#2495 )	2024-09-22 20:44:39 +02:00
Laurent Mazare	829dcfa8dc	Update cudarc to 0.12.1. (#2494 )	2024-09-22 20:32:29 +02:00
Laurent Mazare	c2fca0ca11	Bump the crate version. (#2491 ) 0.7.0	2024-09-21 15:13:12 +02:00
Laurent Mazare	844d45cde4	Bugfix for the metal elu kernel. (#2490 ) * Bugfix for the metal elu kernel. * Add a test.	2024-09-21 15:03:19 +02:00
Laurent Mazare	af2104078f	Metal commands refactoring (#2489 ) * Split out the commands part of the metal device. * Make most fields private. * Move the allocator back. * Rework the encoder provider type.	2024-09-21 13:18:42 +02:00
Juan Gomez	5fc4f17727	Adding Granite 7b Instruct model example (#2487 ) * Adding Granite 7b Instruct model example * Minor refactoring to make it a little more idiomatic * Clippy fixes. * * Adding a README with some information about supported Granite models * Changing the default prompt to accommodate better the Language modality of the Granite 7b Instruct model --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-09-21 11:52:01 +02:00
Laurent Mazare	c58c5d5b01	Add the mimi audio-tokenizer. (#2488 ) * Add the mimi audio-tokenizer. * Formatting tweaks. * Add a full example. * Use the transformers names. * More renamings. * Get encoding and decoding to work. * Clippy fixes.	2024-09-20 14:31:20 -06:00
ivnsch	382c6b51af	Improve error message (#2485 )	2024-09-20 07:11:41 -06:00
Laurent Mazare	6eea45a761	Add a couple cast metal kernels. (#2479 )	2024-09-15 22:27:46 +02:00
Shengtuo Hu	ebf722b446	Export TensorIndexer public to candle users (#2477 )	2024-09-13 22:21:57 +02:00

2204 Commits