candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-15 18:28:24 +00:00

Author	SHA1	Message	Date
Laurent Mazare	90d04ff622	Support whisper large-v3 turbo in the whisper-microphone example. (#2533 )	2024-10-02 22:09:14 +02:00
Laurent Mazare	7b60bda4ed	Add support for cuda streams. (#2532 )	2024-10-02 21:30:58 +02:00
Laurent Mazare	936300678d	Add whisper large-v3 turbo to the example. (#2531 )	2024-10-02 21:07:08 +02:00
Laurent Mazare	f479840ce6	Add a seed to the flux example. (#2529 )	2024-10-02 10:52:02 +02:00
Laurent Mazare	fd08d3d0a4	Tweak some metal tests. (#2528 )	2024-10-02 10:22:31 +02:00
Anubhab Bandyopadhyay	a2bcc227df	Efficient implementation of `Tensor::ones()` for `metal` (#2512 ) * WIP: hopefully better const impl * with GPU * More tests on * Reverting primitive for * Incorporating review changes - added check elem count check in kerner, using for call strategy * rustfmt ran	2024-10-01 19:11:59 +02:00
Laurent Mazare	def4c6cdee	Cuda quantized mmv bugfix. (#2526 )	2024-10-01 12:57:55 +02:00
Akshay Ballal	888d886dd8	Add ColPali (#2524 ) * add colpali * cleanup * fix clippy	2024-10-01 11:48:39 +02:00
Laurent Mazare	6110ad8d4f	Refactor the whisper microphone example. (#2523 ) * Refactor the whisper microphone example. * Tweak the whisper microphone example more.	2024-10-01 00:24:17 +02:00
Justin Sing	aa35bf2ff5	Add/lstm direction (#2455 ) * add: direction for lstm layer * lint: remove unused Error import * refactor: remove unnecessary int assignment to Direction enum: * refactor: use &'static str type instead of String for direction_str: * Run cargofmt. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-09-30 22:44:07 +02:00
Laurent Mazare	724650446c	Yet another cuda qmm padding fix. (#2509 )	2024-09-30 21:53:30 +02:00
Laurent Mazare	dfe9a00683	Pixtral polishing. (#2522 ) * Pixtral polishing. * Clippy fix.	2024-09-30 21:23:54 +02:00
Laurent Mazare	683ab698de	Add Pixtral. (#2521 ) * Add Pixtral. * More pixtral vision encoder. * Sketch a pixtral example. * Sketch a pixtral example. * Better image loading. * Support loading images embedded in safetensor files. * Clippy fixes. * Add the llava multimodal adapter. * Add more of the llava bits. * Add the pixtral config. * More pixtral inference. * Add the text generation bits. * Get the example to work. * Bugfix. * Run some bits of the model in f32. * Blessed version :) * Better rope frequency computations. * README update.	2024-09-30 19:31:14 +02:00
Laurent Mazare	2f49e1b534	Add PaliGemma. (#2519 ) * Add PaliGemma. * PaliGemma inference loop. * Running PaliGemma example. * Tweak the prompt.	2024-09-29 19:56:56 +02:00
Laurent Mazare	0ebb38813b	Paligemma siglip vision config (#2518 ) * Add the paligemma siglip vision config. * More paligemma configs.	2024-09-29 17:53:52 +02:00
Laurent Mazare	3a3c48b14b	Bump the crate version to 0.7.2. (#2517 ) 0.7.2	2024-09-29 10:56:50 +02:00
Laurent Mazare	261ed65f36	Add the SigLIP model. (#2515 ) * Add the SigLIP model. * Add more to the forward pass of the vision model. * Complete the forward pass. * Add the siglip example. * Fix. * Another fix. * Get everything in place. * Add a readme.	2024-09-28 23:48:00 +02:00
Laurent Mazare	62525e8352	Remove some extra whitelines. (#2513 )	2024-09-28 14:41:28 +02:00
Laurent Mazare	2c25754281	Clippy fixes for onnx + fix a broken test. (#2510 )	2024-09-26 23:37:59 +02:00
Steven Lovegrove	ed48f54b54	Expand split ops (#2505 ) * candle-onnx: Add Split and Expand operators, Fix Where Op Implemented based on https://github.com/onnx/onnx/blob/main/docs/Operators.md Test cases based on those examples. TODO: Should add the remaining Split examples as tests TODO: Add.test case that motivates Where fix * candle-onnx: Add ReduceSum operator Implemented based on https://github.com/onnx/onnx/blob/main/docs/Operators.md Test cases based on those examples. TODO: Should add the remaining ReduceSum examples as tests * candle-onnx: Add ReduceL2 operator Implemented based on https://github.com/onnx/onnx/blob/main/docs/Operators.md Test cases based on those examples. TODO: Should add the remaining ReduceSum examples as tests * candle-onnx: Fix Clip operator empty string as default arg issue Optional input args may be signified by an empty string. The length of the input array is not enough because non optional args may follow optional ones. I encountered this when trying to use the ONNX model found at https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 for example. The LSTM op has a utility which I factored to be more generally accessible, and I have used it in the ops I have recently created or debugged. I believe it is likely that this issue may also manifest in other ops, but I didn't want to change anything that I'm not testing. * fix formatting * fix small mistake made during refactor	2024-09-26 22:57:55 +02:00
Laurent Mazare	ad8a4c5e5a	Add some llama-3.2 examples. (#2508 ) * Add some llama-3.2 examples. * Support tie-word-embeddings for llama.	2024-09-26 21:00:18 +02:00
Guillaume LEGENDRE	c3c392f45c	Merge pull request #2507 from huggingface/ci-move move CI/Cuda runner	2024-09-26 18:48:52 +02:00
Guillaume LEGENDRE	a0184a4fe4	move CI/Cuda runner	2024-09-26 17:09:26 +02:00
Laurent Mazare	10d47183c0	Quantized version of flux. (#2500 ) * Quantized version of flux. * More generic sampling. * Hook the quantized model. * Use the newly minted gguf file. * Fix for the quantized model. * Default to avoid the faster cuda kernels.	2024-09-26 10:23:43 +02:00
Laurent Mazare	d01207dbf3	Add a RotatingKVCache. (#2493 ) * Add a RotatingKVCache. * Add some KvCache tests. * Test the reset too. * More kv-cache testing. * More tests for the rotating kv-cache. * Improve the api for the rotating cache so that the whole src tensor gets returned when it's overlarge. * Handle contiguity + bugfix + use in mimi. * Add a way to test the mimi streaming mode. * Mimi streaming fixes. * More rotating kv-cache. * Fix the attn mask generation. * Handle the abs case. * Add some tests for the generated mask. 0.7.1	2024-09-23 13:14:32 +02:00
Laurent Mazare	8097559c1a	Move the candle version to 0.7.1. (#2495 )	2024-09-22 20:44:39 +02:00
Laurent Mazare	829dcfa8dc	Update cudarc to 0.12.1. (#2494 )	2024-09-22 20:32:29 +02:00
Laurent Mazare	c2fca0ca11	Bump the crate version. (#2491 ) 0.7.0	2024-09-21 15:13:12 +02:00
Laurent Mazare	844d45cde4	Bugfix for the metal elu kernel. (#2490 ) * Bugfix for the metal elu kernel. * Add a test.	2024-09-21 15:03:19 +02:00
Laurent Mazare	af2104078f	Metal commands refactoring (#2489 ) * Split out the commands part of the metal device. * Make most fields private. * Move the allocator back. * Rework the encoder provider type.	2024-09-21 13:18:42 +02:00
Juan Gomez	5fc4f17727	Adding Granite 7b Instruct model example (#2487 ) * Adding Granite 7b Instruct model example * Minor refactoring to make it a little more idiomatic * Clippy fixes. * * Adding a README with some information about supported Granite models * Changing the default prompt to accomodate better the Language modality of the Granite 7b Instruct model --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-09-21 11:52:01 +02:00
Laurent Mazare	c58c5d5b01	Add the mimi audio-tokenizer. (#2488 ) * Add the mimi audio-tokenizer. * Formatting tweaks. * Add a full example. * Use the transformers names. * More renamings. * Get encoding and decoding to work. * Clippy fixes.	2024-09-20 14:31:20 -06:00
ivnsch	382c6b51af	Improve error message (#2485 )	2024-09-20 07:11:41 -06:00
Laurent Mazare	6eea45a761	Add a couple cast metal kernels. (#2479 )	2024-09-15 22:27:46 +02:00
Shengtuo Hu	ebf722b446	Export TensorIndexer public to candle users (#2477 )	2024-09-13 22:21:57 +02:00
Laurent Mazare	c09afc211c	Fix for metal tanh. (#2475 )	2024-09-13 07:08:36 +02:00
Laurent Mazare	b60faebea4	Missing metal kernels. (#2474 )	2024-09-12 13:58:50 +02:00
Laurent Mazare	72d649058b	Hook the MLX matmul kernels in candle-core. (#2473 )	2024-09-12 13:52:59 +02:00
Laurent Mazare	0cb0bd1dfa	Add some metal gemm benchark. (#2471 ) * Add some metal gemm benchark. * More benchmarks.	2024-09-11 22:52:37 +02:00
Laurent Mazare	afb6575835	Use the new MLX kernels to handle the BF16 matmul. (#2470 )	2024-09-11 17:34:05 +02:00
Laurent Mazare	5635650d38	Integrate the MLX gemm kernels (#2468 ) * Include the MLX gemm kernels. * Clippy lints. * Export the gemm_f32 kernel. * Add the f16/bf16 variants. * Add the initial dispatch code. * More plugging of the mlx kernels. * Add a currently broken test. * Tweaks. * Bugfix + get the tests to pass. * Enable the gemm bf16 tests. * Add some randomized tests. * Update candle-metal-kernels/src/lib.rs Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com> * More fixes. * More clippy fixes. --------- Co-authored-by: ivarflakstad <69173633+ivarflakstad@users.noreply.github.com>	2024-09-11 16:56:48 +02:00
hongmengning	13b2a8a4a0	Complete the missing backticks in the comments (#2469 )	2024-09-11 16:37:05 +02:00
Laurent Mazare	e3261216b1	Clippy fixes for 1.81.0. (#2461 ) * Clippy fixes for 1.81.0. * Another fix.	2024-09-05 23:46:55 +02:00
Eugene Hauptmann	c02b7c3272	Fix FLUX.1 weights (#2457 ) * fix FLUX.1 weights * added flux1-dev.safetensors	2024-08-29 17:10:28 +02:00
Jani Monoses	86613c00e2	MobileCLIP models S1 and S2 (#2454 ) * Allow loading images with given std and mean * OpenCLIP text encoder component * Two MobileCLIP models * Clippy fixes. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-08-29 15:38:58 +02:00
Jani Monoses	29e25c458d	FastViT fixes. (#2452 ) * correct optional SE layer dimensions. * head_dim instead of num_heads is 32. * update test example output.	2024-08-28 11:20:09 +02:00
Laurent Mazare	aafa24ed93	Update cudarc to 0.12. (#2451 ) * Update cudarc to 0.12. * Some cudnn tweaks.	2024-08-27 10:10:30 +02:00
ilookee	fdc2622686	fix: qwen2 lm_head loading #2443 (#2445 ) Co-authored-by: Yi Xu <xuyi@me.com>	2024-08-23 16:50:02 +02:00
Jani Monoses	ccdbe87639	Add FastViT model. (#2444 )	2024-08-23 16:06:54 +02:00
Laurent Mazare	2ec8729d51	Fix for parler-tts, do not add the last slice of padding tokens. (#2442 ) * Fix for parler-tts, do not add the last slice of padding tokens. * Support for the mini model.	2024-08-22 23:22:03 +02:00

... 2 3 4 5 6 ...

2339 Commits