49 Commits

c930ab7e1a upgrade half library to fix rand (#2806)
fix lints
2025-03-14 09:01:54 +01:00
3164a19a5d Add inpainting to the stable diffusion example (#2735)
* Update the stable diffusion example with inpainting support for 1.5, 2 and XL.

* Apply cargo fmt.

* Clippy fixes.
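
As a rough illustration of what inpainting adds to the denoising loop (not this commit's code; the names and candle-style ops here are assumptions), each step blends the freshly denoised latents with a noised copy of the init-image latents under the mask:

```rust
use candle::{Result, Tensor};

// Hypothetical sketch: `mask` is 1 where the model should repaint and
// 0 where the original image must be kept; `noised_init` is the
// init-image latents noised to the current timestep.
fn blend_latents(latents: &Tensor, noised_init: &Tensor, mask: &Tensor) -> Result<Tensor> {
    let keep = mask.affine(-1.0, 1.0)?; // 1 - mask
    let repaint = latents.broadcast_mul(mask)?;
    let kept = noised_init.broadcast_mul(&keep)?;
    repaint + kept
}
```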

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2025-01-23 10:08:38 +01:00
cbaa0ad46f UniPC for diffusion sampling (#2684)
* feat: Add unipc multistep scheduler

* chore: Clippy and formatting

* chore: Update comments

* chore: Avoid unsafety in float ordering

* refactor: Update Scheduler::step mutability requirements

* fix: Corrector img2img

* chore: Update unipc ref link to latest diffusers release

* chore: Deduplicate float ordering

* fix: Panic when running with dev profile
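
On the float-ordering items above: one way to order f64 values without `unsafe` or NaN panics is the standard library's total order, available since Rust 1.62 (a general sketch, not the scheduler's exact code):

```rust
// Sketch: ordering floats safely, with NaNs given a consistent place
// instead of causing panics or requiring unsafe transmutes.
fn sorted(mut xs: Vec<f64>) -> Vec<f64> {
    xs.sort_by(|a, b| a.total_cmp(b));
    xs
}

fn max(xs: &[f64]) -> Option<f64> {
    xs.iter().copied().max_by(|a, b| a.total_cmp(b))
}
```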
2025-01-01 21:34:17 +01:00
6056fd5c90 onnx: fix pad, unsqueeze (#2317)
* onnx: fix pad, unsqueeze

Both implementations have off-by-one errors:
- Pad 'reflect' cycle for eg `dim==3` is `[0,1,2,1]` which has length of
  4 (or `dim*2 - 2`) not 5 (current code `dim*2 - 1`)
- Unsqueeze(-1) for tensor with `dim==3` should be 3 (ie `dim+index+1`)
  not 2 (ie currently `dim+index`)

In addition, Pad incorrectly calculates the starting padding.
If we want to pad out 2 elements to the start, and we have this cycle
of indices of length 6, then we should skip 4 elements, but currently
we skip 2. A more visual representation of what's going on is below:

```
pad_start: 2
data:      [a,b,c,d]
indices:   [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..] // zigzag between 0..4
actual:    skip [ c  d| c  b  a  b]
expected:  ~  skip  ~ [ c  b| a  b  c  d]
```

The values between `[` and `|` are padding and the values between
`|` and `]` in the example should match the original data being padded.
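
A small standalone sketch of the corrected arithmetic (illustrative only, not the onnx implementation itself):

```rust
// Sketch: indices for 'reflect' padding along a dimension of size
// `dim` (dim >= 2). The zigzag cycle 0,1,..,dim-1,dim-2,..,1 has
// length dim*2 - 2, and padding `pad_start` elements at the start
// means beginning `pad_start` steps *before* index 0 in the cycle.
fn reflect_indices(dim: usize, pad_start: usize, out_len: usize) -> Vec<usize> {
    let cycle_len = dim * 2 - 2; // e.g. dim == 3 -> [0, 1, 2, 1], length 4
    (0..out_len)
        .map(|i| {
            let p = (i + cycle_len - pad_start % cycle_len) % cycle_len;
            if p < dim { p } else { cycle_len - p }
        })
        .collect()
}
```

For the example above, `reflect_indices(4, 2, 6)` returns `[2, 1, 0, 1, 2, 3]`, i.e. `c b a b c d` as expected.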

* Fix clippy lints.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-07-23 23:10:57 +02:00
4d14777673 Utilize batches in Stable Diffusion (#2071)
* Use the batch support in Stable Diffusion that was already there but went unused.

Also refactor out the `save_image` function.

* Clippy + cosmetic fixes.
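
For reference, a minimal sketch of what such a `save_image` helper can look like with the `image` crate (illustrative; the helper in the repo may differ):

```rust
use candle::{Result, Tensor};

// Sketch: write a (3, height, width) u8 tensor to disk as an image.
fn save_image<P: AsRef<std::path::Path>>(img: &Tensor, path: P) -> Result<()> {
    let (_c, h, w) = img.dims3()?;
    // CHW -> HWC, then flatten to raw RGB bytes.
    let pixels = img.permute((1, 2, 0))?.flatten_all()?.to_vec1::<u8>()?;
    let buf = image::RgbImage::from_raw(w as u32, h as u32, pixels)
        .ok_or_else(|| candle::Error::Msg("buffer size mismatch".to_string()))?;
    buf.save(path).map_err(|e| candle::Error::Msg(e.to_string()))?;
    Ok(())
}
```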

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2024-04-16 06:49:04 +02:00
a00e24d752 Improve the error message on overlong prompts. (#1908) 2024-03-21 21:08:07 +01:00
0a3487a776 Add a --seed argument to the stable-diffusion example. (#1812)
* Add a --seed argument to the stable-diffusion example.

* When no seed is specified, do not set one and rely on the engine's default. This makes the CPU engine work again when no --seed is given, and causes a bailout when a seed is given, as that engine does not currently support seeding.
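
Sketched roughly (a hypothetical snippet assuming the example's clap-based CLI and candle's `Device::set_seed`):

```rust
// Sketch: only touch the RNG when a seed is actually given, so
// backends without seeding support keep working by default.
#[derive(clap::Parser)]
struct Args {
    /// The seed to use when generating random samples.
    #[arg(long)]
    seed: Option<u64>,
}

fn setup_rng(device: &candle::Device, args: &Args) -> candle::Result<()> {
    if let Some(seed) = args.seed {
        device.set_seed(seed)?; // bails out if the backend cannot seed
    }
    Ok(())
}
```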

---------

Co-authored-by: niklas <niklas@appli.se>
2024-03-08 08:17:36 +01:00
32eb56d6b3 Fix typo in README (#1740) 2024-02-22 12:35:26 +01:00
03ce8caf40 Format properly the Stable Diffusion example run with params (#1511)
Move the --sd-version flag out of the prompt.
2024-01-01 11:13:35 +01:00
0738df5290 Add more mentions to SDXL Turbo in the readme. (#1397) 2023-12-03 10:41:21 +00:00
37bf1ed012 Stable Diffusion Turbo Support (#1395)
* Add support for SD Turbo

* Set Leading as default in euler_ancestral discrete

* Use the appropriate default values for n_steps and guidance_scale.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-03 08:37:10 +01:00
92a05b51cf fix: address clippy 0.1.74 issues (#1336)
- clippy::needless-borrows-for-generic-args
- clippy::reserve-after-initialization
2023-11-16 21:15:22 +00:00
25c3cc4149 Mention the flash-attention restriction in the readme. (#1158) 2023-10-23 10:26:56 +01:00
b34d7f0248 Remove some unused bits. (#1067) 2023-10-09 19:49:57 +01:00
4d04ac83c7 Override the repo for SDXL f16 vae weights. (#1064)
* Override the repo for SDXL f16 vae weights.

* Slightly simpler change.
2023-10-09 06:52:28 +01:00
716883e9b0 Add the clamping for stable-diffusion. (#1041) 2023-10-05 22:20:39 +01:00
c5a058b169 Use the module trait in stable-diffusion. (#817) 2023-09-11 20:40:07 +01:00
5c35fbbb13 Stable-Diffusion readme (#814)
* Stable Diffusion readme.

* Fix the image path.

* Move the assets.

* Resize the sample image.

* Lower resolution.
2023-09-11 13:06:29 +01:00
d7b9fec849 Move the stable-diffusion modeling code so that it's easier to re-use. (#812) 2023-09-11 11:45:57 +01:00
dcf708559d Fix for cudnn to work with img2img. (#753) 2023-09-06 07:49:28 +01:00
7299a68353 img2img pipeline for stable diffusion. (#752)
* img2img pipeline for stable diffusion.

* Rename the arguments + fix.

* Fix for zero strength.

* Another fix.

* Another fix.

* Revert.

* Include the backtrace.

* Noise scaling.

* Fix the height/width.
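
The strength handling reduces to picking a starting step; a small self-contained sketch (illustrative, not this commit's code):

```rust
// Sketch: derive the first denoising step for img2img from `strength`.
// strength == 1.0 -> start at step 0 (pure noise, full run);
// strength == 0.0 -> start past the last step (init image kept as-is).
fn img2img_start_step(strength: f64, n_steps: usize) -> usize {
    assert!((0. ..=1.).contains(&strength), "strength must be in [0, 1]");
    n_steps.saturating_sub((strength * n_steps as f64) as usize)
}
```

The denoising loop then runs from that step onward; zero strength skips every step, which is the "fix for zero strength" case above.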
2023-09-06 07:06:49 +01:00
1c9e5394a5 Add a custom softmax implementation. (#744)
* Add a custom softmax implementation.

* Add softmaxlastdim to the benchmarks.

* And add a test.

* Support more dtypes.

* Polish the code.

* Use the slow implementation on cuda.

* Add a todo for the cuda kernel.
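
The semantics being optimized are just the numerically stable last-dim softmax; in plain candle tensor ops it is roughly (a sketch of the reference behavior, not the custom kernel):

```rust
use candle::{Result, Tensor, D};

// Sketch: softmax over the last dimension, subtracting the per-row
// max first so exp() cannot overflow.
fn softmax_last_dim(xs: &Tensor) -> Result<Tensor> {
    let max = xs.max_keepdim(D::Minus1)?;
    let exp = xs.broadcast_sub(&max)?.exp()?;
    let sum = exp.sum_keepdim(D::Minus1)?;
    exp.broadcast_div(&sum)
}
```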
2023-09-05 14:20:23 +01:00
2d3fcad267 Simplify usage of the pool functions. (#662)
* Simplify usage of the pool functions.

* Small tweak.

* Attempt at using apply to simplify the convnet definition.
2023-08-29 19:12:16 +01:00
a044907ffc Dilated convolutions (#657)
* Add the dilation parameter.

* Restore the basic optimizer example.

* Dilation support in cudnn.

* Use the dilation parameter in the cpu backend.

* More dilation support.

* No support for dilation in transposed convolutions.

* Add dilation to a test.

* Remove a print.

* Helper function.
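
Usage-wise the new parameter threads straight through `conv2d`; a sketch assuming candle's `(padding, stride, dilation, groups)` argument order:

```rust
use candle::{Result, Tensor};

// Sketch: a 3x3 kernel with dilation 2 covers a 5x5 receptive field
// (dilation * (k - 1) + 1 = 5) while keeping only 9 weights.
fn dilated_conv(input: &Tensor, kernel: &Tensor) -> Result<Tensor> {
    input.conv2d(kernel, /* padding */ 2, /* stride */ 1, /* dilation */ 2, /* groups */ 1)
}
```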
2023-08-29 16:12:11 +01:00
62ef494dc1 Use multiple transformer layers in the same cross-attn blocks. (#653)
* Use multiple transformer layers in the same cross-attn blocks.

* Make the context contiguous if required.
2023-08-29 11:13:43 +01:00
33c23c19b6 Preliminary support for SDXL. (#647)
* Preliminary support for SDXL.

* More SDXL support.

* More SDXL.

* Use the proper clip config.

* Querying for existing tensors.

* More robust test.
2023-08-29 09:00:04 +01:00
72ebb12bca Remove some dead-code annotations. (#629)
* Remove some dead-code annotations.

* More dead code removal.

* One more.

* CI fix.
2023-08-27 18:52:33 +01:00
329f661d9b Trace softmax (#568)
* Trace the softmax op.

* Inline the sum.

* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
aba1e90797 Add some group parameter to convolutions. (#566)
* Add some group parameter to convolutions.

* Avoid some unnecessary groups checks.

* Move the tensor convolution bits.

* Proper handling of groups.

* Bump the crate version.

* And add a changelog.
2023-08-23 12:58:55 +01:00
aa207f2dd9 Print some per-step timings in stable-diffusion. (#520)
* Skeleton files for neon support of quantization.

* SIMD version for q4 vecdot.

* Also simdify the q6k multiplication.

* Add some timings to stable-diffusion.
2023-08-20 05:45:12 +01:00
4f1541526c dinov2 - read images from disk and compute the class probabilities (#503)
* Load the image from disk and convert it to a tensor.

* Tweak the function name.
2023-08-18 15:50:33 +01:00
c78ce76501 Add a simple Module trait and implement it for the various nn layers (#500)
* Start adding the module trait.

* Use the module trait.

* Implement module for qmatmul.
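
The trait itself is tiny; in spirit (a sketch, modulo the exact definition in the crate):

```rust
use candle::{Result, Tensor};

// Sketch: anything that maps a tensor to a tensor, so individual
// layers and whole models compose uniformly.
pub trait Module {
    fn forward(&self, xs: &Tensor) -> Result<Tensor>;
}
```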
2023-08-18 09:38:22 +01:00
5d99026fd2 F16 support for stable diffusion (#488)
* F16 support for stable diffusion.

* Keep the attention bits in F32.

* Keep more of the attention bits in F32.

* More mixed precision support.
2023-08-17 13:48:56 +01:00
c3176f0dfb Flash-attention support in stable diffusion (#487)
* Add flash-attention for the stable-diffusion example.

* Change the dtype.

* Silly fix.

* Another fix.

* Revert the dtype back to the query dtype after applying flash-attn.
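
The dtype dance in the last item amounts to something like this sketch (assuming `candle_flash_attn::flash_attn`, whose kernels want half-precision inputs):

```rust
use candle::{DType, Result, Tensor};

// Sketch: cast to f16 for the flash-attn kernel, then cast the
// result back to whatever dtype the query came in with.
fn attn(q: &Tensor, k: &Tensor, v: &Tensor, scale: f32) -> Result<Tensor> {
    let init_dtype = q.dtype();
    let q = q.to_dtype(DType::F16)?;
    let k = k.to_dtype(DType::F16)?;
    let v = v.to_dtype(DType::F16)?;
    candle_flash_attn::flash_attn(&q, &k, &v, scale, /* causal */ false)?.to_dtype(init_dtype)
}
```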
2023-08-17 12:16:40 +01:00
9af438ac1b Track the conv2d operations in stable-diffusion. (#431)
* Track the conv2d operations in stable-diffusion.

* Add more tracing to stable-diffusion.

* Also trace the resnet bits.

* Trace the attention blocks.

* Also trace the attention inner part.

* Small tweak.
2023-08-13 15:58:26 +01:00
b1ff78f762 Allow using accelerate with stable-diffusion. (#430) 2023-08-13 14:14:20 +01:00
1d0157bbc4 Stable diffusion: retrieve the model files from the HF hub. (#414)
* Retrieve the model files from the HF hub in the stable diffusion example.

* Add to the readme.
2023-08-11 18:57:06 +01:00
80f0482f26 Fix the stable-diffusion vae. (#398)
* Fix the stable-diffusion vae.

* Fix for saving images.
2023-08-10 18:24:31 +01:00
3a62aee91f Write the generated images using the image crate. (#363)
* Use the image crate to write the generated images.

* Make the dependency optional.
2023-08-09 15:26:44 +01:00
be21d7e75a Fix the padding used in stable diffusion. (#362) 2023-08-09 13:23:59 +01:00
89d3926c9b Fixes for the stable diffusion example. (#342)
* Fixes for the stable diffusion example.

* Bugfix.

* Another fix.

* Fix for group-norm.

* More fixes to get SD to work.
2023-08-08 14:57:09 +01:00
fc265d9dcf Some CLIP fixes for stable diffusion. (#338)
* Some CLIP fixes for stable diffusion.

* Add the avg-pool2d operation on cpu.
2023-08-07 18:31:45 +01:00
2345b8ce3f Skeleton for the avg-pool2d and upsample-nearest2d ops. (#337)
* Skeleton for the avg-pool2d and upsample-nearest2d ops.

* Preliminary conv2d support.
2023-08-07 16:15:38 +01:00
f53a333ea9 Simple pad support. (#336)
* Simple pad support.

* Fix the tensor indexing when padding.
2023-08-07 15:24:56 +01:00
5bb2fce998 Implement group-norm. (#334)
* Implement group-norm.

* Add some testing for group-norm.
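
Group-norm reshapes the channels into groups and normalizes each group over its channel and spatial elements; in candle-style ops, roughly (a sketch without the learned scale and bias, not the implementation added here):

```rust
use candle::{Result, Tensor, D};

// Sketch: group normalization on a (batch, channels, h, w) tensor.
fn group_norm(xs: &Tensor, num_groups: usize, eps: f64) -> Result<Tensor> {
    let (b, c, h, w) = xs.dims4()?;
    let xs2 = xs.reshape((b, num_groups, c / num_groups * h * w))?;
    let mean = xs2.mean_keepdim(D::Minus1)?;
    let centered = xs2.broadcast_sub(&mean)?;
    let var = centered.sqr()?.mean_keepdim(D::Minus1)?;
    let normed = centered.broadcast_div(&(var + eps)?.sqrt()?)?;
    normed.reshape((b, c, h, w))
}
```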
2023-08-07 06:53:05 +01:00
141df4ad2b Main diffusion loop for the SD example. (#332) 2023-08-06 21:39:53 +01:00
166bfd5847 Add the recip op + use it in stable-diffusion. (#331)
* Add the recip unary op.

* Fix the cuda kernel.

* Use the recip op in sigmoid.
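
With `recip` available, sigmoid falls out as `1 / (1 + exp(-x))`; a candle-style sketch:

```rust
use candle::{Result, Tensor};

// Sketch: sigmoid(x) = 1 / (1 + exp(-x)), expressed with the recip op.
fn sigmoid(xs: &Tensor) -> Result<Tensor> {
    (xs.neg()?.exp()? + 1.0)?.recip()
}
```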
2023-08-06 21:14:52 +01:00
1c062bf06b Add the ddim scheduler. (#330) 2023-08-06 20:44:00 +01:00
d34039e352 Add a stable diffusion example (#328)
* Start adding a stable-diffusion example.

* Proper computation of the causal mask.

* Add the chunk operation.

* Work in progress: port the attention module.

* Add some dummy modules for conv2d and group-norm, get the attention module to compile.

* Re-enable the 2d convolution.

* Add the embeddings module.

* Add the resnet module.

* Add the unet blocks.

* Add the unet.

* And add the variational auto-encoder.

* Use the pad function from utils.
2023-08-06 17:49:43 +01:00