Add the flux model for image generation. (#2390)

* Add the flux autoencoder.
* Add the encoder down-blocks.
* Upsampling in the decoder.
* Sketch the flow matching model.
* More flux model.
* Add some of the positional embeddings.
* Add the rope embeddings.
* Add the sampling functions.
* Add the flux example.
* Fix the T5 bits.
* Proper T5 tokenizer.
* Clip encoder path fix.
* Get the clip embeddings.
* No configurable weights in layer norm.
* More weights related fixes.
* Yet another shape fix.
* DType fix.
* Fix a couple more shape issues.
* DType fixes.
* Fix the latent dims.
* Fix more shape issues.
* Autoencoder fixes.
* Get some generations out.
* Bugfix.
* T5 padding.
* Clippy fix.
* Add the decode only mode.
* Fix.
* More fixes.
* Finally get some generations to work.
* Add readme.
2025-06-20 20:09:50 +00:00 · 2024-08-04 07:14:33 +01:00
parent 0fcb40b229
commit 19db6b9723
8 changed files with 1346 additions and 0 deletions
--- a/candle-examples/examples/flux/README.md
+++ b/candle-examples/examples/flux/README.md
@@ -0,0 +1,19 @@
+# candle-flux: image generation with latent rectified flow transformers
+
+![rusty robot holding a candle](./assets/flux-robot.jpg)
+
+Flux is a 12B-parameter rectified flow transformer that generates images from
+text descriptions. See the
+[Hugging Face model card](https://huggingface.co/black-forest-labs/FLUX.1-schnell),
+the [GitHub repository](https://github.com/black-forest-labs/flux), and the
+[blog post](https://blackforestlabs.ai/announcing-black-forest-labs/).
+
+
+## Running the model
+
+```bash
+cargo run --features cuda --example flux -r -- \
+    --height 1024 --width 1024 \
+    --prompt "a rusty robot walking on a beach holding a small torch, the robot has the word \"rust\" written on it, high quality, 4k"
+```
+