Quantized version of flux. (#2500)

* Quantized version of flux.

* More generic sampling.

* Hook the quantized model.

* Use the newly minted gguf file.

* Fix for the quantized model.

* Default to avoid the faster cuda kernels.
Laurent Mazare
2024-09-26 10:23:43 +02:00
committed by GitHub
parent d01207dbf3
commit 10d47183c0
6 changed files with 555 additions and 26 deletions

@@ -13,7 +13,7 @@ descriptions,
```bash
cargo run --features cuda --example flux -r -- \
-    --height 1024 --width 1024
+    --height 1024 --width 1024 \
--prompt "a rusty robot walking on a beach holding a small torch, the robot has the word "rust" written on it, high quality, 4k"
```