Add Pixtral. (#2521)

* Add Pixtral.
* More pixtral vision encoder.
* Sketch a pixtral example.
* Sketch a pixtral example.
* Better image loading.
* Support loading images embedded in safetensor files.
* Clippy fixes.
* Add the llava multimodal adapter.
* Add more of the llava bits.
* Add the pixtral config.
* More pixtral inference.
* Add the text generation bits.
* Get the example to work.
* Bugfix.
* Run some bits of the model in f32.
* Blessed version :)
* Better rope frequency computations.
* README update.
2025-06-20 04:00:28 +00:00 · 2024-09-30 19:31:14 +02:00
parent 2f49e1b534
commit 683ab698de
9 changed files with 822 additions and 19 deletions
--- a/candle-transformers/src/models/llava/mod.rs
+++ b/candle-transformers/src/models/llava/mod.rs
@@ -279,7 +279,7 @@ impl LLaVA {
                            (),
                        ))?
                    } else {
-                        todo!("not implemented in original python LLaVA yet")
+                        bail!("not implemented in original python LLaVA yet")
                    };
                    let new_image_feature = if mm_patch_merge_type.contains("unpad") {
                        let new_image_feature = new_image_feature