Add Pixtral. (#2521)

* Add Pixtral.

* More pixtral vision encoder.

* Sketch a pixtral example.

* Better image loading.

* Support loading images embedded in safetensor files.

* Clippy fixes.

* Add the llava multimodal adapter.

* Add more of the llava bits.

* Add the pixtral config.

* More pixtral inference.

* Add the text generation bits.

* Get the example to work.

* Bugfix.

* Run some bits of the model in f32.

* Blessed version :)

* Better rope frequency computations.

* README update.
Laurent Mazare
2024-09-30 19:31:14 +02:00
committed by GitHub
parent 2f49e1b534
commit 683ab698de
9 changed files with 822 additions and 19 deletions

@@ -279,7 +279,7 @@ impl LLaVA {
         (),
     ))?
 } else {
-    todo!("not implemented in original python LLaVA yet")
+    bail!("not implemented in original python LLaVA yet")
 };
 let new_image_feature = if mm_patch_merge_type.contains("unpad") {
     let new_image_feature = new_image_feature