mirror of https://github.com/huggingface/candle.git
Module Docs (#2624)
* update whisper
* update llama2c
* update t5
* update phi and t5
* add a blip model
* qlamma doc
* add two new docs
* add docs and emoji
* additional models
* openclip
* pixtral
* edits on the model docs
* update yu
* update a few more models
* add persimmon
* add model-level doc
* names
* update module doc
* links in hiera
* remove empty URL
* update more hyperlinks
* updated hyperlinks
* more links
* Update mod.rs

---------

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
@@ -1,13 +1,12 @@
 //! The LLaVA (Large Language and Vision Assistant) model.
 //!
 //! This provides the main model implementation combining a vision tower (CLIP) with
-//! language model (Llama) for multimodal capabilities.
+//! language model (Llama) for multimodal capabilities. The architecture implements the training-free projection technique.
 //!
-//! The architecture implements the training-free projection technique from the paper:
-//! [Visual Instruction Tuning](https://arxiv.org/abs/2304.08485).
-//!
-//! - [GH Link](https://github.com/haotian-liu/LLaVA/tree/main)
+//! - 💻[GH Link](https://github.com/haotian-liu/LLaVA/tree/main)
+//! - 📝 [Paper](https://arxiv.org/abs/2304.08485)/ Visual Instruction Tuning
+//!
 
 pub mod config;
 pub mod utils;