candle-llava
LLaVA (Large Language-and-Vision Assistant) is an end-to-end trained large multimodal model. This example is adapted from the candle-llava project.
The code is based on https://github.com/haotian-liu/LLaVA; hence, the llava-hf version of the config may perform differently.
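At a high level, LLaVA encodes the image with a vision tower, projects the resulting features into the language model's embedding space, and feeds them to the LLM together with the text token embeddings. The following is a minimal, illustrative sketch of that projection step using candle; the dimensions, the token counts, and the `mm_projector` name are assumptions for illustration only, not the example's actual code.

```rust
use candle_core::{DType, Device, Module, Tensor};
use candle_nn::{linear, VarBuilder, VarMap};

fn main() -> candle_core::Result<()> {
    let device = Device::Cpu;
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &device);

    // Hypothetical sizes: 1024-d vision features projected to a 4096-d LLM
    // hidden size; 576 image patch tokens and 32 text tokens.
    let projector = linear(1024, 4096, vb.pp("mm_projector"))?;

    // Stand-ins for the vision-tower output and the text token embeddings.
    let image_features = Tensor::zeros((1, 576, 1024), DType::F32, &device)?;
    let text_embeddings = Tensor::zeros((1, 32, 4096), DType::F32, &device)?;

    // Project image features into the LLM embedding space and splice them
    // into the token sequence before running the language model.
    let image_embeddings = projector.forward(&image_features)?;
    let inputs = Tensor::cat(&[&image_embeddings, &text_embeddings], 1)?;
    println!("fused input shape: {:?}", inputs.dims());
    Ok(())
}
```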
model zoo
Right now this has been tested on liuhaotian/llava-v1.6-vicuna-7b and llava-hf/llava-v1.6-vicuna-7b-hf. Memory usage might still have room for optimization.
Tokenizer Setup
The llava-hf models ship with a tokenizer.json file, so they can be used directly with the --hf command line flag.
For the original llava models, you can use the following commands to generate the tokenizer.json file.
conda create -n llava python=3.10
conda activate llava
pip install transformers protobuf
python -c "from transformers import AutoTokenizer;tokenizer=AutoTokenizer.from_pretrained('liuhaotian/llava-v1.6-vicuna-7b');tokenizer.save_pretrained('tokenizer')"
Then the tokenizer.json file should be in tokenizer/tokenizer.json (which is the default path).
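To sanity-check the generated file, here is a minimal sketch that loads it with the tokenizers crate (the crate candle examples use for tokenization). The path assumes the default location above, and the sample prompt is arbitrary:

```rust
use tokenizers::Tokenizer;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load the tokenizer.json generated above from the default path.
    let tokenizer = Tokenizer::from_file("tokenizer/tokenizer.json")?;
    // Encode a sample prompt and print the resulting token ids.
    let encoding = tokenizer.encode("is this a cat?", true)?;
    println!("token ids: {:?}", encoding.get_ids());
    Ok(())
}
```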
eval
cargo run --example llava --features cuda -- --image-file "llava_logo.png" --prompt "is this a cat?" --hf # default args, uses llava-hf/llava-v1.6-vicuna-7b-hf; --image-file is required
cargo run --example llava --features cuda -- --model-path liuhaotian/llava-v1.6-vicuna-7b --image-file "llava_logo.png" --prompt "is this a cat?" # uses liuhaotian/llava-v1.6-vicuna-7b; requires the tokenizer setup above
Major Limitations
- Currently only llama-2/vicuna LLMs are supported; Mistral is not supported yet.
- Some ops such as split, nonzero, and where are not supported by candle (see the workaround sketch after this list).
- No quantization or LoRA support.
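For the missing nonzero-style indexing in particular, one possible workaround (a hedged sketch under the assumption that the mask is small enough to copy to the host; not code from this example) is to pull the mask off the device and build the index list on the CPU:

```rust
use candle_core::{Device, Result, Tensor};

// Hypothetical helper: emulates a 1-D nonzero op by copying the mask to the
// host and collecting the positions of its non-zero entries.
fn nonzero_1d(mask: &Tensor) -> Result<Vec<u32>> {
    // Copy the mask to host memory.
    let values = mask.to_vec1::<u32>()?;
    // Collect the indices of non-zero entries.
    Ok(values
        .into_iter()
        .enumerate()
        .filter(|(_, v)| *v != 0)
        .map(|(i, _)| i as u32)
        .collect())
}

fn main() -> Result<()> {
    let device = Device::Cpu;
    let mask = Tensor::new(&[0u32, 1, 0, 1, 1], &device)?;
    println!("non-zero positions: {:?}", nonzero_1d(&mask)?);
    Ok(())
}
```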