diff --git a/README.md b/README.md
index eb7c189b..8c076ec7 100644
--- a/README.md
+++ b/README.md
@@ -56,7 +56,7 @@ These online demos run entirely in your browser:
 - [T5](https://huggingface.co/spaces/radames/Candle-T5-Generation-Wasm): text generation.
 - [Phi-v1.5](https://huggingface.co/spaces/radames/Candle-Phi-1.5-Wasm): text generation.
 - [Segment Anything Model](https://huggingface.co/spaces/radames/candle-segment-anything-wasm): Image segmentation.
-- [Blip](https://huggingface.co/spaces/radames/Candle-BLIP-Image-Captioning): image captioning.
+- [BLIP](https://huggingface.co/spaces/radames/Candle-BLIP-Image-Captioning): image captioning.
 
 We also provide a some command line based examples using state of the art models:
 
@@ -96,7 +96,8 @@ We also provide a some command line based examples using state of the art models
 
 - [Whisper](./candle-examples/examples/whisper/): speech recognition model.
 
-- [T5](./candle-examples/examples/t5), [Bert](./candle-examples/examples/bert/): useful for sentence embeddings.
+- [T5](./candle-examples/examples/t5), [Bert](./candle-examples/examples/bert/),
+  [JinaBert](./candle-examples/examples/jina-bert/): useful for sentence embeddings.
 
 - [DINOv2](./candle-examples/examples/dinov2/): computer vision model trained using
   self-supervision (can be used for imagenet classification, depth evaluation, segmentation).
diff --git a/candle-examples/examples/jina-bert/README.md b/candle-examples/examples/jina-bert/README.md
new file mode 100644
index 00000000..02afbaa9
--- /dev/null
+++ b/candle-examples/examples/jina-bert/README.md
@@ -0,0 +1,45 @@
+# candle-jina-bert
+
+Jina-Bert is a general-purpose text embedding model with a context size of 8192, [model
+card](https://huggingface.co/jinaai/jina-embeddings-v2-base-en). In this example
+it can be used for two different tasks:
+- Compute sentence embeddings for a prompt.
+- Compute similarities between a set of sentences.
+
+## Sentence embeddings
+
+Jina-Bert is used to compute the sentence embeddings for a prompt. The model weights
+are downloaded from the hub on the first run.
+
+```bash
+cargo run --example jina-bert --release -- --prompt "Here is a test sentence"
+
+> [[[ 0.1595, -0.9885,  0.6494, ...,  0.3003, -0.6901, -1.2355],
+>  [ 0.0374, -0.1798,  1.3359, ...,  0.6731,  0.2133, -1.6807],
+>  [ 0.1700, -0.8534,  0.8924, ..., -0.1785, -0.0727, -1.5087],
+>  ...
+>  [-0.3113, -1.3665,  0.2027, ..., -0.2519,  0.1711, -1.5811],
+>  [ 0.0907, -1.0492,  0.5382, ...,  0.0242, -0.7077, -1.0830],
+>  [ 0.0369, -0.6343,  0.6105, ...,  0.0671,  0.3778, -1.1505]]]
+> Tensor[[1, 7, 768], f32]
+```
+
+## Similarities
+
+In this example, Jina-Bert is used to compute the sentence embeddings for a set of
+sentences (hardcoded in the example). Cosine similarities are then computed for
+each sentence pair and reported in decreasing order, so the first reported pair
+contains the two sentences with the highest similarity score.
+The sentence embeddings are computed using average pooling over all the
+sentence tokens, including any padding tokens.
+
+```bash
+cargo run --example jina-bert --release
+
+> score: 0.94 'The new movie is awesome' 'The new movie is so great'
+> score: 0.81 'The cat sits outside' 'The cat plays in the garden'
+> score: 0.78 'I love pasta' 'Do you like pizza?'
+> score: 0.68 'I love pasta' 'The new movie is awesome'
+> score: 0.67 'A man is playing guitar' 'A woman watches TV'
+```
diff --git a/candle-examples/examples/jina-bert/main.rs b/candle-examples/examples/jina-bert/main.rs
index ffde777d..d959d4cb 100644
--- a/candle-examples/examples/jina-bert/main.rs
+++ b/candle-examples/examples/jina-bert/main.rs
@@ -35,19 +35,37 @@ struct Args {
     normalize_embeddings: bool,
 
     #[arg(long)]
-    tokenizer: String,
+    tokenizer: Option<String>,
 
     #[arg(long)]
-    model: String,
+    model: Option<String>,
 }
 
 impl Args {
     fn build_model_and_tokenizer(&self) -> anyhow::Result<(BertModel, tokenizers::Tokenizer)> {
+        use hf_hub::{api::sync::Api, Repo, RepoType};
+        let model = match &self.model {
+            Some(model_file) => std::path::PathBuf::from(model_file),
+            None => Api::new()?
+                .repo(Repo::new(
+                    "jinaai/jina-embeddings-v2-base-en".to_string(),
+                    RepoType::Model,
+                ))
+                .get("model.safetensors")?,
+        };
+        let tokenizer = match &self.tokenizer {
+            Some(file) => std::path::PathBuf::from(file),
+            None => Api::new()?
+                .repo(Repo::new(
+                    "sentence-transformers/all-MiniLM-L6-v2".to_string(),
+                    RepoType::Model,
+                ))
+                .get("tokenizer.json")?,
+        };
         let device = candle_examples::device(self.cpu)?;
         let config = Config::v2_base();
-        let tokenizer = tokenizers::Tokenizer::from_file(&self.tokenizer).map_err(E::msg)?;
-        let vb =
-            unsafe { VarBuilder::from_mmaped_safetensors(&[&self.model], DType::F32, &device)? };
+        let tokenizer = tokenizers::Tokenizer::from_file(tokenizer).map_err(E::msg)?;
+        let vb = unsafe { VarBuilder::from_mmaped_safetensors(&[model], DType::F32, &device)? };
         let model = BertModel::new(vb, &config)?;
         Ok((model, tokenizer))
     }
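
The average-pooling plus cosine-similarity scheme described in the new README can be sketched in plain, dependency-free Rust. This is an illustration only: `mean_pool` and `cosine` are hypothetical helpers (not part of the candle or example API), and the toy 3-dimensional token vectors stand in for the 768-dimensional embeddings Jina-Bert actually produces.

```rust
// Mean-pool per-token embeddings into a single sentence vector,
// then compare two sentences with cosine similarity.

fn mean_pool(tokens: &[Vec<f32>]) -> Vec<f32> {
    let dim = tokens[0].len();
    let mut out = vec![0.0f32; dim];
    for t in tokens {
        for (o, v) in out.iter_mut().zip(t) {
            *o += v;
        }
    }
    for o in out.iter_mut() {
        *o /= tokens.len() as f32;
    }
    out
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    // Two "sentences" of two toy token embeddings each.
    let s1 = vec![vec![1.0, 0.0, 1.0], vec![0.8, 0.2, 1.0]];
    let s2 = vec![vec![0.9, 0.1, 1.1], vec![1.0, 0.0, 0.9]];
    let (e1, e2) = (mean_pool(&s1), mean_pool(&s2));
    // These toy vectors are nearly parallel, so the score is close to 1.
    println!("score: {:.2}", cosine(&e1, &e2));
}
```

In the real example, sorting these pairwise scores in decreasing order yields the ranked output shown above.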