mirror of
https://github.com/huggingface/candle.git
synced 2025-06-18 11:37:11 +00:00

* Support embedding model gte-Qwen1.5-7B-instruct. This is a text embedding model based on Qwen2. The two models share the same architecture except for the last MLP module. This commit makes minimal modifications to the existing Qwen2 implementation to support both models. An example is provided and has been verified against the official PyTorch implementation.
* Avoid doing the 'last-token filtering' based on the absence of an attention mask.

Co-authored-by: Laurent <laurent.mazare@gmail.com>
# gte-Qwen1.5-7B-instruct

gte-Qwen1.5-7B-instruct is a variant of the GTE embedding model family.

- [Model card](https://huggingface.co/Alibaba-NLP/gte-Qwen1.5-7B-instruct) on the HuggingFace Hub.
- [Technical report](https://arxiv.org/abs/2308.03281) *Towards General Text Embeddings with Multi-stage Contrastive Learning*

## Running the example

Automatically download the model from the HuggingFace hub:

```bash
cargo run --example gte-qwen --release
```

Alternatively, load the model from a local directory (this invocation also enables the `cuda` feature):

```bash
cargo run --example gte-qwen --release --features cuda -- --local-repo /path/to/gte_Qwen1.5-7B-instruct/
```
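
As background on what an embedding model of this kind typically computes, the sketch below illustrates last-token pooling followed by L2 normalization: the hidden state of the last attended (non-padding) token is taken as the sentence embedding and scaled to unit length. It is plain Rust using only the standard library and is not the candle implementation; the function names `last_token_pool` and `l2_normalize`, the toy hidden states, and the mask layout are all assumptions made for illustration.

```rust
/// Return the hidden state of the last attended token.
/// `hidden` holds one vector per token position; `mask` holds 1 for real
/// tokens and 0 for padding. Returns `None` if no token is attended.
/// (Hypothetical helper, not part of candle.)
fn last_token_pool(hidden: &[Vec<f32>], mask: &[u8]) -> Option<Vec<f32>> {
    // Search from the end so that both left- and right-padded inputs work.
    let last = mask.iter().rposition(|&m| m != 0)?;
    hidden.get(last).cloned()
}

/// Scale a vector to unit L2 norm; an all-zero vector is returned unchanged.
/// (Hypothetical helper, not part of candle.)
fn l2_normalize(v: &[f32]) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}

fn main() {
    // Toy input: 4 token positions, hidden size 2; the last position is padding.
    let hidden = vec![
        vec![1.0, 0.0],
        vec![0.0, 2.0],
        vec![3.0, 4.0],
        vec![9.0, 9.0],
    ];
    let mask = [1u8, 1, 1, 0];

    match last_token_pool(&hidden, &mask) {
        Some(pooled) => println!("embedding: {:?}", l2_normalize(&pooled)),
        None => println!("no attended tokens"),
    }
}
```

With unit-length embeddings, the dot product of two embeddings equals their cosine similarity, which is why normalization is commonly applied before comparing texts.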