mirror of
https://github.com/huggingface/candle.git
synced 2025-06-21 04:10:46 +00:00
Add the SigLIP model. (#2515)
* Add the SigLIP model.
* Add more to the forward pass of the vision model.
* Complete the forward pass.
* Add the siglip example.
* Fix.
* Another fix.
* Get everything in place.
* Add a readme.
candle-examples/examples/siglip/README.md
@@ -0,0 +1,24 @@
## SigLIP

SigLIP is a multi-modal text-vision model that improves over CLIP by using a sigmoid-based loss;
see the model card on [HuggingFace](https://huggingface.co/google/siglip-base-patch16-224).

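To make the "sigmoid-based loss" concrete, here is a minimal sketch in plain Rust, not the candle implementation. It assumes a square, non-empty matrix where `logits[i][j]` is the similarity of image `i` and text `j` with any learned temperature and bias already applied. The names `log_sigmoid` and `sigmoid_loss` are hypothetical. Each image-text pair is scored as an independent binary decision (matching pairs labelled `+1`, all others `-1`), whereas CLIP normalises each row with a softmax.

```rust
/// Numerically stable log(sigmoid(x)).
fn log_sigmoid(x: f32) -> f32 {
    // log(sigmoid(x)) = min(x, 0) - ln(1 + exp(-|x|))
    x.min(0.0) - (-x.abs()).exp().ln_1p()
}

/// Pairwise sigmoid loss over an n x n matrix of image-text logits.
/// Assumption: `logits` is square and non-empty, and already includes
/// any temperature scaling and bias.
fn sigmoid_loss(logits: &[Vec<f32>]) -> f32 {
    let n = logits.len();
    let mut total = 0.0f32;
    for (i, row) in logits.iter().enumerate() {
        for (j, &z) in row.iter().enumerate() {
            // +1 for the matching pair on the diagonal, -1 for every other pair.
            let label = if i == j { 1.0 } else { -1.0 };
            total -= log_sigmoid(label * z);
        }
    }
    // Average over the batch size, as in the pairwise formulation.
    total / n as f32
}

fn main() {
    // Hypothetical logits for two images and two texts.
    let logits = vec![vec![4.0, -6.0], vec![-5.0, 3.0]];
    println!("sigmoid loss: {:.4}", sigmoid_loss(&logits));
}
```

Because every pair contributes its own term, the loss needs no normalisation across the whole batch.
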
### Running an example
```
$ cargo run --features cuda -r --example siglip -
softmax_image_vec: [2.1912122e-14, 2.3624872e-14, 1.0, 1.0, 2.4787932e-8, 3.2784535e-12]


Results for image: candle-examples/examples/stable-diffusion/assets/stable-diffusion-xl.jpg

Probability: 0.0000% Text: a cycling race
Probability: 0.0000% Text: a photo of two cats
Probability: 100.0000% Text: a robot holding a candle


Results for image: candle-examples/examples/yolo-v8/assets/bike.jpg

Probability: 100.0000% Text: a cycling race
Probability: 0.0000% Text: a photo of two cats
Probability: 0.0000% Text: a robot holding a candle
```
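
The flat `softmax_image_vec` printed above appears to hold one row of text probabilities per image, in the same order as the per-image results. The sketch below is plain Rust and not part of the example's source. It shows how such a vector could be split and read, assuming a row-major layout and a non-zero number of texts. The helper names `rows_per_image` and `best_text` are hypothetical.

```rust
/// Split a flat, row-major probability vector into one row per image.
/// Assumption: `num_texts` is non-zero (`chunks` panics on a zero size).
fn rows_per_image(flat: &[f32], num_texts: usize) -> Vec<Vec<f32>> {
    flat.chunks(num_texts).map(|row| row.to_vec()).collect()
}

/// Index of the highest-probability text in a row, or None for an empty row.
fn best_text(row: &[f32]) -> Option<usize> {
    row.iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(i, _)| i)
}

fn main() {
    let texts = ["a cycling race", "a photo of two cats", "a robot holding a candle"];
    // Values copied from the sample output above.
    let flat: [f32; 6] = [2.1912122e-14, 2.3624872e-14, 1.0, 1.0, 2.4787932e-8, 3.2784535e-12];
    for (img, row) in rows_per_image(&flat, texts.len()).iter().enumerate() {
        if let Some(best) = best_text(row) {
            println!("image {img}: {} ({:.4}%)", texts[best], 100.0 * row[best]);
        }
    }
}
```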