mirror of https://github.com/huggingface/candle.git synced 2025-06-15 18:28:24 +00:00

Files

shua 6056fd5c90 onnx: fix pad, unsqueeze (#2317)

* onnx: fix pad, unsqueeze

both implementations have off-by-one errors:
- Pad 'reflect' cycle for e.g. `dim==3` is `[0,1,2,1]`, which has a length of
  4 (`dim*2 - 2`), not 5 (current code: `dim*2 - 1`)
- Unsqueeze(-1) for a tensor with `dim==3` should resolve to axis 3
  (i.e. `dim+index+1`), not 2 (i.e. currently `dim+index`)
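
As a rough illustration of the two off-by-one fixes listed above, the sketch below computes the reflect cycle length and resolves a negative Unsqueeze axis. The function names `reflect_cycle_len` and `resolve_unsqueeze_axis` are hypothetical and are not taken from the candle source; the handling of `dim <= 1` is also an assumption.

```rust
/// Length of the repeating 'reflect' index cycle for an axis of size `dim`.
/// Hypothetical helper: for dim == 3 the cycle is [0, 1, 2, 1], so length 4.
fn reflect_cycle_len(dim: usize) -> usize {
    if dim <= 1 {
        // Assumption: a degenerate axis repeats its only index (or is empty).
        dim
    } else {
        dim * 2 - 2
    }
}

/// Resolve a possibly negative Unsqueeze axis for an input of the given rank.
/// The output has `rank + 1` dimensions, so a negative axis is offset by
/// `rank + 1`, which is the `dim + index + 1` from the list above.
fn resolve_unsqueeze_axis(axis: i64, rank: usize) -> Option<usize> {
    let out_rank = rank as i64 + 1;
    let resolved = if axis < 0 { axis + out_rank } else { axis };
    if (0..out_rank).contains(&resolved) {
        Some(resolved as usize)
    } else {
        None // axis out of range for the unsqueezed shape
    }
}
```

With these definitions, `reflect_cycle_len(3)` gives `4` and `resolve_unsqueeze_axis(-1, 3)` gives `Some(3)`.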

In addition, Pad incorrectly calculates the starting offset into the cycle.
If we want to pad 2 elements at the start, and the cycle of indices has
length 6, then we should skip 4 elements (6 - 2), but currently we skip 2.
A more visual representation of what's going on is below:

```
pad_start: 2
data:      [a,b,c,d]
indices:   [0, 1, 2, 3, 2, 1, 0, 1, 2, 3, 2, 1, 0, ..] // zigzag between 0..4
actual:    skip [ c  d| c  b  a  b]
expected:  ~  skip  ~ [ c  b| a  b  c  d]
```

The values between `[` and `|` are padding and the values between
`|` and `]` in the example should match the original data being padded.
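
The expected behaviour in the diagram can be sketched as follows. This is a minimal illustration, not the code from the commit: `reflect_pad_indices` is a hypothetical helper, and the modulo handling for a `pad_start` larger than the cycle is an assumption.

```rust
/// Indices into the original axis for a reflect-padded output of length
/// `pad_start + dim + pad_end`. Hypothetical helper, not the candle code.
fn reflect_pad_indices(dim: usize, pad_start: usize, pad_end: usize) -> Vec<usize> {
    assert!(dim > 0, "cannot pad an empty axis");
    if dim == 1 {
        // Assumption: a single element is simply repeated.
        return vec![0; pad_start + 1 + pad_end];
    }
    // One zigzag period, e.g. [0, 1, 2, 3, 2, 1] for dim == 4.
    let cycle: Vec<usize> = (0..dim).chain((1..dim - 1).rev()).collect();
    let cycle_len = cycle.len(); // equals dim * 2 - 2
    // Start `pad_start` positions *before* the start of a period, so that the
    // original data 0..dim follows directly after the leading padding.
    let skip = (cycle_len - pad_start % cycle_len) % cycle_len;
    let indices: Vec<usize> = cycle
        .iter()
        .copied()
        .cycle()
        .skip(skip)
        .take(pad_start + dim + pad_end)
        .collect();
    indices
}
```

For the example above, `reflect_pad_indices(4, 2, 0)` gives `[2, 1, 0, 1, 2, 3]`, which maps onto `[c, b | a, b, c, d]`.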

* Fix clippy lints.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>

2024-07-23 23:10:57 +02:00

assets

[segment-anything] Support multi-point as the prompt input (#945)

2023-09-25 12:14:10 +01:00

main.rs

onnx: fix pad, unsqueeze (#2317)

2024-07-23 23:10:57 +02:00

README.md

Use a single flag for the point argument. (#958)

2023-09-25 12:53:24 +01:00

README.md

candle-segment-anything: Segment-Anything Model

This example is based on Meta AI's Segment-Anything Model. The model provides a robust and fast image segmentation pipeline that can be steered via prompts: points that must be in the target mask, points that belong to the background (and so must not be in the target mask), or a bounding box.

The default backbone can be replaced by the smaller and faster TinyViT model based on MobileSAM.

Running an example:

cargo run --example segment-anything --release -- \
    --image candle-examples/examples/yolo-v8/assets/bike.jpg \
    --use-tiny \
    --point 0.6,0.6 --point 0.6,0.55

Running this command generates a sam_merged.jpg file containing the original image with a blue overlay of the selected mask. The red dots represent the prompt specified by --point 0.6,0.6 --point 0.6,0.55; these points are assumed to be part of the target mask.

The values used for --point should be a comma-delimited pair of float values. They are relative to the image dimensions, i.e. use 0.5,0.5 for the image center.
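
To make the relative coordinates concrete, here is a small sketch of how such a value could be parsed and scaled to pixels. `parse_point` and `to_pixels` are hypothetical helpers, not functions from the example's `main.rs`, and the (x, y) ordering (horizontal first) is an assumption.

```rust
/// Parse a `--point`-style value such as "0.6,0.55" into a relative (x, y) pair.
/// Hypothetical helper; returns None if the input is not two floats.
fn parse_point(s: &str) -> Option<(f64, f64)> {
    let (x, y) = s.split_once(',')?;
    let x: f64 = x.trim().parse().ok()?;
    let y: f64 = y.trim().parse().ok()?;
    Some((x, y))
}

/// Scale a relative point to pixel coordinates for a `width` x `height` image.
/// Assumption: the first value is horizontal, the second vertical.
fn to_pixels(point: (f64, f64), width: u32, height: u32) -> (f64, f64) {
    (point.0 * f64::from(width), point.1 * f64::from(height))
}
```

Under these assumptions, the relative point `0.5,0.5` on a 640x480 image lands on pixel (320, 240), the image center.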

Original image:

Segment results by prompting with a single point --point 0.6,0.55:

Segment results by prompting with multiple points --point 0.6,0.6 --point 0.6,0.55:

Command-line flags

--use-tiny: use the TinyViT based MobileSAM backbone rather than the default one.
--point: specifies the location of the target points.
--threshold: sets the threshold a value must exceed to be part of the mask. A negative value results in a larger mask and can be specified via --threshold=-1.2.
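
The effect of --threshold can be illustrated with a minimal sketch. It assumes the model produces one score per pixel and that a pixel belongs to the mask when its score exceeds the threshold; `mask_from_scores` is a hypothetical helper and the exact comparison used by the example may differ.

```rust
/// Turn per-pixel scores into a boolean mask.
/// Assumption: a pixel is in the mask when its score exceeds `threshold`.
fn mask_from_scores(scores: &[f32], threshold: f32) -> Vec<bool> {
    scores.iter().map(|&s| s > threshold).collect()
}
```

With scores `[-2.0, -1.0, 0.5]`, a threshold of `0.0` keeps only the last pixel, while `-1.2` also keeps the second one, which is why a negative threshold yields a larger mask.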