candle-yolo-v8: Object Detection and Pose Estimation

This is a port of Ultralytics YOLOv8. The implementation is based on the tinygrad version and on the model architecture described in this issue. The supported tasks are object detection and pose estimation.

You can try this model online on the Candle YOLOv8 Space. The model then runs entirely in your browser using WebAssembly; if you use a custom image, it never leaves your phone or computer.

Running some examples

Object Detection

cargo run --example yolo-v8 --release -- candle-examples/examples/yolo-v8/assets/bike.jpg

This prints details about the detected objects and generates a bike.pp.jpg file.

Leading group, Giro d'Italia 2021

Image source: wikimedia.

Pose Estimation

cargo run --example yolo-v8 --release -- \
  candle-examples/examples/yolo-v8/assets/bike.jpg --task pose

Command-line flags

  • --which: select the model variant to use: n, s, m, l, or x, in order of increasing size and quality.
  • --task: detect for object detection and pose for pose estimation.
  • --legend-size: the size of the characters printed in the legend.
  • --model: use a local model file rather than downloading it from the hub.
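The flags above can be combined in a single invocation. The following is a minimal sketch, assuming flags may follow the image path as in the pose estimation command; the variable names IMAGE, WHICH, TASK, and CMD are illustrative and not part of the example itself. It only assembles and prints the command so it can be inspected before being run.

```shell
# Illustrative variables (not defined by the yolo-v8 example itself).
IMAGE="candle-examples/examples/yolo-v8/assets/bike.jpg"
WHICH="m"      # model variant: n, s, m, l, or x
TASK="pose"    # detect or pose

# Assemble the command; flags are placed after the image path,
# mirroring the pose estimation command shown earlier.
CMD="cargo run --example yolo-v8 --release -- $IMAGE --which $WHICH --task $TASK"

# Print the command for inspection; paste it into a shell to run it.
echo "$CMD"
```

Changing WHICH to a larger variant such as l or x should trade speed for quality, per the description of --which.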