Mention TrOCR in the readmes. (#1691)

2025-06-15 18:28:24 +00:00 · 2024-02-10 15:49:38 +01:00
parent bf20cc854c
commit 27ffd644a9
2 changed files with 10 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -112,9 +112,10 @@ We also provide a some command line based examples using state of the art models
  evaluation, segmentation).
 - [VGG](./candle-examples/examples/vgg/),
  [RepVGG](./candle-examples/examples/repvgg): computer vision models.
- [BLIP](./candle-examples/examples/blip/): image to text model, can be used to
 - [BLIP](./candle-examples/examples/blip/): image to text model, can be used to
  generate captions for an image.
+- [TrOCR](./candle-examples/examples/trocr/): a transformer OCR model, with
+  dedicated submodels for hand-writing and printed recognition.
 - [Marian-MT](./candle-examples/examples/marian-mt/): neural machine translation
  model, generates the translated text from the input text.

@@ -207,6 +208,7 @@ If you have an addition to this list, please submit a pull request.
        - Wurstchen v2.
    - Image to text.
        - BLIP.
+        - TrOCR.
    - Computer Vision Models.
        - DINOv2, ConvMixer, EfficientNet, ResNet, ViT, VGG, RepVGG, ConvNeXT.
        - yolo-v3, yolo-v8.
--- a/candle-examples/examples/trocr/readme.md
+++ b/candle-examples/examples/trocr/readme.md
@@ -5,10 +5,16 @@ transcribe image text. See the associated [model
 card](https://huggingface.co/microsoft/trocr-base-printed) for details on
 the model itself.

+Supported models include:
+- `--which base`: small handwritten OCR model.
+- `--which large`: large handwritten OCR model.
+- `--which base-printed`: small printed OCR model.
+- `--which large-printed`: large printed OCR model.
+
 ## Running an example

 ```bash
-cargo run --example trocr --release --  --which base --cpu --image candle-examples/examples/trocr/assets/trocr.png
+cargo run --example trocr --release --  --image candle-examples/examples/trocr/assets/trocr.png
 ```

 ```