From e82fcf1c594b54c105f1a3979a09f3d2e044a2e0 Mon Sep 17 00:00:00 2001
From: Laurent Mazare
Date: Tue, 12 Sep 2023 18:21:24 +0200
Subject: [PATCH] Add more example readmes. (#828)

* Add more readmes.

* Add a readme for dinov2.

* Add some skeleton files for a couple more examples.

* More whisper details.
---
 candle-examples/examples/bert/README.md      | 44 ++++++++++++++++++++
 candle-examples/examples/bigcode/README.md   |  7 ++++
 candle-examples/examples/dinov2/README.md    | 19 +++++++++
 candle-examples/examples/falcon/README.md    |  3 ++
 candle-examples/examples/quantized/README.md |  2 +-
 candle-examples/examples/whisper/README.md   | 39 +++++++++++++++++
 6 files changed, 113 insertions(+), 1 deletion(-)
 create mode 100644 candle-examples/examples/bert/README.md
 create mode 100644 candle-examples/examples/bigcode/README.md
 create mode 100644 candle-examples/examples/dinov2/README.md
 create mode 100644 candle-examples/examples/falcon/README.md
 create mode 100644 candle-examples/examples/whisper/README.md

diff --git a/candle-examples/examples/bert/README.md b/candle-examples/examples/bert/README.md
new file mode 100644
index 00000000..82ca5f40
--- /dev/null
+++ b/candle-examples/examples/bert/README.md
@@ -0,0 +1,44 @@
+# candle-bert
+
+Bert is a general large language model. In this example, it can be used for two
+different tasks:
+- Compute sentence embeddings for a prompt.
+- Compute similarities between a set of sentences.
+
+
+## Sentence embeddings
+
+Bert is used to compute the sentence embeddings for a prompt. The model weights
+are downloaded from the hub on the first run.
+
+```bash
+cargo run --example bert --release -- --prompt "Here is a test sentence"
+
+> [[[ 0.0798, -0.0665, -0.0247, ..., -0.1082, -0.1000, -0.2751],
+>   [ 0.4218, 0.2690, 0.2740, ..., 0.3889, 1.3503, 0.9908],
+>   [ 0.0466, 0.3041, -0.1143, ..., 0.4427, 0.6926, -0.1515],
+>   ...
+>   [ 0.3396, 0.4320, -0.4408, ..., 0.9212, 0.2331, -0.6777],
+>   [ 0.2789, 0.7539, 0.4306, ..., -0.0095, 0.3375, -1.7529],
+>   [ 0.6737, 0.7882, 0.0548, ..., 0.1836, 0.7299, -0.6617]]]
+> Tensor[[1, 7, 384], f32]
+```
+
+## Similarities
+
+In this example, Bert is used to compute the sentence embeddings for a set of
+sentences (hardcoded in the example). Cosine similarities are then computed for
+each sentence pair and reported in decreasing order, so the first reported pair
+contains the two sentences with the highest similarity score. The sentence
+embeddings are computed by average pooling over all the sentence tokens,
+including any potential padding.
+
+```bash
+cargo run --example bert --release
+
+> score: 0.85 'The new movie is awesome' 'The new movie is so great'
+> score: 0.61 'The cat sits outside' 'The cat plays in the garden'
+> score: 0.52 'I love pasta' 'Do you like pizza?'
+> score: 0.23 'The new movie is awesome' 'Do you like pizza?'
+> score: 0.22 'I love pasta' 'The new movie is awesome'
+```
diff --git a/candle-examples/examples/bigcode/README.md b/candle-examples/examples/bigcode/README.md
new file mode 100644
index 00000000..0b593674
--- /dev/null
+++ b/candle-examples/examples/bigcode/README.md
@@ -0,0 +1,7 @@
+# candle-starcoder: code generation model
+
+StarCoder/BigCode is an LLM specialized in code generation.
+
+```bash
+cargo run --example bigcode --release -- --prompt "fn fact(n: u64) -> u64 "
+```
diff --git a/candle-examples/examples/dinov2/README.md b/candle-examples/examples/dinov2/README.md
new file mode 100644
index 00000000..10d4ac1f
--- /dev/null
+++ b/candle-examples/examples/dinov2/README.md
@@ -0,0 +1,19 @@
+# candle-dinov2
+
+[DINOv2](https://github.com/facebookresearch/dinov2) is a computer vision model.
+In this example, it is used as an ImageNet classifier: the model returns the
+probability that the image belongs to each of the 1000 ImageNet categories.
+
+## Running an example
+
+```bash
+cargo run --example dinov2 --release -- --image candle-examples/examples/yolo-v8/assets/bike.jpg
+
+> mountain bike, all-terrain bike, off-roader: 43.67%
+> bicycle-built-for-two, tandem bicycle, tandem: 33.20%
+> crash helmet : 13.23%
+> unicycle, monocycle : 2.44%
+> maillot : 2.42%
+```
+
+![Leading group, Giro d'Italia 2021](../yolo-v8/assets/bike.jpg)
diff --git a/candle-examples/examples/falcon/README.md b/candle-examples/examples/falcon/README.md
new file mode 100644
index 00000000..267c78c2
--- /dev/null
+++ b/candle-examples/examples/falcon/README.md
@@ -0,0 +1,3 @@
+# candle-falcon
+
+Falcon is a general large language model.
diff --git a/candle-examples/examples/quantized/README.md b/candle-examples/examples/quantized/README.md
index f3159493..ee4f3420 100644
--- a/candle-examples/examples/quantized/README.md
+++ b/candle-examples/examples/quantized/README.md
@@ -24,7 +24,7 @@ cargo run --example quantized --release -- --prompt "The best thing about coding
 > The best thing about coding in rust is 1.) that I don’t need to worry about memory leaks, 2.) speed and 3.) my program will compile even on old machines.
 ```
 
-### Command-line flags
+## Command-line flags
 
 Run with `--help` to see all options.
 
diff --git a/candle-examples/examples/whisper/README.md b/candle-examples/examples/whisper/README.md
new file mode 100644
index 00000000..124cd182
--- /dev/null
+++ b/candle-examples/examples/whisper/README.md
@@ -0,0 +1,39 @@
+# candle-whisper: speech recognition
+
+An implementation of [OpenAI Whisper](https://github.com/openai/whisper) using
+candle. Whisper is a general purpose speech recognition model; it can be used to
+convert audio files (in the `.wav` format) to text. Supported features include
+language detection as well as multilingual speech recognition.
+
+## Running an example
+
+If no audio file is passed as input, a [sample
+file](https://huggingface.co/datasets/Narsil/candle-examples/resolve/main/samples_jfk.wav) is automatically downloaded
+from the hub.
+
+```bash
+cargo run --example whisper --release
+
+> No audio file submitted: Downloading https://huggingface.co/datasets/Narsil/candle_demo/blob/main/samples_jfk.wav
+> loaded wav data: Header { audio_format: 1, channel_count: 1, sampling_rate: 16000, bytes_per_second: 32000, bytes_per_sample: 2, bits_per_sample: 16 }
+> pcm data loaded 176000
+> loaded mel: [1, 80, 3000]
+> 0.0s -- 30.0s: And so my fellow Americans ask not what your country can do for you ask what you can do for your country
+```
+
+In order to use the multilingual mode, specify a multilingual model via the
+`--model` flag; see the flag details and the example below.
+
+## Command-line flags
+
+- `--input`: the audio file to be converted to text, in `.wav` format.
+- `--language`: force the language to some specific value rather than being
+  detected, e.g. `en`.
+- `--task`: the task to be performed, either `transcribe` (return the text in
+  the original language) or `translate` (translate the text to English).
+- `--timestamps`: enable the timestamp mode, where timestamps are reported for
+  each recognized audio extract.
+- `--model`: the model to be used. Models that do not end with `-en` are
+  multilingual models; the other ones are English-only models. The supported
+  models are `tiny`, `tiny.en`, `base`, `base.en`, `small`, `small.en`,
+  `medium`, `medium.en`, `large`, and `large-v2`.
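+
+The snippet below is only a sketch of how the flags above can be combined for
+multilingual use: the audio path is a placeholder, and the `fr` value assumes
+the usual two-letter Whisper language codes.
+
+```bash
+# Force French transcription with a multilingual model
+# (replace /path/to/recording.wav with a real audio file).
+cargo run --example whisper --release -- --model medium --language fr --input /path/to/recording.wav
+
+# Translate the recognized speech to English and report timestamps.
+cargo run --example whisper --release -- --model medium --task translate --timestamps --input /path/to/recording.wav
+```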