Add more example readmes. (#828)

* Add more readmes.

* Add a readme for dinov2.

* Add some skeleton files for a couple more examples.

* More whisper details.
Laurent Mazare
2023-09-12 18:21:24 +02:00
committed by GitHub
parent 805bf9ffa7
commit e82fcf1c59
6 changed files with 113 additions and 1 deletions

@ -0,0 +1,44 @@
# candle-bert
Bert is a general large language model. In this example it can be used for two
different tasks:
- Compute sentence embeddings for a prompt.
- Compute similarities between a set of sentences.
## Sentence embeddings
Bert is used to compute the sentence embeddings for a prompt. The model weights
are downloaded from the hub on the first run.
```bash
cargo run --example bert --release -- --prompt "Here is a test sentence"
> [[[ 0.0798, -0.0665, -0.0247, ..., -0.1082, -0.1000, -0.2751],
> [ 0.4218, 0.2690, 0.2740, ..., 0.3889, 1.3503, 0.9908],
> [ 0.0466, 0.3041, -0.1143, ..., 0.4427, 0.6926, -0.1515],
> ...
> [ 0.3396, 0.4320, -0.4408, ..., 0.9212, 0.2331, -0.6777],
> [ 0.2789, 0.7539, 0.4306, ..., -0.0095, 0.3375, -1.7529],
> [ 0.6737, 0.7882, 0.0548, ..., 0.1836, 0.7299, -0.6617]]]
> Tensor[[1, 7, 384], f32]
```
## Similarities
In this example, Bert is used to compute the sentence embeddings for a set of
sentences (hardcoded in the example). Cosine similarities are then computed for
each sentence pair and reported in decreasing order, so the first reported pair
contains the two sentences with the highest similarity score.
The sentence embeddings are computed using average pooling over all the
sentence tokens, including any padding tokens.
```bash
cargo run --example bert --release
> score: 0.85 'The new movie is awesome' 'The new movie is so great'
> score: 0.61 'The cat sits outside' 'The cat plays in the garden'
> score: 0.52 'I love pasta' 'Do you like pizza?'
> score: 0.23 'The new movie is awesome' 'Do you like pizza?'
> score: 0.22 'I love pasta' 'The new movie is awesome'
```
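The pooling and ranking steps described above can be sketched in plain Rust. This is a minimal illustration using `Vec<f32>` rather than candle tensors, not the code of the example itself; the function names `mean_pool`, `cosine_similarity`, and `ranked_pairs` are hypothetical.

```rust
/// Average the token embeddings of one sentence into a single vector.
/// `tokens` holds one embedding per token; all rows must share a length.
fn mean_pool(tokens: &[Vec<f32>]) -> Vec<f32> {
    if tokens.is_empty() {
        return Vec::new();
    }
    let dim = tokens[0].len();
    let mut pooled = vec![0f32; dim];
    for token in tokens {
        for (acc, value) in pooled.iter_mut().zip(token) {
            *acc += *value;
        }
    }
    let n = tokens.len() as f32;
    pooled.iter_mut().for_each(|v| *v /= n);
    pooled
}

/// Cosine similarity: dot(a, b) / (|a| * |b|).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

/// Score every sentence pair and sort by decreasing similarity.
/// Each entry is (score, index of first sentence, index of second sentence).
fn ranked_pairs(embeddings: &[Vec<f32>]) -> Vec<(f32, usize, usize)> {
    let mut pairs = Vec::new();
    for i in 0..embeddings.len() {
        for j in (i + 1)..embeddings.len() {
            pairs.push((cosine_similarity(&embeddings[i], &embeddings[j]), i, j));
        }
    }
    pairs.sort_by(|x, y| y.0.total_cmp(&x.0));
    pairs
}
```

Because the average runs over every token position, padding tokens contribute to the pooled vector just like real tokens, which is the caveat noted above.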

@ -0,0 +1,7 @@
# candle-starcoder: code generation model
StarCoder/BigCode is an LLM specialized for code generation.
```bash
cargo run --example bigcode --release -- --prompt "fn fact(n: u64) -> u64 "
```

@ -0,0 +1,19 @@
# candle-dinov2
[DINOv2](https://github.com/facebookresearch/dinov2) is a computer vision model.
In this example, it is used as an ImageNet classifier: the model returns the
probability for the image to belong to each of the 1000 ImageNet categories.
## Running an example
```bash
cargo run --example dinov2 --release -- --image candle-examples/examples/yolo-v8/assets/bike.jpg
> mountain bike, all-terrain bike, off-roader: 43.67%
> bicycle-built-for-two, tandem bicycle, tandem: 33.20%
> crash helmet : 13.23%
> unicycle, monocycle : 2.44%
> maillot : 2.42%
```
![Leading group, Giro d'Italia 2021](../yolo-v8/assets/bike.jpg)
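The percentages above are per-class probabilities with the five largest reported. A minimal sketch of that post-processing step, assuming the classifier head produces raw scores (logits) that are turned into probabilities with a softmax; the helper names `softmax` and `top_k` are hypothetical and this uses plain slices rather than candle tensors:

```rust
/// Turn raw classifier scores (logits) into probabilities that sum to 1.
fn softmax(logits: &[f32]) -> Vec<f32> {
    // Subtract the maximum before exponentiating for numerical stability.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// Return the `k` most probable (class index, probability) pairs, best first.
fn top_k(probs: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = probs.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}
```

With 1000 logits as input, `top_k(&softmax(&logits), 5)` would give the five class indices to look up in the ImageNet label list.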

@ -0,0 +1,3 @@
# candle-falcon
Falcon is a general large language model.

@ -24,7 +24,7 @@ cargo run --example quantized --release -- --prompt "The best thing about coding
> The best thing about coding in rust is 1.) that I dont need to worry about memory leaks, 2.) speed and 3.) my program will compile even on old machines.
``` ```
-### Command-line flags
+## Command-line flags
Run with `--help` to see all options.

@ -0,0 +1,39 @@
# candle-whisper: speech recognition
An implementation of [OpenAI Whisper](https://github.com/openai/whisper) using
candle. Whisper is a general-purpose speech recognition model that can be used to
convert audio files (in the `.wav` format) to text. Supported features include
language detection as well as multilingual speech recognition.
## Running an example
If no audio file is passed as input, a [sample
file](https://huggingface.co/datasets/Narsil/candle-examples/resolve/main/samples_jfk.wav) is automatically downloaded
from the hub.
```bash
cargo run --example whisper --release
> No audio file submitted: Downloading https://huggingface.co/datasets/Narsil/candle_demo/blob/main/samples_jfk.wav
> loaded wav data: Header { audio_format: 1, channel_count: 1, sampling_rate: 16000, bytes_per_second: 32000, bytes_per_sample: 2, bits_per_sample: 16 }
> pcm data loaded 176000
> loaded mel: [1, 80, 3000]
> 0.0s -- 30.0s: And so my fellow Americans ask not what your country can do for you ask what you can do for your country
```
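The numbers in this log can be related to each other with a little arithmetic. The sketch below takes the sample rate and sample count from the log; the 160-sample hop and the 30-second window are assumptions based on the reference Whisper audio settings, not values read from this example's code.

```rust
const SAMPLE_RATE: usize = 16_000; // `sampling_rate` in the wav header above
const HOP_LENGTH: usize = 160; // assumption: 10 ms hop between mel frames
const CHUNK_SECONDS: usize = 30; // assumption: fixed 30 s processing window

/// Duration of a mono clip, in seconds, given its number of PCM samples.
fn duration_seconds(samples: usize) -> f64 {
    samples as f64 / SAMPLE_RATE as f64
}

/// Number of mel frames covering one full processing window.
fn frames_per_chunk() -> usize {
    CHUNK_SECONDS * SAMPLE_RATE / HOP_LENGTH
}
```

Under these assumptions the 176000 samples correspond to an 11 second clip, and a full window holds 3000 frames, which matches the last dimension of the `[1, 80, 3000]` mel tensor and the `0.0s -- 30.0s` segment in the output.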
To use the multilingual mode, specify a multilingual model via the
`--model` flag; see the details below.
## Command line flags
- `--input`: the audio file to be converted to text, in wav format.
- `--language`: force the language to some specific value rather than being
detected, e.g. `en`.
- `--task`: the task to be performed, can be `transcribe` (return the text data
in the original language) or `translate` (translate the text to English).
- `--timestamps`: enable the timestamp mode, where timestamps are reported
for each recognized audio extract.
- `--model`: the model to be used. Models whose names do not end with `.en` are
multilingual; the others are English-only. The supported models
are `tiny`, `tiny.en`, `base`, `base.en`, `small`, `small.en`, `medium`,
`medium.en`, `large`, and `large-v2`.
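The naming rule for `--model` can be expressed as a small check. This is an illustrative sketch only: `SUPPORTED_MODELS` and `is_multilingual` are hypothetical names, and it assumes the `.en` suffix seen in the list above is what marks an English-only model.

```rust
/// Model names accepted by `--model`, as listed above.
const SUPPORTED_MODELS: [&str; 10] = [
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large", "large-v2",
];

/// A model is treated as multilingual unless its name ends with `.en`.
fn is_multilingual(model: &str) -> bool {
    !model.ends_with(".en")
}

/// The subset of supported models usable with the multilingual mode.
fn multilingual_models() -> Vec<&'static str> {
    SUPPORTED_MODELS
        .iter()
        .copied()
        .filter(|m| is_multilingual(m))
        .collect()
}
```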