candle-mimi

Mimi is a state-of-the-art audio compression model using an encoder/decoder architecture with residual vector quantization. The candle implementation supports streaming, meaning that it's possible to encode or decode a stream of audio tokens on the fly to provide low-latency interaction with an audio model.
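
To give a rough idea of what residual vector quantization means, here is a minimal, dependency-free sketch. It is not the candle-mimi implementation: the function names (`nearest`, `rvq_encode`, `rvq_decode`, `toy_codebooks`) and the tiny two-stage codebooks are hypothetical and only illustrate the general technique, in which each stage quantizes whatever error the previous stages left behind.

```rust
/// Index of the codebook entry closest (squared Euclidean distance) to `v`.
fn nearest(codebook: &[Vec<f32>], v: &[f32]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, entry) in codebook.iter().enumerate() {
        let dist: f32 = entry.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best
}

/// Encode: each stage quantizes the residual left by the previous stages
/// and emits one code (an index into that stage's codebook).
fn rvq_encode(codebooks: &[Vec<Vec<f32>>], input: &[f32]) -> Vec<usize> {
    let mut residual = input.to_vec();
    let mut codes = Vec::with_capacity(codebooks.len());
    for codebook in codebooks {
        let idx = nearest(codebook, &residual);
        // Subtract the chosen entry so the next stage only sees the remaining error.
        for (r, c) in residual.iter_mut().zip(&codebook[idx]) {
            *r -= c;
        }
        codes.push(idx);
    }
    codes
}

/// Decode: sum the selected entry of every stage.
fn rvq_decode(codebooks: &[Vec<Vec<f32>>], codes: &[usize]) -> Vec<f32> {
    let dim = codebooks[0][0].len();
    let mut out = vec![0.0; dim];
    for (codebook, &idx) in codebooks.iter().zip(codes) {
        for (o, c) in out.iter_mut().zip(&codebook[idx]) {
            *o += c;
        }
    }
    out
}

/// Toy two-stage codebooks over 2-dimensional vectors.
/// Hypothetical values chosen for illustration; these are not Mimi's codebooks.
fn toy_codebooks() -> Vec<Vec<Vec<f32>>> {
    vec![
        vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![-1.0, 1.0]],
        vec![vec![0.0, 0.0], vec![0.25, 0.0], vec![0.0, -0.25]],
    ]
}
```

With these toy codebooks, encoding the vector `[1.25, 1.0]` picks entry 1 in the first stage (leaving a residual of `[0.25, 0.0]`) and entry 1 in the second, so the vector is represented by the two codes `[1, 1]`. In a real codec the codebooks are learned and applied to each latent frame produced by the encoder.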

Running one example

Generate audio tokens from an audio file:

wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
cargo run --example mimi --features mimi --release -- audio-to-code bria.mp3 bria.safetensors

Then decode the audio tokens back into a sound file:

cargo run --example mimi --features mimi --release -- code-to-audio bria.safetensors bria.wav