mirror of https://github.com/huggingface/candle.git synced 2025-06-14 09:57:10 +00:00

Laurent Mazare c58c5d5b01 Add the mimi audio-tokenizer. (#2488)

* Add the mimi audio-tokenizer.

* Formatting tweaks.

* Add a full example.

* Use the transformers names.

* More renamings.

* Get encoding and decoding to work.

* Clippy fixes.

2024-09-20 14:31:20 -06:00

candle-mimi

Mimi is a state-of-the-art audio compression model that uses an encoder/decoder architecture with residual vector quantization. The candle implementation supports streaming, so a stream of audio tokens can be encoded or decoded on the fly, enabling low-latency interaction with an audio model.
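To make the term residual vector quantization more concrete, here is a minimal sketch in plain Rust. It is not the candle-mimi implementation: the function names (`nearest`, `rvq_encode`, `rvq_decode`) and the nested-`Vec` codebook layout are assumptions chosen for illustration. Each stage picks the closest codebook entry for whatever the previous stages left unexplained, and decoding sums the selected entries.

```rust
/// Index of the codebook entry closest to `x` (squared Euclidean distance).
/// Hypothetical helper for illustration, not part of candle.
fn nearest(codebook: &[Vec<f32>], x: &[f32]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, entry) in codebook.iter().enumerate() {
        let dist: f32 = entry.iter().zip(x).map(|(a, b)| (a - b) * (a - b)).sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best
}

/// Encode `x` as one index per codebook. Every stage quantizes the
/// residual that the earlier stages did not capture.
fn rvq_encode(codebooks: &[Vec<Vec<f32>>], x: &[f32]) -> Vec<usize> {
    let mut residual = x.to_vec();
    let mut codes = Vec::with_capacity(codebooks.len());
    for cb in codebooks {
        let idx = nearest(cb, &residual);
        // Subtract the chosen entry so the next stage sees what is left.
        for (r, c) in residual.iter_mut().zip(&cb[idx]) {
            *r -= c;
        }
        codes.push(idx);
    }
    codes
}

/// Decode by summing the selected entry from each codebook.
/// Assumes at least one non-empty codebook.
fn rvq_decode(codebooks: &[Vec<Vec<f32>>], codes: &[usize]) -> Vec<f32> {
    let dim = codebooks[0][0].len();
    let mut out = vec![0.0f32; dim];
    for (cb, &idx) in codebooks.iter().zip(codes) {
        for (o, c) in out.iter_mut().zip(&cb[idx]) {
            *o += c;
        }
    }
    out
}
```

With a coarse first codebook and a finer second one, the first index captures the bulk of the vector and the second corrects the remaining error, which is why adding stages improves reconstruction at the cost of more tokens per frame.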

Running one example

Generate audio tokens from an audio file:

wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
cargo run --example mimi --features mimi --release -- audio-to-code bria.mp3 bria.safetensors

Then decode the audio tokens back into a sound file:

cargo run --example mimi --features mimi --release -- code-to-audio bria.safetensors bria.wav
