
candle-mimi
Mimi is a state-of-the-art audio compression model using an encoder/decoder architecture with residual vector quantization. The candle implementation supports streaming, meaning that it can encode audio into tokens, or decode a stream of audio tokens back into audio, on the fly to provide low-latency interaction with an audio model.
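To give a rough idea of what residual vector quantization means, here is a minimal, self-contained sketch in plain Rust. It is not the Mimi or candle code: the function names (`nearest`, `rvq_encode`, `rvq_decode`) and the tiny hand-written codebooks mentioned below are hypothetical and chosen only for illustration. Each stage picks the closest codebook entry to whatever the previous stages failed to capture, so a vector ends up represented by one index per codebook.

```rust
/// Index of the codebook entry closest to `x` (squared Euclidean distance).
/// Hypothetical helper for illustration, not part of candle.
fn nearest(codebook: &[Vec<f32>], x: &[f32]) -> usize {
    let mut best = 0;
    let mut best_dist = f32::INFINITY;
    for (i, entry) in codebook.iter().enumerate() {
        let dist: f32 = entry.iter().zip(x).map(|(e, v)| (e - v) * (e - v)).sum();
        if dist < best_dist {
            best = i;
            best_dist = dist;
        }
    }
    best
}

/// Encode `x` as one index per codebook. Each stage quantizes the residual
/// left over by the previous stages, then subtracts the chosen entry.
fn rvq_encode(codebooks: &[Vec<Vec<f32>>], x: &[f32]) -> Vec<usize> {
    let mut residual = x.to_vec();
    let mut codes = Vec::with_capacity(codebooks.len());
    for codebook in codebooks {
        let idx = nearest(codebook, &residual);
        for (r, e) in residual.iter_mut().zip(&codebook[idx]) {
            *r -= e;
        }
        codes.push(idx);
    }
    codes
}

/// Decode by summing the selected entry of every codebook.
fn rvq_decode(codebooks: &[Vec<Vec<f32>>], codes: &[usize], dim: usize) -> Vec<f32> {
    let mut out = vec![0.0; dim];
    for (codebook, &idx) in codebooks.iter().zip(codes) {
        for (o, e) in out.iter_mut().zip(&codebook[idx]) {
            *o += e;
        }
    }
    out
}
```

For instance, with a coarse first codebook `[[0, 0], [1, 1], [2, 0]]` and a finer second codebook `[[0, 0], [0.25, -0.25], [-0.25, 0.25]]`, the vector `[1.25, 0.75]` is encoded as the pair of indices `[1, 1]`: the first stage selects `[1, 1]`, and the second stage selects `[0.25, -0.25]` to cover the remaining residual. In a real model the codebooks are learned and the vectors are latent frames produced by the encoder.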
Running one example
Generating some audio tokens from an audio file.
wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
cargo run --example mimi --features mimi --release -- audio-to-code bria.mp3 bria.safetensors
And decoding the audio tokens back into a sound file.
cargo run --example mimi --features mimi --release -- code-to-audio bria.safetensors bria.wav