mirror of https://github.com/huggingface/candle.git synced 2025-06-16 18:48:51 +00:00


Laurent Mazare d01207dbf3 Add a RotatingKVCache. (#2493)

* Add a RotatingKVCache.

* Add some KvCache tests.

* Test the reset too.

* More kv-cache testing.

* More tests for the rotating kv-cache.

* Improve the api for the rotating cache so that the whole src tensor gets returned when it's overlarge.

* Handle contiguity + bugfix + use in mimi.

* Add a way to test the mimi streaming mode.

* Mimi streaming fixes.

* More rotating kv-cache.

* Fix the attn mask generation.

* Handle the abs case.

* Add some tests for the generated mask.

2024-09-23 13:14:32 +02:00

audio_io.rs

Add the mimi audio-tokenizer. (#2488)

2024-09-20 14:31:20 -06:00

main.rs

Add a RotatingKVCache. (#2493)

2024-09-23 13:14:32 +02:00

README.md

Add the mimi audio-tokenizer. (#2488)

2024-09-20 14:31:20 -06:00

README.md

candle-mimi

Mimi is a state-of-the-art audio compression model using an encoder/decoder architecture with residual vector quantization. The candle implementation supports streaming, meaning that it's possible to encode or decode a stream of audio tokens on the fly to provide low-latency interaction with an audio model.
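
To make the term "residual vector quantization" concrete, here is a minimal sketch in plain Rust with no dependencies. It is not the candle or Mimi implementation: the helper names (`sq_dist`, `nearest`, `rvq_encode`, `rvq_decode`, `toy_codebooks`) and the tiny hand-written codebooks are assumptions made for illustration only. It shows only the core idea: each stage picks the codebook entry nearest to what the previous stages failed to capture, and decoding sums the selected entries.

```rust
/// Squared Euclidean distance between two vectors of equal length.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Index of the codebook entry closest to `v`.
fn nearest(codebook: &[Vec<f32>], v: &[f32]) -> usize {
    let mut best = 0;
    let mut best_d = f32::INFINITY;
    for (i, entry) in codebook.iter().enumerate() {
        let d = sq_dist(entry, v);
        if d < best_d {
            best = i;
            best_d = d;
        }
    }
    best
}

/// Encode `v` into one code per stage: every stage quantizes the residual
/// left over by the stages before it.
fn rvq_encode(codebooks: &[Vec<Vec<f32>>], v: &[f32]) -> Vec<usize> {
    let mut residual = v.to_vec();
    let mut codes = Vec::with_capacity(codebooks.len());
    for cb in codebooks {
        let idx = nearest(cb, &residual);
        // Subtract the chosen entry so the next stage sees what is left.
        for (r, c) in residual.iter_mut().zip(&cb[idx]) {
            *r -= c;
        }
        codes.push(idx);
    }
    codes
}

/// Decode by summing the selected entry of every stage.
fn rvq_decode(codebooks: &[Vec<Vec<f32>>], codes: &[usize]) -> Vec<f32> {
    let dim = codebooks[0][0].len();
    let mut out = vec![0.0f32; dim];
    for (cb, &idx) in codebooks.iter().zip(codes) {
        for (o, c) in out.iter_mut().zip(&cb[idx]) {
            *o += c;
        }
    }
    out
}

/// Hypothetical two-stage codebooks over 2-d vectors (coarse, then fine).
fn toy_codebooks() -> Vec<Vec<Vec<f32>>> {
    vec![
        vec![vec![0.0, 0.0], vec![1.0, 1.0], vec![-1.0, 1.0]],
        vec![vec![0.0, 0.0], vec![0.25, 0.0], vec![0.0, -0.25]],
    ]
}
```

With these toy codebooks, `rvq_encode(&toy_codebooks(), &[1.2, 1.0])` yields the codes `[1, 1]`, and decoding them gives `[1.25, 1.0]`: the first stage captures the coarse position and the second refines it. A real codec learns its codebooks and operates on encoder latents rather than raw 2-d points.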

Running one example

Generate some audio tokens from an audio file.

wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
cargo run --example mimi --features mimi --release -- audio-to-code bria.mp3 bria.safetensors

Then decode the audio tokens back into a sound file.

cargo run --example mimi --features mimi --release -- code-to-audio bria.safetensors bria.wav