Module Docs (#2624)

* update whisper
* update llama2c
* update t5
* update phi and t5
* add a blip model
* qlamma doc
* add two new docs
* add docs and emoji
* additional models
* openclip
* pixtral
* edits on the model docs
* update yu
* update a few more models
* add persimmon
* add model-level doc
* names
* update module doc
* links in heira
* remove empty URL
* update more hyperlinks
* updated hyperlinks
* more links
* Update mod.rs

---------

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2025-06-22 04:22:50 +00:00 · 2024-11-18 08:19:23 -05:00
parent 12d7e7b145
commit 386fd8abb4
39 changed files with 170 additions and 115 deletions
--- a/candle-transformers/src/models/mimi/mod.rs
+++ b/candle-transformers/src/models/mimi/mod.rs
@@ -1,9 +1,27 @@
 //! mimi model
 //!
-//! Mimi is a state-of-the-art audio neural codec.
+//! [Mimi](https://huggingface.co/kyutai/mimi) is a state-of-the-art audio
+//! compression model using an encoder/decoder architecture with residual vector
+//! quantization. The candle implementation supports streaming, meaning that it
+//! can encode or decode a stream of audio tokens on the fly to provide
+//! low-latency interaction with an audio model.
 //!
-//! - [HuggingFace Model Card](https://huggingface.co/kyutai/mimi)
-//! - [GitHub](https://github.com/kyutai-labs/moshi)
+//! - 🤗 [HuggingFace Model Card](https://huggingface.co/kyutai/mimi)
+//! - 💻 [GitHub](https://github.com/kyutai-labs/moshi)
+//!
+//!
+//! # Example
+//! ```bash
+//! # Generate some audio tokens from an audio file.
+//! wget https://github.com/metavoiceio/metavoice-src/raw/main/assets/bria.mp3
+//! cargo run --example mimi \
+//!   --features mimi --release -- \
+//!   audio-to-code bria.mp3 bria.safetensors
+//!
+//! # Decode the audio tokens back into a sound file.
+//! cargo run --example mimi \
+//!   --features mimi --release -- \
+//!   code-to-audio bria.safetensors bria.wav
-//!
+//! ```

 // Copyright (c) Kyutai, all rights reserved.