Mirror of https://github.com/huggingface/candle.git (synced 2025-06-16 10:38:54 +00:00)

* Add the CSM model.
* Add some code to load the model.
* Load the text tokenizer.
* Add frame generation.
* Get the sampling to work.
* Rope fix.
* Autoregressive generation.
* Generate some audio file.
* Use the actual prompt.
* Support multiple turns.
* Add a very barebone readme.
* Move some of the shared bits to the model.

Conversational Speech Model (CSM)
CSM is a speech generation model from Sesame (SesameAILabs/csm).
It can generate conversational speech between two different speakers.
Speaker turns are delimited by the | character in the prompt.
cargo run --example csm --features cuda -r -- \
--voices voices.safetensors \
--prompt "Hey how are you doing?|Pretty good, pretty good. How about you?"
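
To illustrate the | delimiter, here is a minimal sketch of how a prompt could be split into speaker turns. It is not taken from the example's source: the `split_turns` helper is hypothetical, and it assumes that speakers simply alternate (0, 1, 0, ...) in turn order, which matches the two-speaker description above but may differ from what the example actually does.

```rust
/// Hypothetical helper: splits a prompt into (speaker_index, text) turns
/// on the '|' delimiter.
/// Assumption: the two speakers alternate, so the speaker index is the
/// turn index modulo 2.
fn split_turns(prompt: &str) -> Vec<(usize, &str)> {
    prompt
        .split('|')
        .enumerate()
        .map(|(turn_idx, text)| (turn_idx % 2, text.trim()))
        .collect()
}

fn main() {
    let prompt = "Hey how are you doing?|Pretty good, pretty good. How about you?";
    // Print each turn together with the speaker it is attributed to.
    for (speaker, text) in split_turns(prompt) {
        println!("speaker {speaker}: {text}");
    }
}
```

Under this assumption, a prompt with more than two turns cycles back to the first speaker on the third turn.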