# Conversational Speech Model (CSM)

CSM is a speech generation model from Sesame,
[SesameAILabs/csm](https://github.com/SesameAILabs/csm).

It can generate conversational speech between two different speakers.
Speaker turns are delimited by the `|` character in the prompt.

```bash
cargo run --example csm --features cuda -r -- \
    --voices voices.safetensors \
    --prompt "Hey how are you doing?|Pretty good, pretty good. How about you?"
```
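
To illustrate the prompt format, here is a minimal sketch of how a `|`-delimited prompt could be split into per-speaker turns. The helper name `split_turns` is hypothetical, and the rule that speakers alternate between ids 0 and 1 in prompt order is an assumption made for illustration; the example's actual implementation may differ.

```rust
/// Hypothetical helper (not part of candle): split a prompt into
/// `(speaker_id, text)` turns using `|` as the turn delimiter.
/// Assumption: the two speakers alternate (0, 1, 0, ...) in prompt order.
fn split_turns(prompt: &str) -> Vec<(usize, String)> {
    prompt
        .split('|')
        .enumerate()
        // Even-indexed turns go to speaker 0, odd-indexed turns to speaker 1.
        .map(|(idx, turn)| (idx % 2, turn.trim().to_string()))
        .collect()
}

fn main() {
    let prompt = "Hey how are you doing?|Pretty good, pretty good. How about you?";
    for (speaker, text) in split_turns(prompt) {
        println!("speaker {speaker}: {text}");
    }
}
```

Under these assumptions, the prompt from the command above yields two turns, the first for speaker 0 and the second for speaker 1.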