mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Files

Laurent Mazare cf9d7bf24c Add the CSM model. (#2862)

* Add the CSM model.

* Add some code to load the model.

* Load the text tokenizer.

* Add frame generation.

* Get the sampling to work.

* Rope fix.

* Autoregressive generation.

* Generate some audio file.

* Use the actual prompt.

* Support multiple turns.

* Add a very barebone readme.

* Move some of the shared bits to the model.

2025-04-04 06:48:03 +02:00

main.rs

README.md

README.md

Conversational Speech Model (CSM)

CSM is a speech generation model from Sesame (SesameAILabs/csm).

It can generate conversational speech between two different speakers. The speakers' turns are delimited by the | character in the prompt.

cargo run --example csm --features cuda -r -- \
    --voices voices.safetensors  \
    --prompt "Hey how are you doing?|Pretty good, pretty good. How about you?"
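
To illustrate how a `|`-delimited prompt maps to speaker turns, here is a minimal sketch. It is not taken from the example's `main.rs`: the `Turn` struct and `split_turns` function are hypothetical names, and the rule that speakers alternate as 0, 1, 0, ... is an assumption made for illustration.

```rust
/// One turn of the conversation: which speaker says what.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, PartialEq)]
struct Turn<'a> {
    speaker: usize,
    text: &'a str,
}

/// Split a prompt on '|' and assign speakers 0 and 1 alternately.
/// The alternation rule is an assumption, not confirmed by the README.
fn split_turns(prompt: &str) -> Vec<Turn<'_>> {
    prompt
        .split('|')
        .enumerate()
        .map(|(idx, text)| Turn {
            speaker: idx % 2,
            text: text.trim(),
        })
        .collect()
}

fn main() {
    let prompt = "Hey how are you doing?|Pretty good, pretty good. How about you?";
    for turn in split_turns(prompt) {
        println!("speaker {}: {}", turn.speaker, turn.text);
    }
}
```

With the prompt from the command above, this yields two turns, the first attributed to speaker 0 and the second to speaker 1. A prompt without any `|` produces a single turn.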