* Add the CSM model.
* Add some code to load the model.
* Load the text tokenizer.
* Add frame generation.
* Get the sampling to work.
* RoPE fix.
* Autoregressive generation.
* Generate an audio file.
* Use the actual prompt.
* Support multiple turns.
* Add a very bare-bones readme.
* Move some of the shared bits to the model.
* added chatGLM readme
* changed wording in readme
* added readme for chinese-clip
* added readme for convmixer
* added readme for custom ops
* added readme for efficientnet
* added readme for llama
* added readme to mnist-training
* added readme to musicgen
* added readme to quantized-phi
* added readme to starcoder2
* added readme to whisper-microphone
* added readme to yi
* added readme to yolo-v3
* added space to example in glm4 readme
* fixed mamba example readme to run mamba instead of mamba-minimal
* removed slash escape character
* changed moondream image to yolo-v8 example image
* added procedure for making the reinforcement-learning example work with a virtual environment on my machine
* added simple one-line summaries to the example readmes that were missing one
* changed non-existent image to the yolo example's bike.jpg
* added backslash to sam command
* removed trailing - from siglip
* added SoX to silero-vad example readme
* replaced procedure for uv on mac with warning that uv isn't currently compatible with pyo3
* added example to falcon readme
* added --which arg to stella-en-v5 readme
* fixed image path in vgg readme
* fixed the image path in the vit readme
* Update README.md
* Update README.md
* Update README.md
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Start updating to cudarc 0.14.
* Adapt a couple more things.
* And a couple more fixes.
* More tweaks.
* And a couple more fixes.
* Bump the major version number.
* Proper module system for the cuda kernels.
* Proper ptx loading.
* Launch the sort kernel.
* Custom op.
* Start using the builder pattern (a generic sketch follows this block).
* More builder.
* More builder.
* Get candle-core to compile.
* Get the tests to pass.
* Get candle-nn to work too.
* Support for custom cuda functions.
* cudnn fixes.
* Get flash attn to run.
* Switch the crate versions to be alpha.
* Bump the ug dependency.
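The builder-pattern commits above refer to how kernel launches are configured. Below is a generic sketch of the pattern itself, using hypothetical names (`LaunchBuilder`, `LaunchConfig`) that do not reflect cudarc 0.14's actual API:
```rust
// Hypothetical launch configuration; field names are illustrative only.
#[derive(Debug, Default)]
struct LaunchConfig {
    grid_dim: (u32, u32, u32),
    block_dim: (u32, u32, u32),
    shared_mem_bytes: u32,
}

#[derive(Debug, Default)]
struct LaunchBuilder {
    cfg: LaunchConfig,
}

impl LaunchBuilder {
    // Each setter consumes and returns the builder so calls can be chained.
    fn grid_dim(mut self, g: (u32, u32, u32)) -> Self {
        self.cfg.grid_dim = g;
        self
    }

    fn block_dim(mut self, b: (u32, u32, u32)) -> Self {
        self.cfg.block_dim = b;
        self
    }

    fn shared_mem_bytes(mut self, n: u32) -> Self {
        self.cfg.shared_mem_bytes = n;
        self
    }

    fn build(self) -> LaunchConfig {
        self.cfg
    }
}

fn main() {
    // Optional parameters are set fluently; unset ones keep their defaults.
    let cfg = LaunchBuilder::default()
        .grid_dim((128, 1, 1))
        .block_dim((256, 1, 1))
        .build();
    println!("{cfg:?}");
}
```
The appeal of the pattern here is that launch parameters can grow over time without every call site having to spell out a full config struct.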
* added new language pairs to marian-mt
* lint
* separated the Python code for converting tokenizers into its own file, added a requirements.txt for its dependencies, updated the instructions in the readme, and noted the required Python version
* Cleanup.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Update main.rs
* Update codegeex4_9b.rs
* Get things to compile.
* Add a default for when rope_ratio is missing (a sketch follows this block).
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
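The missing `rope_ratio` can be handled at deserialization time. Here is a minimal sketch of one way to do it with serde's `default` attribute, assuming serde/serde_json and a fallback value of 1.0; the actual default and wiring in the model code may differ:
```rust
use serde::Deserialize;

// Fallback used when the checkpoint's config omits `rope_ratio`.
// The value 1.0 is an assumption for illustration.
fn default_rope_ratio() -> f64 {
    1.0
}

#[derive(Debug, Deserialize)]
struct Config {
    #[serde(default = "default_rope_ratio")]
    rope_ratio: f64,
}

fn main() -> Result<(), serde_json::Error> {
    // A config JSON without `rope_ratio` still deserializes thanks to the default.
    let cfg: Config = serde_json::from_str("{}")?;
    println!("rope_ratio = {}", cfg.rope_ratio);
    Ok(())
}
```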
* Update the stable diffusion example with inpainting support for 1.5, 2 and XL.
* Apply cargo fmt.
* Clippy fixes.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
* add xlm-roberta-base
* Add task enum for fill-mask and reranker in xlm-roberta example; update README and fix attention mask dimensions
- Introduced a new `Task` enum to replace string task identifiers in the xlm-roberta example (a sketch follows this block).
- Updated the logic in `main.rs` to handle tasks using the new enum.
- Enhanced README with example output for fill-mask task.
- Fixed dimension retrieval in `prepare_4d_attention_mask` function for better clarity and safety.
* Clippy fix.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
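The `Task` enum mentioned above replaces raw string task identifiers. A minimal sketch of how such an enum might be wired into a clap-based CLI; the variant names, flag, and use of clap's `ValueEnum` derive are illustrative assumptions rather than the example's exact code:
```rust
use clap::{Parser, ValueEnum};

// Illustrative task enum; the example's actual definition may differ.
#[derive(Debug, Clone, Copy, ValueEnum)]
enum Task {
    FillMask,
    Reranker,
}

#[derive(Parser, Debug)]
struct Args {
    /// Which task to run (fill-mask or reranker).
    #[arg(long, value_enum)]
    task: Task,
}

fn main() {
    let args = Args::parse();
    match args.task {
        Task::FillMask => println!("running fill-mask"),
        Task::Reranker => println!("running reranker"),
    }
}
```
Using an enum lets unsupported task names be rejected at argument-parsing time instead of falling through an unmatched string comparison at runtime.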
* Adds support for the stella_en_v5 embedding model (400M variant)
* Unified stella
* WIP: Unified Stella
* Combined stella for both 1.5B and 400M variants
* Cargo fmt for the CI
* removed redundant stella-400m model and example after merge into stella-en-v5
* cargo fmt --all
---------
Co-authored-by: Anubhab Bandyopadhyay <4890833+AnubhabB@users.noreply.github.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
At least on my macOS Sequoia system (MBP 14" 2021, M1 Pro), when I run the
`whisper-microphone` example, it fails just before transcription once it has
gathered 10 seconds of audio:
```
Error: Insufficient buffer size 384 for input channel 0, expected 1024
```
At least for the audio device I'm using (AirPods Pro Max), there is no
guarantee that each audio buffer is a multiple of 1024 samples. Thus, at the
end of the 10 seconds, `buffered_pcm` can end with samples that do not form a
complete 1024-sample chunk.
This fixes that by tracking when there is a partial chunk at the end of
the buffer, and leaving it in `buffered_pcm` to be processed on the next
loop iteration.
Note that, in the interest of keeping this PR as small as possible, I
didn't make any other changes to this example.
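A minimal sketch of the fix described above: only complete 1024-sample chunks are consumed, and the trailing partial chunk stays in `buffered_pcm` for the next loop iteration. This is a simplified, standalone illustration rather than the example's exact code:
```rust
const CHUNK: usize = 1024;

/// Split off all complete `CHUNK`-sample frames, leaving any trailing partial
/// frame in `buffered_pcm` to be processed on the next loop iteration.
fn take_full_chunks(buffered_pcm: &mut Vec<f32>) -> Vec<Vec<f32>> {
    let full_len = (buffered_pcm.len() / CHUNK) * CHUNK;
    let chunks = buffered_pcm[..full_len]
        .chunks_exact(CHUNK)
        .map(|c| c.to_vec())
        .collect();
    // Only the consumed samples are removed; the partial tail is kept.
    buffered_pcm.drain(..full_len);
    chunks
}

fn main() {
    // 2.5 chunks of audio: two full frames are returned, half a frame stays buffered.
    let mut buffered_pcm = vec![0.0f32; CHUNK * 2 + CHUNK / 2];
    let frames = take_full_chunks(&mut buffered_pcm);
    assert_eq!(frames.len(), 2);
    assert_eq!(buffered_pcm.len(), CHUNK / 2);
}
```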
* Stable diffusion 3.5 support.
* Clippy fixes.
* CFG fix.
* Remove some unnecessary clones.
* Avoid duplicating some of the code.
* Release the mmdit model earlier to reduce memory usage.