* Initial commit: model weights working, prediction incorrect
* moved distilbertformaskedlm into distilbert modeling file
* made maskedLM like bert example, still incorrect predictions
* finally not getting NaNs, fixed attention mask
* getting correct output sentences
* get top k predictions
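A minimal sketch of the top-k step, assuming the logits for the masked position come back as a 1-D candle tensor (helper name and shapes are illustrative, not the example's actual code):

```rust
use candle_core::{Result, Tensor};

/// Illustrative helper: return the `k` highest-scoring vocabulary ids
/// and their logits for a single masked position (`logits`: [vocab_size]).
fn top_k(logits: &Tensor, k: usize) -> Result<Vec<(usize, f32)>> {
    let values: Vec<f32> = logits.to_vec1()?;
    let mut indexed: Vec<(usize, f32)> = values.into_iter().enumerate().collect();
    // Sort descending by logit; ties and NaNs fall back to equal ordering.
    indexed.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal));
    indexed.truncate(k);
    Ok(indexed)
}
```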
* fixed output formatting slightly
* added default arg for model_id
* lint
* moved masked token example code from distilbertformaskedlm example to distilbert example
* lint
* removed distilbertformaskedlm example
* cleanup
* clippy
* removed embedding normalization from example
* made output and model dependent on args instead of prompt
* lint
* replaced ok_or anyhow error with anyhow context
* changed error message for mask token not found
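The last two items presumably amount to the standard `anyhow::Context` pattern on an `Option`; a sketch with hypothetical names:

```rust
use anyhow::{Context, Result};

/// Hypothetical helper: locate the mask token in the encoded prompt.
fn find_mask_index(token_ids: &[u32], mask_token_id: u32) -> Result<usize> {
    token_ids
        .iter()
        .position(|&id| id == mask_token_id)
        // Replaces an `ok_or(anyhow!(...))` with context attached to the Option.
        .context("mask token not found in the input prompt")
}
```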
* Add the SNAC audio tokenizer.
* More snac.
* Again more snac.
* Add some example code for snac.
* Get the weights to load.
* Add to the snac model.
* Fixes.
* Get round-tripping to work (see the sketch below).
* Save/load code files.
* Clippy fix.
* Fmt fix.
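Round-tripping here presumably means audio -> discrete codes -> audio through the codec; a sketch against a hypothetical interface (the real model's method names and code layout may differ):

```rust
use candle_core::{Result, Tensor};

/// Hypothetical codec interface, for illustration only.
trait AudioCodec {
    /// Waveform samples -> one tensor of indices per codebook level.
    fn encode(&self, audio: &Tensor) -> Result<Vec<Tensor>>;
    /// Codebook indices -> reconstructed waveform.
    fn decode(&self, codes: &[Tensor]) -> Result<Tensor>;
}

/// Round-trip check used to validate a port: encode, decode, then compare
/// the reconstruction to the input (by ear or by an error metric).
fn round_trip<C: AudioCodec>(codec: &C, audio: &Tensor) -> Result<Tensor> {
    let codes = codec.encode(audio)?;
    codec.decode(&codes)
}
```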
* Add the CSM model.
* Add some code to load the model.
* Load the text tokenizer.
* Add frame generation.
* Get the sampling to work.
* Rope fix.
* Autoregressive generation.
* Generate an audio file.
* Use the actual prompt.
* Support multiple turns.
* Add a very barebones readme.
* Move some of the shared bits to the model.
* added new language pairs to marian-mt
* lint
* separated python code for converting tokenizers into its own file and added a requirements.txt for dependencies, updated instructions in readme and included python version
* Cleanup.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Update main.rs
* Update codegeex4_9b.rs
* Get things to compile.
* Add some default for when rope_ratio is missing.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
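Defaulting a missing config field is typically done with a serde default; a sketch of what the fix presumably looks like (the fallback value of 1.0 is an assumption):

```rust
use serde::Deserialize;

// Assumed fallback when the checkpoint's config.json omits `rope_ratio`.
fn default_rope_ratio() -> f32 {
    1.0
}

#[derive(Debug, Clone, Deserialize)]
struct Config {
    // ...other model fields elided...
    #[serde(default = "default_rope_ratio")]
    rope_ratio: f32,
}
```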
* add xlm-roberta-base
* Add task enum for fill-mask and reranker in xlm-roberta example; update README and fix attention mask dimensions
- Introduced a new `Task` enum to replace string task identifiers in the xlm-roberta example (see the sketch below).
- Updated the logic in `main.rs` to handle tasks using the new enum.
- Enhanced README with example output for fill-mask task.
- Fixed dimension retrieval in `prepare_4d_attention_mask` function for better clarity and safety.
* Clippy fix.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
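A sketch of the kind of enum this describes, assuming the example parses CLI arguments with clap (variant names mirror the tasks named above; the actual definition may differ):

```rust
use clap::ValueEnum;

/// Replaces stringly-typed task identifiers: invalid task names are now
/// rejected at argument-parsing time instead of failing deep inside main.rs.
#[derive(Debug, Clone, Copy, ValueEnum)]
enum Task {
    FillMask,
    Reranker,
}

fn run(task: Task) {
    match task {
        Task::FillMask => { /* fill-mask pipeline */ }
        Task::Reranker => { /* reranking pipeline */ }
    }
}
```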
* Fix bug in whisper transformer
- due to num_threads going to zero in the single-threaded case (see the sketch below)
* Apply rustfmt.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
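The failure mode described above is presumably an integer division rounding the worker count down to zero on single-threaded hosts; a sketch of the kind of guard that fixes it (names are illustrative):

```rust
/// Illustrative sketch: when the number of workers is derived by integer
/// division, small inputs (or a single-core machine) can yield 0, and any
/// later division by `num_threads` panics or corrupts the work split.
/// Clamping to at least 1 keeps the single-threaded case well defined.
fn num_threads(total_work: usize, work_per_thread: usize) -> usize {
    (total_work / work_per_thread).max(1)
}
```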
* change: BertEncoder struct to public
* change: make certain fields in Config struct public
* change: all fields in bert config struct to be public
* change: add clone to bert encoder and others
* Clippy fix.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
* Adds support for stella_en_v5 embedding model (400M variant)
* Unified stella
* WIP: Unified Stella
* Combined stella for both 1.5B and 400M variants
* Cargo fmt for the CI
* removed redundant stella-400m model and example after merge into stella-en-v5
* cargo fmt --all
---------
Co-authored-by: Anubhab Bandyopadhyay <4890833+AnubhabB@users.noreply.github.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
* module docs
* varbuilder gguf docs
* add a link to gguf files
* small additional mod doc titles
* safetensor docs
* more core docs
* more module docs in candle_core
* 2 more link fixes
* dinov2
* add another example
* add dinov2reg4
* eva2
* efficientvit
* moondream
* update t5
* update t5
* rwkv
* stable diffusion docs
* add wasm link
* add segment_anything
* adjust for clippy
* ignore bertdoc
* dinov2 ignore
* update block to be text
* remove the rust blocks for the moment
* bump python to 3.11
* add a setup-python step
* add py311 to test as well
* links in chinese_clip
* links for clip model
* add mod docs for flux and llava
* module doc for MMDIT and MIMI
* add docs for a few more models
* mod docs for bert naser and beit
* add module docs for convmixer colpali codegeex and chatglm
* add another series of moddocs
* add fastvit-llama2_c
* module docs mamba -> mobileone
* module docs from moondream-phi3
* mod docs for quantized and qwen
* update to yi
* fix long names
* Update llama2_c.rs
* Update llama2_c_weights.rs
* Fix the link for mimi + tweaks
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Add some fast Metal MLX SDPA kernels (#32)
* Sketch the sdpa kernel
* Add full sdpa kernel.
* Add test
* Add vectorized kernel for decoding
* Update tests
* Add some docs
* Fix sdpa_vector names
* Add softcapping for vectorized sdpa
* Add softcapping for full sdpa (reference math sketched below)
* Add support for head dim 32, 96, 256
* Add support for head dim 32, 96, 256
* Update docs
* Add update notice
* Clippy and format
* Conditional compilation for bf16
* Use it in quantized llama
* Some review comments
* Use set_params!
* Remove unused
* Remove feature
* Fix metal sdpa for v stride
* Remove comma
* Add the dim method to layout and shape.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
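The softcapping added above most likely follows the usual tanh form, `softcap * tanh(scores / softcap)`, applied to the attention scores before the softmax; a reference-math sketch in candle tensor ops (the actual Metal kernels fuse this inside SDPA):

```rust
use candle_core::{Result, Tensor};

/// Reference math for logit softcapping: squashes raw attention scores
/// into (-cap, cap) so extreme logits cannot blow up the softmax.
fn softcap(scores: &Tensor, cap: f64) -> Result<Tensor> {
    (scores / cap)?.tanh()? * cap
}
```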
* Stella_en_1.5B_v5
* Separated creation. This is a critical step for numerical accuracy and will be documented in the readme
* EmbedDim would require clone and copy
* WIP: example
* Examples added
* a little more in README