candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	8cc0a183ba	Speaker embeddings computation for metavoice. (#1800 ) * Speaker embeddings computation for metavoice. * Compute the speaker embeddings.	2024-03-04 14:13:01 +01:00
Jiayu Liu	924ccae30c	Add an initial Segformer implementation (#1617 ) * add segformer * Make the id2label field optional. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-03-03 16:01:46 +01:00
Laurent Mazare	60dc72b96b	More metavoice tweaks. (#1796 )	2024-03-03 15:05:25 +01:00
Laurent Mazare	4fff5b51f5	Metavoice - first cut (#1717 ) * Add the metavoice transformer. * Sketch the speaker-encoder module. * Adding to the metavoice model. * Start adding the metavoice example. * Get some logits out. * Load the second stage model. * Get the second step to run. * Tweak the example. * Add encodec tilting. * Glue the different bits together. * Fix a shape issue. * Use a constant. * BPE tokenization. * Add a warning.	2024-03-02 18:50:01 +01:00
Laurent Mazare	314630638d	Rustfmt fix. (#1788 )	2024-03-02 10:35:07 +01:00
Frkri	3e3def4134	Update StableLM config (#1787 )	2024-03-02 09:56:57 +01:00
Jani Monoses	979deaca07	EfficientVit (MSRA) model (#1783 ) * Add EfficientVit (Microsoft Research Asia) model. * Mention models in README	2024-03-01 08:53:52 +01:00
Jack Shih	b485e4b6ee	add models of rwkv v6 and quantized rwkv v6 (#1781 ) * add models of rwkv v6 and quantized rwkv v6 * fix ci clippy fail	2024-03-01 08:37:56 +01:00
Laurent Mazare	4fd00b8900	Add the StarCoder2 model. (#1779 ) * Add the StarCoder2 model. * Add the example code and get things to work. * And also tweak the readme.	2024-02-28 21:02:41 +01:00
Laurent Mazare	d0aca6c3c6	Encodec encoding demo. (#1775 )	2024-02-28 06:49:03 +01:00
Laurent Mazare	15e8644149	Apply dilations in the encodec model. (#1772 ) * Apply dilations in the encodec model. * Add some encoding bits.	2024-02-27 23:26:35 +01:00
Laurent Mazare	0c49e95dfb	Encodec model. (#1771 ) * Encodec model. * Fixes. * Add the padding functions. * Get the LSTM bit to work. * Get the encodec model to generate some tokens (decoder only for now). * Minor tweak. * Minor tweak.	2024-02-27 22:59:40 +01:00
Laurent Mazare	205767f9de	Avoid tensor copying in the quantized example. (#1770 )	2024-02-27 20:32:30 +01:00
Jack Shih	918136ba46	add quantized rwkv v5 model (#1743 ) * and quantized rwkv v5 model * Integrate the quantized rwkv model in the initial example. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-02-25 21:43:40 +01:00
Laurent Mazare	1a6043af51	Tweak the VarMap set type. (#1758 )	2024-02-25 20:50:08 +01:00
Laurent Mazare	28057781aa	Make the cache for the llama model explicit too. (#1745 )	2024-02-22 12:04:33 +01:00
laurent	544018b6d0	Explicit caching in llama2.c.	2024-02-22 10:22:03 +01:00
Laurent Mazare	c753f72c85	Support for attention bias in gemma + refactor things a bit. (#1744 ) * Support for attention bias in gemma + refactor things a bit. * Fix the cuda tests.	2024-02-22 09:35:28 +01:00
Laurent Mazare	45d5322d62	Add the Gemma models. (#1741 ) * Add the Gemma models. * Add the gemma example. * Adapt the RmsNorm. * Get the 2b model to work. * 7b support. * Use the config head dim. * Yet another fix. * Make the matrixes contiguous. * Also get the 7b model to work. * And add to the readme.	2024-02-21 22:02:50 +01:00
Laurent Mazare	5ebcfeaf0f	Make the r, k, v tensors contiguous. (#1719 )	2024-02-16 09:17:35 +01:00
Laurent Mazare	26fe162ab5	Custom tokenizer for rwkv. (#1711 ) * Custom tokenizer for rwkv. * Custom tokenizer. * Getting the tokenizer to work.	2024-02-14 15:11:38 +01:00
Laurent Mazare	2d5f2a728d	Add the RWKV model (v5). (#1707 ) * Start adding the RWKV model. * More of the forward step. * Handle rescaling. * FeedForward. * More work on RWKV. * Better state tracking. * Finish a first pass on forward. * Fix the shape mismatches. * Do not rescale in f32. * Rename to rwkv-v5. * Add the new models to the readme.	2024-02-14 10:58:32 +01:00
Jani Monoses	68f7655895	Add ConvNeXt-V2 and smaller model variants. (#1709 )	2024-02-14 10:53:07 +01:00
Nicolas Patry	c1b418586c	Fixing quantized llama demo on metal. (#1703 )	2024-02-13 16:28:56 +01:00
drbh	13c67226e6	feat: support microphone whisper streaming (#1678 ) * feat: support microphone whisper streaming * fix: cleanup print stmts and adjust how input is read * fix: remove incorrect comment * feat: split into new example and simplify * fix: feature flag example file * fix: fmt fixes * feat: simplify and remove redundant files	2024-02-12 18:01:21 +01:00
Laurent Mazare	1e26d539d9	Improved mamba model optimized for inference (#1694 ) * Sketch the mamba model for inference. * Complete the forward pass. * Add the mamba example. * Optimize the selective-scan part. * Fix a couple shape mismatches and get inference to work. * Tweak the readmes. * More readme tweaks.	2024-02-11 17:04:57 +01:00
Laurent Mazare	bf20cc854c	Support sinusoidal embeddings in trocr. (#1690 ) * Support sinusoidal embeddings in trocr. * Support tie-word-embeddings.	2024-02-10 15:17:51 +01:00
Laurent Mazare	42ce593ec6	Use the repo config for trocr rather than hardcoding it + small tweaks. (#1689 ) * Use the repo config for trocr rather than hardcoding it + small tweaks. * Add support for the printed models. * Fail with an appropriate error message on missing position embeddings.	2024-02-10 13:15:03 +01:00
Laurent Mazare	67589791d2	Remove the unused pragma in vit + handle the final layernorm. (#1688 )	2024-02-10 11:08:50 +01:00
Laurent Mazare	5657e596cd	Add the Qwen2 model (#1684 ) * Initial check-in for the qwen2 model. * More qwen2 inference. * Polish the qwen example. * Fix the rope basis. * Get the inference to work. * Support different model sizes.	2024-02-09 15:02:49 +01:00
Laurent Mazare	0dee8ea19b	Add the ChatGLM model. (#1237 ) * Add the ChatGLM model. * Rotary embeddings. * Add to the forward pass. * Add to the forward pass. * Add the rotary embeddings. * Add the KV cache. * Add the chatglm example. * Bugfix. * More glm fixes. * Fix some shape issues. * Get the inference to work.	2024-02-09 11:51:38 +01:00
drbh	9cadd4e644	feat: support multithread spectrogram and small perf tweaks (#1674 ) * feat: support multithread spectrogram and small perf tweaks * feat: clippy improvement for loop variable * fix: add back speed up scale down logic * fix: readd mirroring logic * feat: prefer scoped thread and simplify/improve logic/traits	2024-02-08 21:54:12 +01:00
Laurent Mazare	50be8a98ba	Quantized support for stable-lm2. (#1654 ) * Quantized support for stable-lm2. * Quantized support for v2-zephyr.	2024-02-04 11:57:05 +01:00
Daniel Clough	58cc896e69	make llama derive clone (#1648 ) Co-authored-by: danielclough <danielclough@users.noreply.github.com>	2024-02-04 11:56:03 +01:00
Jani Monoses	d32abbce53	Add StableLM-2, StableLM Code and Zephyr variants (#1650 ) * Add StableLM Code and Zephyr variants * Add V2 models * Update README	2024-02-03 14:58:41 +01:00
Hubert Shelley	dfab45e1c8	Supports more audio formats (#1628 ) * Supports more audio formats * Simplify the handling of the different buffer types. * Check the sample rate. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-02-03 14:26:04 +01:00
Bayang	96bc704d17	Update mixformer.rs (#1601 ) Update the source of the configuration_mixformer_sequential.py It has been removed, therefore, it is still available in this -> d38e6f954ec29b96fe2cf033937dad64e279b5d9	2024-02-03 13:42:16 +01:00
Jani Monoses	a52d407ae6	Add ConvNeXt model. (#1604 )	2024-02-03 13:34:28 +01:00
Nicolas Patry	403680f17d	Quantized GGUF style (#1523 ) * Metal quantized modifications proposal. - Add a device param, wherever needed. - Create new QMetal storage thing that implements QuantizedType. - Update everywhere needed. Fix Python. Fixing examples. Fix: fmt + clippy + stub. Moving everything around. Only missing the actual implems. Fixing everything + adding dequantized kernels. More work. Fixing matmul. Fmt + Clippy Some clippy fixes. Working state. Q2K Metal -> Bugged (also present in GGML). Q4K CPU -> Bugged (present previously, new test catch it). Q5K CPU -> Bugged (present previously). Q8_1 Both -> Never really implemented it seems Q8K metal -> Never implemented in metal Fixing Q2K bug (present in ggml). * Cleanup. * Fix the rebase. * Removing the fences speeds everything up and is correct this time... * Cleanup the fence. * After rebase. * Bad code removal. * Rebase after phi2 merge + fix replit default to CPU. * Making the CI happy. * More happy tests. --------- Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>	2024-01-17 10:27:58 +01:00
Jani Monoses	5270224f40	Add MobileOne model. (#1595 ) * Add MobileOne model. * Clippy fixes * Remove a comment. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-01-16 06:34:16 +01:00
Laurent Mazare	88618255cb	Fix the rotary embeddings for the new phi implementation. (#1582 ) * Fix the rotary embeddings for the new phi implementation. * Match the activation. * KV cache fix. * Use the config activation function.	2024-01-13 19:44:41 +01:00
Laurent Mazare	539ead927a	Update the Phi model to use the updated architecture. (#1580 ) * Update the Phi model to use the updated architecture. * Add more of the phi model. * Repeat KV + caching. * Apply the rotary embeddings. * Add support for the new phi model in the phi example. * Fix a couple glitches. * Fix a couple more glitches.	2024-01-13 17:38:27 +01:00
Jani Monoses	2480c5dbdd	Add RepVGG model. (#1561 ) * Add RepVGG model. * Add RepVGG README * Extract var to top level * Replace hashmap with a match * Add a variant for the model kind + avoid some unnecessary config cloning. --------- Co-authored-by: Laurent <laurent.mazare@gmail.com>	2024-01-11 07:07:40 +01:00
Jani Monoses	63944714f2	Use candle_nn::embedding instead of local copies in a few models. (#1562 )	2024-01-10 21:36:27 +01:00
Nicolas Patry	b4cb982e49	Simplifying our internal cargo dependencies. (#1529 )	2024-01-07 12:04:14 +01:00
Laurent Mazare	b0fe5e4453	Do not implement Module for BatchNorm. (#1513 )	2024-01-01 10:13:13 +01:00
Laurent Mazare	1e442d4bb9	Fix lints for clippy 1.75. (#1494 )	2023-12-28 20:26:20 +01:00
Daniel Clough	cd889c0f8a	add config_amazon_mistral_lite (#1493 ) Co-authored-by: Ubuntu <danielclough@users.noreply.github.com>	2023-12-28 19:59:58 +01:00
Laurent Mazare	d35f0a1376	Bump the crate version to 0.3.3. (#1490 )	2023-12-28 13:38:30 +01:00
drbh	f6408a3779	feat: add clear_kv_cache to mistral and qmistral models (#1464 )	2023-12-21 21:19:19 +01:00

225 Commits