708e422456
Qwen MoE model. ( #1960 )
* Qwen MoE model.
* Add the MoE model to the example.
* Fix the scaling.
* Readme updates.
* Readme tweaks.
2024-03-28 23:10:57 +01:00
2bb9c683b9
Update README.md ( #1840 )
Adds the candle-einops to the readme as an external resource
2024-03-13 14:36:25 +01:00
6530932285
Add the new models to the main readme. ( #1797 )
2024-03-03 16:25:14 +01:00
64d4038e4f
Mention rwkv v6 in the readmes. ( #1784 )
2024-03-01 08:58:30 +01:00
979deaca07
EfficientVit (MSRA) model ( #1783 )
* Add EfficientVit (Microsoft Research Asia) model.
* Mention models in README
2024-03-01 08:53:52 +01:00
4fd00b8900
Add the StarCoder2 model. ( #1779 )
* Add the StarCoder2 model.
* Add the example code and get things to work.
* And also tweak the readme.
2024-02-28 21:02:41 +01:00
57267cd536
Add a flag to force running the quantized model on CPUs. ( #1778 )
* Add a flag to force running the quantized model on CPUs.
* Add encodec to the readme.
2024-02-28 14:58:42 +01:00
45d5322d62
Add the Gemma models. ( #1741 )
* Add the Gemma models.
* Add the gemma example.
* Adapt the RmsNorm.
* Get the 2b model to work.
* 7b support.
* Use the config head dim.
* Yet another fix.
* Make the matrices contiguous.
* Also get the 7b model to work.
* And add to the readme.
2024-02-21 22:02:50 +01:00
2d5f2a728d
Add the RWKV model (v5). ( #1707 )
* Start adding the RWKV model.
* More of the forward step.
* Handle rescaling.
* FeedForward.
* More work on RWKV.
* Better state tracking.
* Finish a first pass on forward.
* Fix the shape mismatches.
* Do not rescale in f32.
* Rename to rwkv-v5.
* Add the new models to the readme.
2024-02-14 10:58:32 +01:00
1e26d539d9
Improved mamba model optimized for inference ( #1694 )
* Sketch the mamba model for inference.
* Complete the forward pass.
* Add the mamba example.
* Optimize the selective-scan part.
* Fix a couple shape mismatches and get inference to work.
* Tweak the readmes.
* More readme tweaks.
2024-02-11 17:04:57 +01:00
27ffd644a9
Mention TrOCR in the readmes. ( #1691 )
2024-02-10 15:49:38 +01:00
a510ddec4e
Mention the new models in the readme. ( #1651 )
2024-02-03 15:19:57 +01:00
a46864bd56
Fix "Minimal Mamba" link in README. ( #1577 )
2024-01-12 17:47:07 +01:00
8e06bfb4fd
Mention VGG in the readme. ( #1573 )
2024-01-12 09:59:29 +01:00
3a7304cb0d
add link to gpt-from-scratch-rs ( #1525 )
2024-01-05 11:59:46 +01:00
65cb90bd40
Add some mention to SOLAR-10.7B in the readme. ( #1487 )
2023-12-27 15:25:39 +01:00
d8b9a727fc
Support different mamba models. ( #1471 )
2023-12-23 10:46:02 +01:00
1e86717bf2
Fix a couple typos ( #1451 )
* Mixtral quantized instruct.
* Fix a couple typos.
2023-12-17 05:20:05 -06:00
cfdf9640a3
Readme tweaks. ( #1446 )
2023-12-16 06:23:12 -06:00
e12cbfd73b
Update the readme to mention mixtral. ( #1443 )
2023-12-15 19:29:03 -06:00
7be982f6f7
Mention phi-2 in the readme. ( #1434 )
2023-12-14 08:02:27 -06:00
37bf1ed012
Stable Diffusion Turbo Support ( #1395 )
* Add support for SD Turbo
* Set Leading as default in euler_ancestral discrete
* Use the appropriate default values for n_steps and guidance_scale.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-03 08:37:10 +01:00
f83e14f68d
Add candle-lora transformers to readme? ( #1356 )
* Demonstrate lora transformers in readme
* Shorten readme
2023-11-21 17:54:24 +00:00
c7e613ab5e
Update the readme. ( #1354 )
2023-11-21 09:38:27 +00:00
8f63f68289
Fix the kalosm link ( #1353 )
2023-11-21 06:18:14 +01:00
f1e678b39c
Mention the Yi-6b/Yi-34b models in the readme. ( #1321 )
2023-11-11 12:39:11 +01:00
18d30005c5
Add support to UL2 model family ( #1300 )
* Add support to UL2 model family
* Update docs with UL2
* Create ActivationWithOptionalGating to avoid polluting activations
* Also refactor quantized t5
* Remove useless conversion
* Revert Activation::NewGelu name change
* Remove useless return
* Apply rustfmt and clippy recommendations
* Reuse t5::ActivationWithOptionalGating in quantized version
* (cosmetic change) use a match rather than ifs + avoid early returns.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-11-09 18:55:09 +01:00
c912d24570
Update README: Move T5 to Text to Text section ( #1288 )
I think it makes more sense to have it there, since it's a seq2seq model with cross-attention, not a LM. There are also decoder-only T5 models that work as LMs, but that's not the standard.
2023-11-07 16:14:04 +01:00
d5c2a7b64b
Add info about MADLAD-400 in readme files ( #1287 )
2023-11-07 15:21:59 +01:00
abc4f698c5
Add candle-sampling ( #1278 )
2023-11-06 12:53:29 +01:00
a923e8b53a
Add a link to candle-ext to README.md ( #1277 )
2023-11-06 12:44:39 +01:00
2a45bcf943
Put the onnx example behind a feature flag. ( #1276 )
* Put the onnx example behind a feature flag.
* Exclude the onnx bits from the workspace.
* README tweaks.
2023-11-06 07:45:07 +01:00
47f4ddb011
Added info about missing protoc ( #1275 )
Co-authored-by: figgefigge <fredric.1337mail.com>
2023-11-06 06:47:32 +01:00
bfe95115c6
Update README.md ( #1264 )
2023-11-04 05:32:32 +01:00
ad63f20781
add Kalosm to the list of external resources ( #1257 )
2023-11-03 19:16:46 +01:00
1b5063f3ca
Add vllm external resource ( #1253 )
2023-11-03 12:40:31 +01:00
4c967b9184
Use the hub files for the marian example. ( #1220 )
* Use the hub files for the marian example.
* Use the secondary decoder.
* Add a readme.
* More readme.
2023-10-30 17:29:36 +00:00
0ec5ebcec4
Use the hub model file when possible. ( #1190 )
* Use the hub model file when possible.
* And add a mention in the main readme.
2023-10-26 20:00:50 +01:00
e37b487767
Add Blip to online demos README.md ( #1184 )
* Add Blip to online demos README.md
* Punctuation.
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2023-10-26 11:07:01 +01:00
e7b886d56f
Add a link to the optimisers crate. ( #1180 )
2023-10-25 21:51:45 +01:00
df2f89b6cf
Add some KV cache to blip. ( #1150 )
* Add some KV cache to blip.
* Mention BLIP in the readme.
2023-10-22 09:44:48 +01:00
31ca4897bb
Readme updates. ( #1134 )
2023-10-20 09:08:39 +01:00
93c25e8844
Expose the larger resnets (50/101/152) in the example. ( #1131 )
2023-10-19 13:48:28 +01:00
6f76383f38
Add a readme for the resnet example. ( #1129 )
2023-10-19 09:58:50 +01:00
63c204c79e
Add a mention to the replit-code model in the readme. ( #1121 )
2023-10-18 11:27:23 +01:00
8921d5027c
Add support for phi-1.0 ( #1093 )
* Add support for phi-1.0
* Update the readme.
2023-10-14 20:15:43 +01:00
e7560443e4
Convmixer example ( #1074 )
* Add a convmixer based example.
* Mention the model in the readme.
2023-10-11 19:51:10 +01:00
955e00b2e8
Add to the readmes for stable-lm. ( #1047 )
2023-10-06 21:26:04 +01:00
0ac2db577b
Add an entry about WSL slowness to the faq. ( #997 )
2023-09-29 17:04:52 +01:00
49fa184a35
Mistral readme ( #994 )
* Mistral: print the generated text.
* Add mistral to the readmes.
2023-09-29 11:50:50 +01:00