Commit Graph

622 Commits

SHA1 Message Date
e08fbb6543 Add support for distil-whisper (#1245)
* Add support for distil-whisper.

* Add distil-large.

* Rename the large model.
2023-11-02 19:32:35 +01:00
c12ad45562 Add a KV cache to marian decoding. (#1226) 2023-10-31 08:47:44 +00:00
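The KV cache referenced in #1226 stores the key/value tensors produced at earlier decoding steps, so each new token only computes its own projections and attends over the concatenation. A minimal sketch of the pattern in candle; the KvCache type and its layout are illustrative assumptions, not the actual marian code:

```rust
use candle_core::{Result, Tensor};

// Illustrative KV cache; K/V are assumed to be (batch, heads, seq, head_dim).
struct KvCache {
    k: Option<Tensor>,
    v: Option<Tensor>,
}

impl KvCache {
    // Append this step's K/V along the sequence axis (dim 2) and return the
    // full tensors to attend over.
    fn append(&mut self, k: &Tensor, v: &Tensor) -> Result<(Tensor, Tensor)> {
        let k = match &self.k {
            Some(prev) => Tensor::cat(&[prev, k], 2)?,
            None => k.clone(),
        };
        let v = match &self.v {
            Some(prev) => Tensor::cat(&[prev, v], 2)?,
            None => v.clone(),
        };
        self.k = Some(k.clone());
        self.v = Some(v.clone());
        Ok((k, v))
    }
}
```

A cache like this also needs clearing between unrelated inputs, which is why candle decoder structs commonly expose a reset method.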
7d0202710b Instructions for generating the tokenizer configs for marian-mt. (#1225) 2023-10-31 07:56:26 +01:00
392a00a147 Add support for the marian base model. (#1221) 2023-10-30 19:20:36 +00:00
4c967b9184 Use the hub files for the marian example. (#1220)
* Use the hub files for the marian example.

* Use the secondary decoder.

* Add a readme.

* More readme.
2023-10-30 17:29:36 +00:00
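"Hub files" in #1220 means downloading weights at run time from the Hugging Face Hub instead of requiring local files. With the hf-hub crate that looks roughly like the following; the repo id and file name are placeholders, not necessarily what the example uses:

```rust
use hf_hub::api::sync::Api;

fn main() -> anyhow::Result<()> {
    // Placeholder repo id and file name; downloads are cached locally
    // after the first fetch.
    let api = Api::new()?;
    let repo = api.model("Helsinki-NLP/opus-mt-fr-en".to_string());
    let weights = repo.get("model.safetensors")?;
    println!("weights cached at {}", weights.display());
    Ok(())
}
```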
969960847a Bugfixes for marian-mt. (#1219)
* Bugfixes for marian-mt.

* Apply the final decoding head.

* More fixes.
2023-10-30 11:44:19 +00:00
174b208052 PyO3: Better shape handling (#1143)
* Negative and `*args` shape handling

* Rename to `PyShapeWithHole` + validate that only one hole exists

* Regenerate stubs

---------

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2023-10-29 15:41:44 +00:00
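The "hole" in #1143 is a dimension passed as -1 whose size is inferred from the tensor's total element count, with validation that at most one hole exists. A standalone sketch of that resolution logic (illustrative; not the PyO3 binding code itself):

```rust
// Resolve a shape containing at most one "hole" (-1), inferring the hole
// from the total element count. Illustrative sketch only.
fn resolve_shape(dims: &[i64], el_count: usize) -> Result<Vec<usize>, String> {
    let holes = dims.iter().filter(|&&d| d == -1).count();
    if holes > 1 {
        return Err("only one -1 dimension is allowed".to_string());
    }
    let known: usize = dims
        .iter()
        .filter(|&&d| d != -1)
        .map(|&d| d as usize)
        .product();
    dims.iter()
        .map(|&d| match d {
            -1 if known == 0 || el_count % known != 0 => {
                Err(format!("cannot infer -1 for {el_count} elements"))
            }
            -1 => Ok(el_count / known),
            d => Ok(d as usize),
        })
        .collect()
}
```

For instance, resolve_shape(&[2, -1], 6) yields [2, 3].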
7bbde55c61 Marian MT model (#1210)
* Skeleton files for the marian MT model.

* Marian initialization.

* Implement the attention forward method.

* Forward pass for the encoder side.

* Expose the encoder and decoder.

* Start plugging the decoder.

* Forward pass for the decoder layer.

* Set up the marian example.

* Add some missing backtraces.

* Bugfix.
2023-10-29 15:12:22 +00:00
55bc3382cf Allow for different behavior between training and eval (#1213)
* Forward with training.

* Do not use dropout on vgg evaluation.
2023-10-29 07:53:09 +01:00
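#1213 threads a train flag through the forward pass (candle_nn's ModuleT trait exposes this as forward_t) so layers like dropout can be no-ops at evaluation time. A minimal sketch, with an illustrative Block type rather than the actual VGG code:

```rust
use candle_core::{Result, Tensor};
use candle_nn::{Dropout, ModuleT};

// Illustrative block, not the actual VGG classifier head.
struct Block {
    dropout: Dropout, // e.g. Dropout::new(0.5)
}

impl ModuleT for Block {
    fn forward_t(&self, xs: &Tensor, train: bool) -> Result<Tensor> {
        // With train == false, dropout leaves the activations unchanged.
        self.dropout.forward_t(xs, train)
    }
}
```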
dece37c6f4 feat: implement VGG13, VGG16 and VGG19 (#1211)
* feat: implement VGG13, VGG16 and VGG19

* Cosmetic fixes.

* More cosmetic tweaks + avoid re-loading the weights on each final layer.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-10-29 06:10:23 +00:00
498c50348c Add DDPG and fix Gym wrapper (#1207)
* Fix Gym wrapper
- It was returning things in the wrong order
- Gym now differentiates between terminated and truncated

* Add DDPG

* Apply fixes

* Remove Result annotations

* Also remove Vec annotation

* rustfmt

* Various small improvements (avoid cloning, mutability, get clippy to pass, ...)

---------

Co-authored-by: Travis Hammond <travis.hammond@alexanderthamm.com>
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-10-28 19:53:34 +01:00
012ae0090e Infer the config for llama2-c. (#1208) 2023-10-28 19:00:39 +01:00
95a857cf57 Move the llama2-c model in transformers. (#1205) 2023-10-28 16:51:19 +01:00
b3181455d5 Add fuse-conv-bn method for Conv2d (#1196)
* Add fuse-conv-bn method for Conv2d

* no unwrap

* run rustfmt and clippy
2023-10-27 15:56:50 +01:00
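Fusing a BatchNorm into the preceding Conv2d (#1196) folds the normalization into the conv parameters: per output channel, w' = w * gamma / sqrt(var + eps) and b' = (b - mean) * gamma / sqrt(var + eps) + beta. A sketch of that arithmetic, not the method added by the commit:

```rust
use candle_core::{Result, Tensor};

// Fold batch-norm statistics into a conv weight/bias pair (illustrative).
fn fuse_conv_bn(
    w: &Tensor,     // conv weight, (out_c, in_c, kh, kw)
    b: &Tensor,     // conv bias, (out_c,)
    gamma: &Tensor, // bn scale, (out_c,)
    beta: &Tensor,  // bn shift, (out_c,)
    mean: &Tensor,  // bn running mean, (out_c,)
    var: &Tensor,   // bn running variance, (out_c,)
    eps: f64,
) -> Result<(Tensor, Tensor)> {
    let out_c = w.dim(0)?;
    // scale = gamma / sqrt(var + eps), per output channel.
    let scale = gamma.div(&(var + eps)?.sqrt()?)?;
    let w = w.broadcast_mul(&scale.reshape((out_c, 1, 1, 1))?)?;
    let b = b.sub(mean)?.mul(&scale)?.add(beta)?;
    Ok((w, b))
}
```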
e2826e70b3 Add a quantized variant of llama2.c (#1197)
* Add a quantized variant of llama2.c

* Clippy fixes.
2023-10-27 15:34:06 +01:00
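The quantized variants in this log are generally stored as GGML/GGUF checkpoints rather than full-precision safetensors. A generic sketch of opening a GGUF file and listing its tensors with candle; the file name is a placeholder:

```rust
use candle_core::quantized::gguf_file;

fn main() -> anyhow::Result<()> {
    // "model.gguf" is a placeholder path, not a file shipped by the example.
    let mut file = std::fs::File::open("model.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;
    for (name, info) in content.tensor_infos.iter() {
        println!("{name}: {:?} {:?}", info.shape, info.ggml_dtype);
    }
    Ok(())
}
```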
70d06ab4b0 Add support for the phi-hermes finetuned model. (#1192) 2023-10-27 05:57:08 +01:00
0ec5ebcec4 Use the hub model file when possible. (#1190)
* Use the hub model file when possible.

* And add a mention in the main readme.
2023-10-26 20:00:50 +01:00
5f20697918 Add the jina-bert embeddings model. (#1187)
* Add the jina-bert model.

* Use alibi.

* Remove the unused pragma.

* Recompute the alibi embeddings.

* Generate the token type ids.

* Use the module trait.

* Add the jina-bert example.

* DType fix.

* Get the inference to work.
2023-10-26 16:54:36 +01:00
25c3cc4149 Mention the flash-attention restriction in the readme. (#1158) 2023-10-23 10:26:56 +01:00
a11af79e23 Add a quantized blip model. (#1155)
* Add a quantized blip model.

* Integrate the quantized blip model to the actual example.
2023-10-22 20:33:25 +01:00
8a82d623e5 Handle LongStorage in pytorch checkpoints. (#1152) 2023-10-22 18:34:36 +01:00
df2f89b6cf Add some KV cache to blip. (#1150)
* Add some KV cache to blip.

* Mention BLIP in the readme.
2023-10-22 09:44:48 +01:00
3115fe42e4 Blip attention mask + readme (#1146)
* Add the attention mask to the blip model.

* Add a readme.
2023-10-21 22:44:13 +01:00
2531b13bf8 Blip fixes (#1145)
* Some fixes for the blip example.

* Stop generating on sep tokens.

* Clippy fixes.

* rustfmt.
2023-10-21 21:34:48 +01:00
0d9bb4eb18 Add the blip example. (#1144)
* Add the blip example.

* Tweak the example.

* Implement the cross-attn logic.

* Fix some shape mismatches.

* Get some logits out.

* Get some caption to be generated.
2023-10-21 20:05:02 +01:00
7366aeac21 Make func cloneable. (#1137) 2023-10-20 16:28:50 +01:00
31ca4897bb Readme updates. (#1134) 2023-10-20 09:08:39 +01:00
55351ef57d Add some vision transformers models (#1132)
* Start adding vision-transformers.

* Add self-attn.

* More vision transformers.

* vit-vit.

* Add the actual vit model.

* Add the example code for the vision transformers.
2023-10-19 22:24:18 +01:00
93c25e8844 Expose the larger resnets (50/101/152) in the example. (#1131) 2023-10-19 13:48:28 +01:00
6f76383f38 Add a readme for the resnet example. (#1129) 2023-10-19 09:58:50 +01:00
8e773cc0c6 Experiment with resnet (#1128)
* Add some preliminary support for resnet.

* Add an actual resnet example.
2023-10-19 09:25:03 +01:00
620c94d12e Add support for Zephyr-7b in the quantized model. (#1124) 2023-10-18 17:31:26 +01:00
86e7d539d2 Add the quantized mpt model. (#1123)
* Add the quantized mpt model.

* Support the quantized model for replit-code.
2023-10-18 16:29:38 +01:00
63c204c79e Add a mention to the replit-code model in the readme. (#1121) 2023-10-18 11:27:23 +01:00
767a6578f1 MPT alibi fixes. (#1120)
* MPT alibi fixes.

* Some more fixes.

* Finally get the model to return some sensible outputs.

* Add a readme.
2023-10-18 10:58:05 +01:00
2cd745a97c MPT fixes. (#1117)
* MPT fixes.

* Another couple of fixes.

* Another shape fix.
2023-10-17 21:53:31 +01:00
a72b50e2c0 Build alibi bias. (#1115)
* Build alibi bias.

* Apply the alibi attention bias.

* Add the replit-code example.
2023-10-17 20:41:37 +01:00
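ALiBi (#1115) replaces positional embeddings with a head-specific linear bias added to the attention scores: head h has slope m_h, and the score for query i, key j is offset by m_h * (j - i). A small sketch of building such a bias tensor (illustrative; the actual replit-code/MPT version also composes this with a causal mask):

```rust
use candle_core::{Device, Result, Tensor};

// Build an ALiBi bias of shape (num_heads, seq_len, seq_len); head h adds
// slope_h * (j - i) to the score at (query i, key j). Slopes follow the
// paper's 2^(-8h/n) schedule (power-of-two head counts assumed).
fn build_alibi_bias(num_heads: usize, seq_len: usize, dev: &Device) -> Result<Tensor> {
    let slopes: Vec<f32> = (1..=num_heads)
        .map(|h| 2f32.powf(-8f32 * h as f32 / num_heads as f32))
        .collect();
    let distances: Vec<f32> = (0..seq_len)
        .flat_map(|i| (0..seq_len).map(move |j| j as f32 - i as f32))
        .collect();
    let distances = Tensor::from_vec(distances, (1, seq_len, seq_len), dev)?;
    let slopes = Tensor::from_vec(slopes, (num_heads, 1, 1), dev)?;
    slopes.broadcast_mul(&distances)
}
```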
00948eb656 Formatting tweak. (#1111) 2023-10-16 21:02:53 +01:00
af67672207 Add support for Puffin-Phi-v2. (#1110)
* Add support for Puffin-Phi-v2.

* Tweak the file name.

* Support the config for puffin-phi-v2.

* Update the readme.
2023-10-16 20:54:21 +01:00
588ad4835a Fix the verbose prompt for phi. (#1097) 2023-10-15 10:53:25 +01:00
b73c35cc57 Improve the reshape error messages. (#1096)
* Improve the reshape error messages.

* Add the verbose-prompt flag to the phi example.
2023-10-15 10:43:10 +01:00
8921d5027c Add support for phi-1.0 (#1093)
* Add support for phi-1.0

* Update the readme.
2023-10-14 20:15:43 +01:00
29c7f2565d Add a reinforcement learning example. (#1090)
* Add a reinforcement learning example.

* Python initialization.

* Get the example to run.

* Vectorized gym envs for the atari wrappers.

* Get some simulation loop to run.
2023-10-14 16:46:43 +01:00
e7560443e4 Convmixer example (#1074)
* Add a convmixer based example.

* Mention the model in the readme.
2023-10-11 19:51:10 +01:00
b34d7f0248 Remove some unused bits. (#1067) 2023-10-09 19:49:57 +01:00
4d04ac83c7 Override the repo for SDXL f16 vae weights. (#1064)
* Override the repo for SDXL f16 vae weights.

* Slightly simpler change.
2023-10-09 06:52:28 +01:00
59ab6d7832 Quantized version of StableLM. (#1058)
* Quantized version of StableLM.

* Adapt the stable-lm example to support the quantized model.

* Use some separate hub repo.

* Another repo name tweak.
2023-10-08 15:42:38 +01:00
2e5fb0b251 Do not use the kv-cache on external key-value states. (#1054) 2023-10-07 22:37:19 +01:00
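#1054 draws the self- vs cross-attention distinction: self-attention K/V grow with each generated token and are cached by concatenation, while cross-attention K/V are projections of the fixed encoder output, so they should be computed once and reused, never appended to. A sketch of the compute-once pattern (illustrative names):

```rust
use candle_core::{Result, Tensor};

// Cross-attention K/V depend only on the (fixed) encoder output, so project
// them once and reuse on every decoding step.
struct CrossAttnKv {
    kv: Option<(Tensor, Tensor)>,
}

impl CrossAttnKv {
    fn get(
        &mut self,
        encoder_out: &Tensor,
        project: impl Fn(&Tensor) -> Result<(Tensor, Tensor)>,
    ) -> Result<(Tensor, Tensor)> {
        if self.kv.is_none() {
            self.kv = Some(project(encoder_out)?);
        }
        let (k, v) = self.kv.as_ref().unwrap();
        Ok((k.clone(), v.clone()))
    }
}
```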
823fe23f9b Add flash-attn support for stable-lm. (#1052) 2023-10-07 21:12:54 +01:00
d833527fda Use candle_nn::LSTM in encodec. (#1051)
* Use candle_nn::LSTM in encodec.

* More Encodec implementation.

* Decoder implementation.
2023-10-07 19:43:06 +01:00