dece37c6f4
feat: implement VGG13, VGG16 and VGG19 (#1211)
* feat: implement VGG13, VGG16 and VGG19
* Cosmetic fixes.
* More cosmetic tweaks + avoid re-loading the weights on each final layer.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-10-29 06:10:23 +00:00
95a857cf57
Move the llama2-c model to candle-transformers. (#1205)
2023-10-28 16:51:19 +01:00
c8face3f95
Add the relu2 and relu6 activations. (#1201)
2023-10-27 20:51:16 +01:00
5f20697918
Add the jina-bert embeddings model. (#1187)
* Add the jina-bert model.
* Use alibi.
* Remove the unused pragma.
* Recompute the alibi embeddings.
* Generate the token type ids.
* Use the module trait.
* Add the jina-bert example.
* DType fix.
* Get the inference to work.
2023-10-26 16:54:36 +01:00
a11af79e23
Add a quantized blip model. (#1155)
* Add a quantized blip model.
* Integrate the quantized blip model into the actual example.
2023-10-22 20:33:25 +01:00
34d9e91748
Add the blip image captioning model. (#1140)
* Blip text model.
* Blip vision bits.
* Blippity.
* More blip.
2023-10-20 22:09:11 +01:00
55351ef57d
Add some vision transformer models. (#1132)
* Start adding vision-transformers.
* Add self-attn.
* More vision transformers.
* vit-vit.
* Add the actual vit model.
* Add the example code for the vision transformers.
2023-10-19 22:24:18 +01:00
8e773cc0c6
Experiment with resnet. (#1128)
* Add some preliminary support for resnet.
* Add an actual resnet example.
2023-10-19 09:25:03 +01:00
86e7d539d2
Add the quantized mpt model. (#1123)
* Add the quantized mpt model.
* Support the quantized model for replit-code.
2023-10-18 16:29:38 +01:00
872c3f14b0
Add the MPT model. (#1114)
* Add the MPT model.
* Add ffn and block.
* Forward pass for the mpt block.
* Repeat-kv.
2023-10-17 16:06:48 +01:00
89b525b5e7
Convmixer (#1073)
* Only optimize float tensors.
* Use full tensors for zeros and ones.
* Add a benchmark for the matmul slowness.
* Add the convmixer model.
* Proper adaptive pooling.
2023-10-11 18:24:32 +01:00
59ab6d7832
Quantized version of StableLM. (#1058)
* Quantized version of StableLM.
* Adapt the stable-lm example to support the quantized version.
* Use a separate hub repo.
* Another repo name tweak.
2023-10-08 15:42:38 +01:00
b0442eff8a
Sketch the stable-lm model. (#1045)
2023-10-06 18:19:06 +01:00
deee7612da
Quantized version of mistral. (#1009)
* Quantized version of mistral.
* Integrate the quantized mistral variant.
* Use the quantized weight files.
* Tweak the quantization command.
* Fix the dtype when computing the rotary embeddings.
* Update the readme with the quantized version.
* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
c05a348e36
Add the Mistral 7B model. (#983)
* Start sketching the mistral 7b model.
* Add the kv cache.
* Add the decoder layer.
* Add the mistral model.
* Rotary embeddings.
* Add the attention mask.
2023-09-28 14:29:41 +01:00
0007ae9c11
Add the quantized mixformer model. (#953)
* Add the quantized mixformer model.
* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
b54acfa3d0
Tracing for the phi model (#936)
* Add some tracing bits to mixformers.
* Add the missing file.
* Add the conv2d layer to with-tracing.
* Improve the tracing usage.
2023-09-23 09:19:34 +01:00
a46b1b4657
Mixformer (#929)
* Sketch the mixformer model.
* More modeling code.
* More mixformers.
* MixFormer creation.
* More mixformers.
2023-09-22 16:17:14 +01:00
2619c4307f
Add a quantized version of the t5 model. (#921)
2023-09-21 11:13:39 +01:00
286f01db14
Start adding the Wuerstchen diffusion pipeline. (#843)
* Wuerstchen common bits.
* Add the prior layer.
* Start adding diffnext.
2023-09-14 10:56:07 +01:00
9daa6dbe87
Extract the T5 module and add a main function to use it. (#829)
* Extract t5 out of musicgen.
* Add a main for the t5 module.
2023-09-13 07:14:05 +01:00
d7b9fec849
Move the stable-diffusion modeling code so that it's easier to re-use. (#812)
2023-09-11 11:45:57 +01:00
35f72514f5
Move more models to candle-transformers. (#796)
* Move dinov2.
* Move efficientnet.
* Move the quantized llama model.
* Move segment-anything.
2023-09-10 10:20:18 +01:00
d3f05eae8c
Move some models to candle-transformers so that it's easier to re-use. (#794)
* Move some models to candle-transformers so that they can be shared.
* Also move falcon.
* Move Llama.
* Move whisper (partial).
2023-09-10 09:40:27 +01:00
ba35d895e7
Sketch the candle-transformers crate. (#147)
* Sketch the candle-transformers crate.
* Format the empty files.
2023-07-12 13:49:31 +01:00