candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	a11af79e23	Add a quantized blip model. (#1155 ) * Add a quantized blip model. * Integrate the quantized blip model to the actual example.	2023-10-22 20:33:25 +01:00
Laurent Mazare	df2f89b6cf	Add some KV cache to blip. (#1150 ) * Add some KV cache to blip. * Mention BLIP in the readme.	2023-10-22 09:44:48 +01:00
Laurent Mazare	5b32c2a41e	Remove the unused pragma and properly apply the bias. (#1147 )	2023-10-22 06:47:40 +01:00
Laurent Mazare	3115fe42e4	Blip attention mask + readme (#1146 ) * Add the attention mask to the blip model. * Add a readme.	2023-10-21 22:44:13 +01:00
Laurent Mazare	2531b13bf8	Blip fixes (#1145 ) * Some fixes for the blip example. * Stop generating on sep tokens. * Clippy fixes. * rustfmt.	2023-10-21 21:34:48 +01:00
Laurent Mazare	0d9bb4eb18	Add the blip example. (#1144 ) * Add the blip example. * Tweak the example. * Implement the cross-attn logic. * Fix some shape mismatches. * Get some logits out. * Get some caption to be generated.	2023-10-21 20:05:02 +01:00
Laurent Mazare	94e3373883	Blip forward pass (#1141 ) * More forward methods for the blip model. * Blipping continues.	2023-10-21 10:19:23 +01:00
Laurent Mazare	34d9e91748	Add the blip image captioning model (#1140 ) * Blip text model. * Blip vision bits. * Blippity. * More blip.	2023-10-20 22:09:11 +01:00
Laurent Mazare	55351ef57d	Add some vision transformers models (#1132 ) * Start adding vision-transformers. * Add self-attn. * More vision transformers. * vit-vit. * Add the actual vit model. * Add the example code for the vision transformers.	2023-10-19 22:24:18 +01:00
Laurent Mazare	cd53c472df	Support ResNet 50/101/152. (#1130 )	2023-10-19 10:48:31 +01:00
Laurent Mazare	8e773cc0c6	Experiment with resnet (#1128 ) * Add some preliminary support for resnet. * Add an actual resnet example.	2023-10-19 09:25:03 +01:00
Laurent Mazare	902d0b9166	More model cloning. (#1126 ) * More model cloning. * More cloning on quantized models.	2023-10-18 21:55:46 +01:00
Laurent Mazare	185b54a33b	Make some model cloneable. (#1125 )	2023-10-18 19:30:47 +01:00
Laurent Mazare	86e7d539d2	Add the quantized mpt model. (#1123 ) * Add the quantized mpt model. * Support the quantized model for replit-code.	2023-10-18 16:29:38 +01:00
Laurent Mazare	cb034506cd	Remove the unused pragma in mpt. (#1122 )	2023-10-18 15:47:50 +01:00
Laurent Mazare	767a6578f1	MPT alibi fixes. (#1120 ) * MPT alibi fixes. * Some more fixes. * Finally get the model to return some sensible outputs. * Add a readme.	2023-10-18 10:58:05 +01:00
Laurent Mazare	2cd745a97c	MPT fixes. (#1117 ) * MPT fixes. * Another couple fixes. * Another shape fix.	2023-10-17 21:53:31 +01:00
Laurent Mazare	a72b50e2c0	Build alibi bias. (#1115 ) * Build alibi bias. * Apply the alibi attention bias. * Add the replit-code example.	2023-10-17 20:41:37 +01:00
Laurent Mazare	872c3f14b0	Add the MPT model. (#1114 ) * Add the MPT model. * Add ffn and block. * Forward pass for the mpt block. * Repeat-kv.	2023-10-17 16:06:48 +01:00
Laurent Mazare	af67672207	Add support for Puffin-Phi-v2. (#1110 ) * Add support for Puffin-Phi-v2. * Tweak the file name. * Support the config for puffin-phi-v2. * Update the readme.	2023-10-16 20:54:21 +01:00
Laurent Mazare	89b525b5e7	Convmixer (#1073 ) * Only optimize float tensors. * Use full tensors for zeros and ones. * Add a benchmark for the matmul slowness. * Add the convmixer model. * Proper adaptive pooling.	2023-10-11 18:24:32 +01:00
Laurent Mazare	bc3351bce4	Tracing for StableLM and quantized StableLM. (#1068 )	2023-10-10 08:09:25 +02:00
Laurent Mazare	392fe02fba	Move the common quantized-nn code to a shared module. (#1063 )	2023-10-09 06:22:22 +01:00
Laurent Mazare	59ab6d7832	Quantized version of StableLM. (#1058 ) * Quantized version of StableLM. * Adapt the stable-lm example to support quantized. * Use some separate hub repo. * Another repo name tweak.	2023-10-08 15:42:38 +01:00
Laurent Mazare	783735cf22	Use softmax-last-dim where possible. (#1057 )	2023-10-08 13:16:42 +01:00
Laurent Mazare	2e5fb0b251	Do not use the kv-cache on external key-value states. (#1054 )	2023-10-07 22:37:19 +01:00
Laurent Mazare	823fe23f9b	Add flash-attn support for stable-lm. (#1052 )	2023-10-07 21:12:54 +01:00
Laurent Mazare	aa53368aeb	Better control on the optional dequantization in QMatMul (#1049 ) * Cosmetic change to the quantized whisper model. * Fix the dequantization. * Add the dequantize all variable.	2023-10-07 10:16:18 +01:00
Laurent Mazare	d5f7267087	Add the stable-lm example. (#1046 ) * Add the stable-lm example. * Get stable-lm to generate some proper text.	2023-10-06 19:20:35 +01:00
Laurent Mazare	b0442eff8a	Sketch the stable-lm model. (#1045 )	2023-10-06 18:19:06 +01:00
Laurent Mazare	4631c48273	Remove some todos. (#1042 )	2023-10-05 22:42:20 +01:00
Juarez Bochi	f47bd9bab5	Delete invalid comment (#1038 )	2023-10-05 19:28:08 +01:00
Laurent Mazare	089fc3b584	Improve the quantized whisper setup. (#1018 ) * Improve the quantized whisper setup. * Fix the config file paths. * Use the standard matmul where possible.	2023-10-02 17:17:46 +01:00
Laurent Mazare	e04c789230	Add a quantized variant of whisper (#1017 ) * Add the quantized-whisper model. * Quantized the whisper model. * Adapt the whisper example to handle quantization. * Add the quantized flag. * Load the proper weights.	2023-10-02 14:59:53 +01:00
Laurent Mazare	096dee7073	Bump the version to 0.3.0. (#1014 ) * Bump the version to 0.3.0. * Changelog update.	2023-10-01 13:51:57 +01:00
Laurent Mazare	deee7612da	Quantized version of mistral. (#1009 ) * Quantized version of mistral. * Integrate the quantized mistral variant. * Use the quantized weight files. * Tweak the quantization command. * Fix the dtype when computing the rotary embeddings. * Update the readme with the quantized version. * Fix the decoding of the remaining tokens.	2023-09-30 18:25:47 +01:00
Laurent Mazare	4021272875	Use flash-attn for mistral. (#1004 )	2023-09-30 12:15:10 +01:00
Laurent Mazare	6203ced495	Add negative prompts to segment-anything. (#1000 )	2023-09-30 06:17:42 +01:00
Laurent Mazare	d188d6a764	Fix the multiple points case for sam. (#998 )	2023-09-29 22:39:43 +02:00
Laurent Mazare	53510ce427	Use a silu activation in mistral. (#991 )	2023-09-29 07:06:54 +01:00
Laurent Mazare	23b3576c47	Add the sliding window. (#986 )	2023-09-28 17:26:33 +01:00
Laurent Mazare	716ab2ccdc	Mistral gpu fix (#985 ) * Add the mistral example. * Use the two model files. * Adjust the dtype. * Tweak the weight paths. * Remove the end of text token. * Get the mistral model to generate some text. * Fix when running on the gpu. * More gpu fixes.	2023-09-28 16:38:13 +01:00
Laurent Mazare	ada8851a23	Add the mistral example. (#984 ) * Add the mistral example. * Use the two model files. * Adjust the dtype. * Tweak the weight paths. * Remove the end of text token. * Get the mistral model to generate some text.	2023-09-28 16:19:18 +01:00
Laurent Mazare	c05a348e36	Add the Mistral 7b model (#983 ) * Start sketching the mistral 7b model. * Add the kv cache. * Add the decoder layer. * Add the mistral model. * Rotary embeddings. * Add the attention mask.	2023-09-28 14:29:41 +01:00
Laurent Mazare	ce0a4e3a85	Use the gelu-erf activation. (#969 )	2023-09-26 22:30:21 +01:00
Laurent Mazare	1fcac4afed	Expose a function to clear the KV cache on mixformers. (#964 )	2023-09-26 05:41:07 +01:00
Laurent Mazare	a36d883254	Use a single flag for the point argument. (#958 )	2023-09-25 12:53:24 +01:00
GeauxEric	7f2bbcf746	[segment-anything] Support multi-point as the prompt input (#945 ) * [sam] Support multi-point prompts * [segment-anything] Pass points by reference * [segment-anything] Update example code and image * Fix clippy lint. --------- Co-authored-by: Yun Ding <yunding@nvidia.com> Co-authored-by: laurent <laurent.mazare@gmail.com>	2023-09-25 12:14:10 +01:00
Laurent Mazare	0007ae9c11	Add the quantized mixformer model. (#953 ) * Add the quantized mixformer model. * Add the quantized option in the phi example.	2023-09-24 15:03:48 +01:00
Laurent Mazare	e15862cfdb	Shared the quantized var-builder code. (#952 ) * Shared the quantized var-builder code. * Fix compilation.	2023-09-24 12:55:07 +01:00


123 Commits