candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-17 19:18:50 +00:00

Author	SHA1	Message	Date
Laurent Mazare	612f5b8156	Make more models cloneable. (#1203)	2023-10-28 07:43:08 +01:00
Laurent Mazare	392fe02fba	Move the common quantized-nn code to a shared module. (#1063)	2023-10-09 06:22:22 +01:00
Laurent Mazare	783735cf22	Use softmax-last-dim where possible. (#1057)	2023-10-08 13:16:42 +01:00
Laurent Mazare	2e5fb0b251	Do not use the kv-cache on external key-value states. (#1054)	2023-10-07 22:37:19 +01:00
Juarez Bochi	f47bd9bab5	Delete invalid comment (#1038)	2023-10-05 19:28:08 +01:00
Laurent Mazare	0007ae9c11	Add the quantized mixformer model. (#953) * Add the quantized mixformer model. * Add the quantized option in the phi example.	2023-09-24 15:03:48 +01:00
Laurent Mazare	e15862cfdb	Shared the quantized var-builder code. (#952) * Shared the quantized var-builder code. * Fix compilation.	2023-09-24 12:55:07 +01:00
Radamés Ajna	19e52e5007	T5 Wasm (#918) * init t5 wasm model * split workers for each model * clean up * add some ui * readme * index * typo * remove cache param, clear_kv_cache * add max_length as param * add model tasks option to ui * add method to load quantized gguf from buffer * Add quantized wasm module * add quantized models to UI, dynamic import wasms * link to quantized * fix copy * fix ModelEncoder * fix README.md	2023-09-22 15:31:10 +01:00
Laurent Mazare	3b557765e8	T5 quantized example (#922) * Load gguf files for the quantized t5. * Add the quantized t5 example. * Allow for loading local files. * Add some support for quantizing safetensor files. * Transpose before quantizing. * Quantized t5. * Retrieve the weights from the hub.	2023-09-21 12:33:15 +01:00
Laurent Mazare	2619c4307f	Add a quantized version of the t5 model. (#921)	2023-09-21 11:13:39 +01:00

10 Commits