candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-17 19:18:50 +00:00

Author	SHA1	Message	Date
Laurent Mazare	ef33df7ae2	No need for the even constraint on vecdot-q40-q80. (#1202)	2023-10-28 07:23:59 +01:00
Laurent Mazare	e2826e70b3	Add a quantized variant of llama2.c (#1197) * Add a quantized variant of llama2.c * Clippy fixes.	2023-10-27 15:34:06 +01:00
Laurent Mazare	7670fe7d1f	neon optimized q8k multiplication. (#1021) * neon optimized q8k multiplication. * Bugfixes. * simdification.	2023-10-02 23:26:34 +01:00
Laurent Mazare	a1a5ab8b0a	Neon optimized vecdot (#666) * Q5k vecdot. * Add the q3k vecdot. * Q2k vecdot. * Move the quantized model to its own file.	2023-08-29 22:28:46 +01:00
Laurent Mazare	1da71a5da1	Neon optimized version of the q4k vecdot product. (#632)	2023-08-27 21:30:47 +01:00
Laurent Mazare	9c8d6dbc2a	Neon intrinsics for the q8_0 vecdot. (#604) * Neon intrinsics for the q8_0 vecdot. * Get the tests to run with accelerate (with some numerical error failures).	2023-08-25 14:42:18 +01:00
Laurent Mazare	82410995a2	Neon support for quantization. (#519) * Skeleton files for neon support of quantization. * SIMD version for q4 vecdot. * Also simdify the q6k multiplication.	2023-08-19 22:07:29 +01:00