mirror of
https://github.com/huggingface/candle.git
synced 2025-06-18 19:47:12 +00:00
Sketch a simd128 optimized q4k vecdot. (#977)
* Sketch a simd128 optimized q4k vecdot. * Simdify. * More quantization optimizations. * Again more simdification. * Simdify the splitting loop.
This commit is contained in:
@ -1132,6 +1132,9 @@ impl GgmlType for BlockQ4K {
|
||||
#[cfg(target_feature = "neon")]
|
||||
return super::neon::vec_dot_q4k_q8k(n, xs, ys);
|
||||
|
||||
#[cfg(target_feature = "simd128")]
|
||||
return super::simd128::vec_dot_q4k_q8k(n, xs, ys);
|
||||
|
||||
if n % QK_K != 0 {
|
||||
crate::bail!("vec_dot_q4k_q8k: {n} is not divisible by {QK_K}")
|
||||
}
|
||||
|
Reference in New Issue
Block a user