Sketch a simd128 optimized q4k vecdot. (#977)

* Sketch a simd128 optimized q4k vecdot.

* Simdify.

* More quantization optimizations.

* Again more simdification.

* Simdify the splitting loop.
Laurent Mazare
2023-09-27 20:19:38 +01:00
committed by GitHub
parent 667f01c173
commit 9cb110c44c
3 changed files with 103 additions and 1 deletions

@@ -1132,6 +1132,9 @@ impl GgmlType for BlockQ4K {
#[cfg(target_feature = "neon")]
return super::neon::vec_dot_q4k_q8k(n, xs, ys);
#[cfg(target_feature = "simd128")]
return super::simd128::vec_dot_q4k_q8k(n, xs, ys);
if n % QK_K != 0 {
crate::bail!("vec_dot_q4k_q8k: {n} is not divisible by {QK_K}")
}
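
The hunk above relies on compile-time dispatch: each `#[cfg(target_feature = "...")]` attribute keeps its `return` statement only when that feature is enabled for the build, so at most one specialised kernel is compiled in, and otherwise control falls through to the scalar path. Below is a minimal sketch of that pattern. The names `BLOCK`, `simd128_kernel::vec_dot`, and the `f32` slices are hypothetical stand-ins, not the real `BlockQ4K`/`BlockQ8K` types or the actual kernel from this commit.

```rust
// Hypothetical stand-in for QK_K; the real constant lives in the quantization module.
const BLOCK: usize = 4;

// Hypothetical stand-in for the `simd128` module. It is only compiled when the
// wasm `simd128` target feature is enabled for the build.
#[cfg(target_feature = "simd128")]
mod simd128_kernel {
    pub fn vec_dot(n: usize, xs: &[f32], ys: &[f32]) -> Result<f32, String> {
        if n % super::BLOCK != 0 {
            return Err(format!("vec_dot: {n} is not divisible by {}", super::BLOCK));
        }
        // A real kernel would use core::arch::wasm32 intrinsics here; this
        // placeholder computes the same result with plain scalar code.
        Ok(xs[..n].iter().zip(&ys[..n]).map(|(x, y)| x * y).sum())
    }
}

// When the feature is enabled, the early `return` makes the rest of the body
// unreachable, hence the lint allowance.
#[allow(unreachable_code)]
pub fn vec_dot(n: usize, xs: &[f32], ys: &[f32]) -> Result<f32, String> {
    // Kept only when compiling with `-C target-feature=+simd128`.
    #[cfg(target_feature = "simd128")]
    return simd128_kernel::vec_dot(n, xs, ys);

    // Scalar fallback, used on every other target.
    if n % BLOCK != 0 {
        return Err(format!("vec_dot: {n} is not divisible by {BLOCK}"));
    }
    Ok(xs[..n].iter().zip(&ys[..n]).map(|(x, y)| x * y).sum())
}
```

Because the selection happens at compile time, there is no runtime feature detection and no branch cost. The trade-off is that the binary must be built with the right target features (for WebAssembly, typically via `RUSTFLAGS="-C target-feature=+simd128"`).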