Add quantization support for q2k, q3k, q4k and q5k (#524)

* First `q2k` implementation

* First `q4k` and `q5k` implementations

* fix `q2k` and `q5k`

* Some first cleanups

* run `clippy` on tests

* Finally implement `q3k`

* Deactivate `q3k` test on macOS

* Also disable the test on Linux

* Fix floating bits in `q3k` dequantization

* Refactoring pass + reorder quants in file

* `fmt`

* Re-add `src` asserts and redefine `dst`
Lukas Kreussel
2023-08-22 16:04:55 +02:00
committed by GitHub
parent 9bc811a247
commit 352383cbc3
4 changed files with 1111 additions and 456 deletions

File diff suppressed because it is too large.