Laurent Mazare
cd29c7ccd4
More ggml cuda kernels (#1977)
* Add more cuda kernels for quantized matmul.
* Add the vec-dot bits.
* Expose the quantized matmul-vec kernels.
* Also include the quantize-q8-1 kernel.
* Glue code for the q8-1 quantization.
* mm-vec product via q8-1 quantization.
* Add a test.
* Add a mm test.
* Get the test to return some sensible results.
* Also test dmmv.
* Fix the launch params.
* Allow for tweaking the force_dmmv parameter while it's experimental.
2024-04-01 00:15:48 +02:00