* Separate the prompt stats from the post-prompt ones in the quantized example.
* Slightly nicer output printing.
* Line up with the llama.cpp implementation.
* Print the detected arch options.
* Add the q6k quantization.
* Add a currently broken test.
* Several bugfixes; get the new test passing.