candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-18 11:37:11 +00:00

Author	SHA1	Message	Date
Laurent Mazare	d9904a3baf	Update to cudarc 0.14 (breaking change). (#2858) * Start updating to cudarc 0.14. * Adapt a couple more things. * And a couple more fixes. * More tweaks. * And a couple more fixes. * Bump the major version number. * Proper module system for the cuda kernels. * Proper ptx loading. * Launch the sort kernel. * Custom op. * Start using the builder pattern. * More builder. * More builder. * Get candle-core to compile. * Get the tests to pass. * Get candle-nn to work too. * Support for custom cuda functions. * cudnn fixes. * Get flash attn to run. * Switch the crate versions to be alpha. * Bump the ug dependency.	2025-04-03 09:12:19 +02:00
Laurent Mazare	e38e2a85dd	Fix a cuda warning. (#2693)	2024-12-31 09:06:10 +01:00
Laurent Mazare	b13a82a438	Separate quantized phi-3 implementation. (#2157) * Separate quantized phi-3 implementation. * Integrate the quantized phi3 model.= * Small fixes, get the generation to work properly. * Keep the old llama implementation around. * Change the default.	2024-05-04 10:14:57 +02:00
Laurent Mazare	805f3be8e1	Add a sort function. (#2134)	2024-04-28 08:18:04 +02:00
Laurent Mazare	96a48e5cc4	Add argsort. (#2132) * Add the argsort cuda kernels. * CPU version of arg-sort. * Hook the cuda kernel + rework the cpu bits. * Add some dedicated test. * Working cuda kernel. * Metal kernel. * Metal adjustments. * Bugfix. * Use the fast rope in qwen. * Rework the expert selection in qwen.	2024-04-27 20:17:35 +02:00