bc3be6f9b0
Add the elu cuda kernel. (#114)
2023-07-10 07:57:01 +01:00
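As a reference point, an elementwise ELU kernel can be as small as the sketch below; the standalone f32-only function and its name are illustrative assumptions, since the repo's kernels are likely macro-generated and cover more dtypes.

```cuda
#include <cstddef>

// ELU: x if x > 0, alpha * (exp(x) - 1) otherwise.
// Hypothetical standalone f32 kernel with a grid-stride loop so any launch
// configuration covers all elements.
extern "C" __global__ void elu_f32(const size_t numel, const float alpha,
                                   const float *inp, float *out) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        const float x = inp[i];
        out[i] = x > 0.f ? x : alpha * (expf(x) - 1.f);
    }
}
```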
c187f347bf
Make it easier to use whisper samples from the repo. (#112)
* Make it easier to use samples from the repo.
* Use f32 for accumulation in the f16/bf16 kernels.
2023-07-08 18:48:27 +01:00
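The f32-accumulation change above is the usual precision fix for half-precision reductions: load as f16/bf16, sum in f32, convert only at the end. A minimal sketch under assumed names, with an atomic fallback for combining per-thread partials:

```cuda
#include <cuda_fp16.h>
#include <cstddef>

// Elements are loaded as __half but summed in a float accumulator, so the
// result does not suffer from chained half-precision rounding. The kernel
// name, signature, and the f32 scratch output are illustrative assumptions.
extern "C" __global__ void sum_f16_acc_f32(const size_t numel,
                                           const __half *inp,
                                           float *partial /* f32 scratch */) {
    float acc = 0.f;
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        acc += __half2float(inp[i]);
    }
    // For brevity partials are combined with an atomic; a real kernel would
    // reduce within the block first.
    atomicAdd(partial, acc);
}
```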
eb64ad0d4d
Cuda kernel for the conv1d op (#111)
* Boilerplate code for conv1d.
* Boilerplate code for conv1d.
* More boilerplate for conv1d.
* Conv1d work.
* Get the conv1d cuda kernel to work.
* Conv1d support when no batch dim.
2023-07-08 18:13:25 +01:00
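For orientation, a naive conv1d kernel maps one thread to one output element and loops over input channels and kernel taps. The sketch below assumes a (batch, c_in, l_in) layout, stride 1, zero padding, and f32; the name and the lack of stride/dilation handling are simplifications, not the repo's actual kernel.

```cuda
#include <cstddef>

// Naive 1D convolution: input (batch, c_in, l_in), kernel (c_out, c_in, k),
// stride 1 with zero padding, one thread per output element.
extern "C" __global__ void conv1d_f32(const size_t batch, const size_t c_in,
                                      const size_t l_in, const size_t c_out,
                                      const size_t k, const size_t padding,
                                      const float *inp, const float *kernel,
                                      float *out) {
    const size_t l_out = l_in + 2 * padding - k + 1;
    const size_t numel = batch * c_out * l_out;
    for (size_t idx = blockIdx.x * blockDim.x + threadIdx.x; idx < numel;
         idx += gridDim.x * blockDim.x) {
        const size_t pos = idx % l_out;           // output position
        const size_t oc = (idx / l_out) % c_out;  // output channel
        const size_t b = idx / (l_out * c_out);   // batch index
        float acc = 0.f;
        for (size_t ic = 0; ic < c_in; ++ic) {
            for (size_t j = 0; j < k; ++j) {
                // Input position relative to the unpadded signal.
                const long src = (long)(pos + j) - (long)padding;
                if (src < 0 || src >= (long)l_in) continue; // zero padding
                acc += inp[(b * c_in + ic) * l_in + src] *
                       kernel[(oc * c_in + ic) * k + j];
            }
        }
        out[idx] = acc;
    }
}
```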
e676f85f00
Sketch a fast cuda kernel for reduce-sum. (#109)
* Sketch a fast cuda kernel for reduce-sum.
* Sketch the rust support code for the fast sum kernel.
* More work on the fast kernel.
* Add some testing ground.
* A couple of fixes for the fast sum kernel.
2023-07-08 12:43:56 +01:00
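A common shape for such a kernel is a grid-stride pass followed by a shared-memory tree reduction, with one partial result written per block. The sketch below assumes that structure and a fixed block size; the PR's kernel may differ (warp shuffles, strided layouts, multiple reduced dims).

```cuda
#include <cstddef>

// Block-level reduce-sum sketch: each thread accumulates a grid-stride
// partial, the block combines partials in shared memory, and thread 0 writes
// one value per block. Assumes the kernel is launched with exactly
// BLOCK_SIZE threads per block (a power of two).
#define BLOCK_SIZE 256

extern "C" __global__ void fast_sum_f32(const size_t numel, const float *inp,
                                        float *out /* one slot per block */) {
    __shared__ float shr[BLOCK_SIZE];
    float acc = 0.f;
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        acc += inp[i];
    }
    shr[threadIdx.x] = acc;
    __syncthreads();
    // Tree reduction over the shared buffer.
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) shr[threadIdx.x] += shr[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) out[blockIdx.x] = shr[0];
}
```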
c71a38deb7
Tweak the include order so that math.h comes first. (#100)
2023-07-07 06:47:25 +01:00
f114394456
Include the math.h file to get access to constants. (#99)
2023-07-07 06:42:57 +01:00
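The two math.h commits above are about making the standard math constants (e.g. M_PI) visible to the kernel sources; a minimal illustration of the idea, with the kernel itself being a made-up example rather than code from the repo:

```cuda
// Including math.h before the other headers makes constants such as M_PI
// available to the device code below.
#include <math.h>
#include <cuda_fp16.h>

// Toy kernel assumed to be launched with one thread per element.
extern "C" __global__ void scale_by_pi(const float *inp, float *out) {
    out[threadIdx.x] = inp[threadIdx.x] * (float)M_PI;
}
```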
9784d1ed9f
Minor tweaks.
2023-07-03 18:31:55 +01:00
313fa022a5
Bugfix: remove the u8/bf16 conversion kernel as it is ambiguous.
2023-06-30 10:43:32 +01:00
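A hypothetical illustration of the issue: converting a u8 straight to __nv_bfloat16 can match more than one implicit conversion path, and one way around that is to cast through f32 explicitly. This is a sketch of that workaround idea, not the kernel that was removed.

```cuda
#include <cuda_bf16.h>
#include <cstdint>
#include <cstddef>

// Hypothetical u8 -> bf16 cast kernel that routes the conversion through
// float instead of relying on an implicit __nv_bfloat16 construction.
extern "C" __global__ void cast_u8_bf16(const size_t numel, const uint8_t *inp,
                                        __nv_bfloat16 *out) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        out[i] = __float2bfloat16(static_cast<float>(inp[i]));
    }
}
```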
8ad47907f3
Add the kernels.
2023-06-30 10:26:56 +01:00
6486a6d7b2
Avoid some cast kernels.
2023-06-29 23:23:44 +01:00
ec79fc43f2
Add the bf16 cuda kernels.
2023-06-29 23:12:02 +01:00
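The bf16 kernels generally widen to f32 for the arithmetic and narrow back on store; a minimal sketch of one such elementwise op, with the name and the choice of addition as placeholders for the macro-generated family:

```cuda
#include <cuda_bf16.h>
#include <cstddef>

// Elementwise bf16 add: operands are widened to f32 for the arithmetic and
// narrowed back to bf16 on store.
extern "C" __global__ void badd_bf16(const size_t numel,
                                     const __nv_bfloat16 *lhs,
                                     const __nv_bfloat16 *rhs,
                                     __nv_bfloat16 *out) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        const float a = __bfloat162float(lhs[i]);
        const float b = __bfloat162float(rhs[i]);
        out[i] = __float2bfloat16(a + b);
    }
}
```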
1ce3843cab
Add the relu op.
2023-06-28 09:38:54 +01:00
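Whether this commit touches the kernels or only the op on the Rust side is not clear from the message; for context, the kernel-side view of relu is just an elementwise max with zero, as in this f32-only sketch with an assumed name:

```cuda
#include <cstddef>

// ReLU: out = max(x, 0), one grid-stride loop over all elements.
extern "C" __global__ void relu_f32(const size_t numel, const float *inp,
                                    float *out) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        out[i] = fmaxf(inp[i], 0.f);
    }
}
```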
380d61e990
Fix two cuda bugs (matmul and where_cond).
2023-06-27 11:31:04 +01:00
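The message does not say what the where_cond bug was; as context, the op's semantics are an elementwise select driven by a mask, as in this contiguous-f32 sketch with assumed names and a u8 condition:

```cuda
#include <cstdint>
#include <cstddef>

// where_cond: out[i] = cond[i] ? on_true[i] : on_false[i].
// Sketch of the op's semantics only, not the fixed kernel from the commit.
extern "C" __global__ void where_cond_f32(const size_t numel,
                                          const uint8_t *cond,
                                          const float *on_true,
                                          const float *on_false, float *out) {
    for (size_t i = blockIdx.x * blockDim.x + threadIdx.x; i < numel;
         i += gridDim.x * blockDim.x) {
        out[i] = cond[i] ? on_true[i] : on_false[i];
    }
}
```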
d7f729fb8f
Refactor the hierarchy.
2023-06-27 11:57:27 +02:00