Commit Graph

2330 Commits

Author SHA1 Message Date
fc83d97b41 Only support the contiguous case for cublas matmul. 2023-06-22 21:39:37 +01:00
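The contiguous-only cublas matmul above implies a layout check before dispatching to gemm: cublas expects densely packed buffers, so non-contiguous views must be rejected (or copied) first. A minimal CPU-side sketch of that check plus a naive matmul standing in for the gemm call; all names here are illustrative, not candle's actual API:

```rust
// Hypothetical helper: a tensor is contiguous when its strides match the
// row-major strides derived from its shape (innermost stride 1, each outer
// stride the product of the inner dims).
fn is_contiguous(shape: &[usize], strides: &[usize]) -> bool {
    let mut expected = 1;
    for (dim, stride) in shape.iter().zip(strides.iter()).rev() {
        if *stride != expected {
            return false;
        }
        expected *= dim;
    }
    true
}

// Naive row-major matmul over contiguous buffers, standing in for the
// cublas gemm call the real backend would make on the GPU.
fn matmul(a: &[f32], b: &[f32], m: usize, k: usize, n: usize) -> Vec<f32> {
    let mut c = vec![0f32; m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0f32;
            for p in 0..k {
                acc += a[i * k + p] * b[p * n + j];
            }
            c[i * n + j] = acc;
        }
    }
    c
}

fn main() {
    assert!(is_contiguous(&[2, 3], &[3, 1]));
    assert!(!is_contiguous(&[2, 3], &[1, 2])); // transposed view: not contiguous
    let a = [1f32, 2., 3., 4., 5., 6.]; // 2x3
    let b = [1f32, 0., 0., 1., 1., 1.]; // 3x2
    println!("{:?}", matmul(&a, &b, 2, 3, 2));
}
```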
2231c717d5 Fix the matmul example. 2023-06-22 21:11:41 +01:00
6463d661d8 Tweaks. 2023-06-22 20:25:14 +01:00
aebffcfc13 Add a matmul cuda example. 2023-06-22 19:44:26 +01:00
0671b8c369 Improve the gemm config. 2023-06-22 19:24:02 +01:00
cc78900922 Start adding the cublas based matmul. 2023-06-22 18:45:10 +01:00
683730c21d Add the cublas handle to the cuda device. 2023-06-22 18:03:53 +01:00
7d9a8ff3f9 Do not ignore errors when cloning the storage. 2023-06-22 16:29:18 +01:00
2f7a072250 Rename as_slice to storage_data and implement the cuda version. 2023-06-22 16:00:22 +01:00
065b7a19c7 Stride support for unary ops. 2023-06-22 15:46:34 +01:00
5b1ab5b687 Support strides in affine. 2023-06-22 15:38:42 +01:00
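The two stride-support commits above (unary ops and affine) both amount to the same traversal: decompose each logical element index into per-dimension coordinates, then map those through the strides to a storage offset. A hedged sketch of that traversal for an affine op (`x * mul + add`); names and signatures are illustrative, not candle's:

```rust
// Sketch of strided traversal for an affine op, assuming row-major logical
// iteration order. Each logical index is decomposed dimension by dimension
// (innermost first) and mapped through the strides to a storage offset.
fn affine_strided(
    src: &[f32],
    shape: &[usize],
    strides: &[usize],
    mul: f32,
    add: f32,
) -> Vec<f32> {
    let len: usize = shape.iter().product();
    let mut out = Vec::with_capacity(len);
    for i in 0..len {
        let mut rem = i;
        let mut offset = 0;
        for (dim, stride) in shape.iter().zip(strides.iter()).rev() {
            offset += (rem % dim) * stride;
            rem /= dim;
        }
        out.push(src[offset] * mul + add);
    }
    out
}

fn main() {
    // A 2x2 view with transposed strides over the buffer [1, 2, 3, 4]:
    // logical order becomes [1, 3, 2, 4], then each element is x * 2 + 1.
    let src = [1f32, 2., 3., 4.];
    println!("{:?}", affine_strided(&src, &[2, 2], &[1, 2], 2.0, 1.0));
}
```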
836ad5f76c Remove one level of indirection for the binary and unary ops. 2023-06-22 15:20:51 +01:00
5276755fb3 Add cuda support for unary ops. 2023-06-22 15:12:59 +01:00
b8f514d9c6 Add more binary kernels. 2023-06-22 14:07:02 +01:00
97fe1fac85 Add a makefile for cleaning the kernels code. 2023-06-22 13:57:51 +01:00
e1eb86db61 Add some first binary op (add). 2023-06-22 13:52:02 +01:00
83d6198009 Simplify the binary kernels. 2023-06-22 13:16:03 +01:00
4b1c3405e9 Add a couple cuda kernels from dfdx. 2023-06-22 12:56:29 +01:00
625e08d6ab Abstract the implementation of Shape. 2023-06-22 12:39:15 +01:00
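"Abstract the implementation of Shape" suggests hiding the dimension list behind a trait so that scalars, tuples, and vectors can all describe a shape. One way such an abstraction could look, purely as a sketch (the trait and impls here are hypothetical, not candle's actual design):

```rust
// Hypothetical abstraction: anything that can produce a dimension list
// counts as a shape, with element count derived from it.
trait ShapeLike {
    fn dims(&self) -> Vec<usize>;
    fn elem_count(&self) -> usize {
        self.dims().iter().product()
    }
}

// A bare usize describes a 1-D shape.
impl ShapeLike for usize {
    fn dims(&self) -> Vec<usize> {
        vec![*self]
    }
}

// A pair describes a 2-D shape.
impl ShapeLike for (usize, usize) {
    fn dims(&self) -> Vec<usize> {
        vec![self.0, self.1]
    }
}

fn main() {
    println!("{} {}", 6usize.elem_count(), (2usize, 3usize).elem_count());
}
```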
f052020ba2 Support cuda in to_vec3. 2023-06-22 12:22:51 +01:00
0689d62548 Merge pull request #2 from LaurentMazare/matmul: Adding matmul. 2023-06-22 13:18:57 +02:00

77712d4348 Addressing comments. 2023-06-22 13:13:35 +02:00
449af49b54 Adding size checking when creating a tensor from buffer + shape. 2023-06-22 13:08:57 +02:00
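The size check described in the commit above is straightforward: constructing a tensor from a raw buffer plus a shape should fail when the buffer length disagrees with the shape's element count. A minimal sketch; the error type and constructor name are illustrative, not candle's actual API:

```rust
// Hypothetical error type for the buffer/shape mismatch case.
#[derive(Debug, PartialEq)]
enum TensorError {
    ShapeMismatch { buffer_len: usize, shape_elems: usize },
}

// Sketch of a checked constructor: accept the buffer only if its length
// matches the number of elements the shape implies.
fn from_buffer(data: Vec<f32>, shape: &[usize]) -> Result<(Vec<f32>, Vec<usize>), TensorError> {
    let shape_elems: usize = shape.iter().product();
    if data.len() != shape_elems {
        return Err(TensorError::ShapeMismatch {
            buffer_len: data.len(),
            shape_elems,
        });
    }
    Ok((data, shape.to_vec()))
}

fn main() {
    assert!(from_buffer(vec![0.; 6], &[2, 3]).is_ok());
    assert!(from_buffer(vec![0.; 5], &[2, 3]).is_err());
    println!("size checks ok");
}
```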
a8b6c848e0 Final updates. 2023-06-22 12:39:33 +02:00
04cf14f35a Moving to gemm and adding matmul backprop (tentative `T` operator). 2023-06-22 12:37:02 +02:00
9ea220fc6e Fixing tokenizers dep. 2023-06-22 12:25:58 +02:00
86e4cbbc3d Adding matmul 2023-06-22 12:25:58 +02:00
ce977b489e Adding matmul? 2023-06-22 12:25:58 +02:00
87a37b3bf3 Retrieve data from the gpu. 2023-06-22 11:01:49 +01:00
083ced4428 Integrate the kernels bits. 2023-06-22 09:59:00 +01:00
1309932933 Polish a bit the kernel loading. 2023-06-22 09:16:43 +01:00
b5f7553b18 Deactivate nightly CI as it's flaky at the moment. 2023-06-22 08:36:33 +01:00
0a758ffa05 Add the fill kernel and use it for 'ones'. 2023-06-22 08:33:32 +01:00
fc26bab3ed Add some specific errors rather than panicking. 2023-06-22 07:51:53 +01:00
db35b31050 Merge pull request #3 from LaurentMazare/cuda: Add Cuda support. 2023-06-21 21:37:54 +01:00
7c46de9584 Check that the tensor is contiguous before applying the kernel. 2023-06-21 21:28:59 +01:00
9834151254 Small improvement to the cuda panic. 2023-06-21 21:25:51 +01:00
304a557d84 Add a dummy module. 2023-06-21 21:16:00 +01:00
97d9142dee Add a first kernel. 2023-06-21 20:48:22 +01:00
fcb4e6b84f Use a reference for the device. 2023-06-21 19:55:57 +01:00
deb6091099 Use a type alias for cuda errors. 2023-06-21 19:50:00 +01:00
71735c7a02 Move the data between the host and the device. 2023-06-21 19:43:25 +01:00
c654ecdb16 Add a specific example for cuda. 2023-06-21 18:56:04 +01:00
2bfe8f18ab Start adding support for cuda. 2023-06-21 18:11:56 +01:00
7c317f9611 cuda is not available on the CI so deactivate it. 2023-06-21 14:50:52 +01:00
7adffafeda Abstract the gradient storage. 2023-06-21 14:29:48 +01:00
68f525f321 Move more bits to the backend part. 2023-06-21 10:34:51 +01:00
eb52b9b343 Move the cpu backend specific bits apart. 2023-06-21 10:25:56 +01:00
b3eb57cd0a Avoid some duplication using a macro + add some basic example to make debugging easier. 2023-06-21 10:08:41 +01:00
8cde0c5478 Add some skeleton code for GPU support. 2023-06-21 09:13:57 +01:00