Commit Graph

1142 Commits

SHA1 Message Date
1a90f9d3a6 Cuda implementation for copying data around. 2023-06-23 11:18:29 +01:00
79e4b29c2f Add the reshape method and operation (without grad for now). 2023-06-23 10:51:05 +01:00
c4c6167949 Add the continuous method. 2023-06-23 10:45:20 +01:00
4712dcc2f6 Actually copy the data around in cat (cpu only). 2023-06-23 10:24:02 +01:00
6110db31c9 Add the cat operator (without the storage implementation for now). 2023-06-23 10:13:37 +01:00
bf9e1d1c23 Add the detach method. 2023-06-23 09:19:23 +01:00
3e7cb18d7f Handle tensor transfers between devices in the backprop. 2023-06-23 08:55:34 +01:00
3f79d81b6f Add transposition around arbitrary axis. 2023-06-23 08:51:13 +01:00
27d428af1a Add the backward pass for transpose. 2023-06-23 08:43:05 +01:00
3b550a56dc Transfer tensors between devices. 2023-06-23 08:35:22 +01:00
fc41ccb5bb Add the copy method. 2023-06-23 08:12:52 +01:00
552276749a Only keep track of the graph when needed. 2023-06-22 22:06:24 +01:00
fc83d97b41 Only support the contiguous case for cublas matmul. 2023-06-22 21:39:37 +01:00
2231c717d5 Fix the matmul example. 2023-06-22 21:11:41 +01:00
6463d661d8 Tweaks. 2023-06-22 20:25:14 +01:00
aebffcfc13 Add a matmul cuda example. 2023-06-22 19:44:26 +01:00
0671b8c369 Improve the gemm config. 2023-06-22 19:24:02 +01:00
cc78900922 Start adding the cublas based matmul. 2023-06-22 18:45:10 +01:00
683730c21d Add the cublas handle to the cuda device. 2023-06-22 18:03:53 +01:00
7d9a8ff3f9 Do not ignore errors when cloning the storage. 2023-06-22 16:29:18 +01:00
2f7a072250 Rename as_slice to storage_data and implement the cuda version. 2023-06-22 16:00:22 +01:00
065b7a19c7 Stride support for unary ops. 2023-06-22 15:46:34 +01:00
5b1ab5b687 Support strides in affine. 2023-06-22 15:38:42 +01:00
836ad5f76c Remove one level of indirection for the binary and unary ops. 2023-06-22 15:20:51 +01:00
5276755fb3 Add cuda support for unary ops. 2023-06-22 15:12:59 +01:00
b8f514d9c6 Add more binary kernels. 2023-06-22 14:07:02 +01:00
97fe1fac85 Add a makefile for cleaning the kernels code. 2023-06-22 13:57:51 +01:00
e1eb86db61 Add some first binary op (add). 2023-06-22 13:52:02 +01:00
83d6198009 Simplify the binary kernels. 2023-06-22 13:16:03 +01:00
4b1c3405e9 Add a couple cuda kernels from dfdx. 2023-06-22 12:56:29 +01:00
625e08d6ab Abstract the implementation of Shape. 2023-06-22 12:39:15 +01:00
f052020ba2 Support cuda in to_vec3. 2023-06-22 12:22:51 +01:00
0689d62548 Merge pull request #2 from LaurentMazare/matmul: Adding matmul. 2023-06-22 13:18:57 +02:00
77712d4348 Addressing comments. 2023-06-22 13:13:35 +02:00
449af49b54 Adding size checking when creating a tensor from buffer + shape. 2023-06-22 13:08:57 +02:00
a8b6c848e0 Final updates. 2023-06-22 12:39:33 +02:00
04cf14f35a Moving to gemm and adding matmul backprop. Tentative `T` operator. 2023-06-22 12:37:02 +02:00
9ea220fc6e Fixing tokenizers dep. 2023-06-22 12:25:58 +02:00
86e4cbbc3d Adding matmul 2023-06-22 12:25:58 +02:00
ce977b489e Adding matmul? 2023-06-22 12:25:58 +02:00
87a37b3bf3 Retrieve data from the gpu. 2023-06-22 11:01:49 +01:00
083ced4428 Integrate the kernels bits. 2023-06-22 09:59:00 +01:00
1309932933 Polish a bit the kernel loading. 2023-06-22 09:16:43 +01:00
b5f7553b18 Deactivate nightly CI as it's flaky at the moment. 2023-06-22 08:36:33 +01:00
0a758ffa05 Add the fill kernel and use it for 'ones'. 2023-06-22 08:33:32 +01:00
fc26bab3ed Add some specific errors rather than panicking. 2023-06-22 07:51:53 +01:00
db35b31050 Merge pull request #3 from LaurentMazare/cuda: Add Cuda support. 2023-06-21 21:37:54 +01:00
7c46de9584 Check that the tensor is contiguous before applying the kernel. 2023-06-21 21:28:59 +01:00
9834151254 Small improvement to the cuda panic. 2023-06-21 21:25:51 +01:00
304a557d84 Add a dummy module. 2023-06-21 21:16:00 +01:00