Commit Graph

1142 Commits

SHA1 Message Date
1a90f9d3a6 Cuda implementation for copying data around. 2023-06-23 11:18:29 +01:00
79e4b29c2f Add the reshape method and operation (without grad for now). 2023-06-23 10:51:05 +01:00
c4c6167949 Add the continuous method. 2023-06-23 10:45:20 +01:00
4712dcc2f6 Actually copy the data around in cat (cpu only). 2023-06-23 10:24:02 +01:00
6110db31c9 Add the cat operator (without the storage implementation for now). 2023-06-23 10:13:37 +01:00
bf9e1d1c23 Add the detach method. 2023-06-23 09:19:23 +01:00
3e7cb18d7f Handle tensor transfers between devices in the backprop. 2023-06-23 08:55:34 +01:00
3f79d81b6f Add transposition around arbitrary axis. 2023-06-23 08:51:13 +01:00
27d428af1a Add the backward pass for transpose. 2023-06-23 08:43:05 +01:00
3b550a56dc Transfer tensors between devices. 2023-06-23 08:35:22 +01:00
fc41ccb5bb Add the copy method. 2023-06-23 08:12:52 +01:00
552276749a Only keep track of the graph when needed. 2023-06-22 22:06:24 +01:00
fc83d97b41 Only support the contiguous case for cublas matmul. 2023-06-22 21:39:37 +01:00
2231c717d5 Fix the matmul example. 2023-06-22 21:11:41 +01:00
6463d661d8 Tweaks. 2023-06-22 20:25:14 +01:00
aebffcfc13 Add a matmul cuda example. 2023-06-22 19:44:26 +01:00
0671b8c369 Improve the gemm config. 2023-06-22 19:24:02 +01:00
cc78900922 Start adding the cublas based matmul. 2023-06-22 18:45:10 +01:00
683730c21d Add the cublas handle to the cuda device. 2023-06-22 18:03:53 +01:00
7d9a8ff3f9 Do not ignore errors when cloning the storage. 2023-06-22 16:29:18 +01:00
2f7a072250 Rename as_slice to storage_data and implement the cuda version. 2023-06-22 16:00:22 +01:00
065b7a19c7 Stride support for unary ops. 2023-06-22 15:46:34 +01:00
5b1ab5b687 Support strides in affine. 2023-06-22 15:38:42 +01:00
836ad5f76c Remove one level of indirection for the binary and unary ops. 2023-06-22 15:20:51 +01:00
5276755fb3 Add cuda support for unary ops. 2023-06-22 15:12:59 +01:00
b8f514d9c6 Add more binary kernels. 2023-06-22 14:07:02 +01:00
97fe1fac85 Add a makefile for cleaning the kernels code. 2023-06-22 13:57:51 +01:00
e1eb86db61 Add some first binary op (add). 2023-06-22 13:52:02 +01:00
83d6198009 Simplify the binary kernels. 2023-06-22 13:16:03 +01:00
4b1c3405e9 Add a couple cuda kernels from dfdx. 2023-06-22 12:56:29 +01:00
625e08d6ab Abstract the implementation of Shape. 2023-06-22 12:39:15 +01:00
f052020ba2 Support cuda in to_vec3. 2023-06-22 12:22:51 +01:00
0689d62548 Merge pull request #2 from LaurentMazare/matmul: Adding matmul. 2023-06-22 13:18:57 +02:00
77712d4348 Addressing comments. 2023-06-22 13:13:35 +02:00
449af49b54 Adding size checking when creating a tensor from buffer + shape. 2023-06-22 13:08:57 +02:00
a8b6c848e0 Final updates. 2023-06-22 12:39:33 +02:00
04cf14f35a Moving to gemm and adding matmul backprop. Tentative `T` operator. 2023-06-22 12:37:02 +02:00
9ea220fc6e Fixing tokenizers dep. 2023-06-22 12:25:58 +02:00
86e4cbbc3d Adding matmul 2023-06-22 12:25:58 +02:00
ce977b489e Adding matmul? 2023-06-22 12:25:58 +02:00
87a37b3bf3 Retrieve data from the gpu. 2023-06-22 11:01:49 +01:00
083ced4428 Integrate the kernels bits. 2023-06-22 09:59:00 +01:00
1309932933 Polish a bit the kernel loading. 2023-06-22 09:16:43 +01:00
b5f7553b18 Deactivate nightly CI as it's flaky at the moment. 2023-06-22 08:36:33 +01:00
0a758ffa05 Add the fill kernel and use it for 'ones'. 2023-06-22 08:33:32 +01:00
fc26bab3ed Add some specific errors rather than panicking. 2023-06-22 07:51:53 +01:00
db35b31050 Merge pull request #3 from LaurentMazare/cuda: Add Cuda support. 2023-06-21 21:37:54 +01:00
7c46de9584 Check that the tensor is contiguous before applying the kernel. 2023-06-21 21:28:59 +01:00
9834151254 Small improvement to the cuda panic. 2023-06-21 21:25:51 +01:00
304a557d84 Add a dummy module. 2023-06-21 21:16:00 +01:00