| 4c8931d2e4 | More u32 support. | 2023-06-23 14:54:03 +01:00 |
| 92da45879c | Dummy broadcast placeholder functions. | 2023-06-23 14:07:05 +01:00 |
| 8add5a5f49 | Backport. | 2023-06-23 14:17:39 +02:00 |
| 5e54f37fe1 | Adding embedding op (not generic gather, no select). | 2023-06-23 13:13:26 +02:00 |
| 4ffdeb4e23 | Optimize for the contiguous case. | 2023-06-23 11:23:49 +01:00 |
| 1a90f9d3a6 | Cuda implementation for copying data around. | 2023-06-23 11:18:29 +01:00 |
| 3b550a56dc | Transfer tensors between devices. | 2023-06-23 08:35:22 +01:00 |
| 2231c717d5 | Fix the matmul example. | 2023-06-22 21:11:41 +01:00 |
| 6463d661d8 | Tweaks. | 2023-06-22 20:25:14 +01:00 |
| aebffcfc13 | Add a matmul cuda example. | 2023-06-22 19:44:26 +01:00 |
| 0671b8c369 | Improve the gemm config. | 2023-06-22 19:24:02 +01:00 |
| cc78900922 | Start adding the cublas based matmul. | 2023-06-22 18:45:10 +01:00 |
| 683730c21d | Add the cublas handle to the cuda device. | 2023-06-22 18:03:53 +01:00 |
| 7d9a8ff3f9 | Do not ignore errors when cloning the storage. | 2023-06-22 16:29:18 +01:00 |
| 065b7a19c7 | Stride support for unary ops. | 2023-06-22 15:46:34 +01:00 |
| 5b1ab5b687 | Support strides in affine. | 2023-06-22 15:38:42 +01:00 |
| 836ad5f76c | Remove one level of indirection for the binary and unary ops. | 2023-06-22 15:20:51 +01:00 |
| 5276755fb3 | Add cuda support for unary ops. | 2023-06-22 15:12:59 +01:00 |
| b8f514d9c6 | Add more binary kernels. | 2023-06-22 14:07:02 +01:00 |
| e1eb86db61 | Add some first binary op (add). | 2023-06-22 13:52:02 +01:00 |
| 083ced4428 | Integrate the kernels bits. | 2023-06-22 09:59:00 +01:00 |
| 1309932933 | Polish a bit the kernel loading. | 2023-06-22 09:16:43 +01:00 |
| 0a758ffa05 | Add the fill kernel and use it for 'ones'. | 2023-06-22 08:33:32 +01:00 |
| fc26bab3ed | Add some specific errors rather than panicking. | 2023-06-22 07:51:53 +01:00 |
| 7c46de9584 | Check that the tensor is contiguous before applying the kernel. | 2023-06-21 21:28:59 +01:00 |
| 304a557d84 | Add a dummy module. | 2023-06-21 21:16:00 +01:00 |
| 97d9142dee | Add a first kernel. | 2023-06-21 20:48:22 +01:00 |
| deb6091099 | Use a type alias for cuda errors. | 2023-06-21 19:50:00 +01:00 |
| 71735c7a02 | Move the data between the host and the device. | 2023-06-21 19:43:25 +01:00 |