| 8ed350dc94 | Add a couple unitary ops. | 2023-06-23 20:19:20 +01:00 |
| 5ca309ecb0 | Optimize the unary cuda kernels for the contiguous case. | 2023-06-23 18:40:15 +01:00 |
| f8848db001 | Fix the gelu kernel for f16. | 2023-06-23 13:38:54 +01:00 |
| 09b7731b8d | Fix unary op. | 2023-06-23 13:10:26 +02:00 |
| 56ae71dd4c | Address comments. | 2023-06-23 13:08:04 +02:00 |
| fd21c708ab | Creating Gelu op (no backward). | 2023-06-23 13:07:39 +02:00 |
| 1a90f9d3a6 | Cuda implementation for copying data around. | 2023-06-23 11:18:29 +01:00 |
| 065b7a19c7 | Stride support for unary ops. | 2023-06-22 15:46:34 +01:00 |
| 5276755fb3 | Add cuda support for unary ops. | 2023-06-22 15:12:59 +01:00 |