| Commit | Message | Date |
|---|---|---|
| 1936a1f0a3 | Bugfix for the strided copy + add some assertions. | 2023-06-23 16:28:18 +01:00 |
| bcfbb1dca1 | More efficient CPU broadcasting implementation. | 2023-06-23 16:23:12 +01:00 |
| 10a5807dff | Broadcast cpu implementation. | 2023-06-23 16:16:52 +01:00 |
| 83e75b3af8 | Optimize for the unstrided case. | 2023-06-23 15:49:11 +01:00 |
| 08394f7924 | Binary op for u32. | 2023-06-23 14:50:52 +01:00 |
| 92da45879c | Dummy broadcast placeholder functions. | 2023-06-23 14:07:05 +01:00 |
| 7c1625f6a5 | Merge pull request #6 from LaurentMazare/add_embedding: Adding embedding op (not generic gather, no select). | 2023-06-23 13:49:13 +02:00 |
| 52c503ba8f | Handle the contiguous case in an optimized way when copying cpu memory. | 2023-06-23 12:20:16 +01:00 |
| 96289bce08 | Rebase. | 2023-06-23 13:17:21 +02:00 |
| 5e54f37fe1 | Adding embedding op (not generic gather, no select). | 2023-06-23 13:13:26 +02:00 |
| 4712dcc2f6 | Actually copy the data around in cat (cpu only). | 2023-06-23 10:24:02 +01:00 |
| 3b550a56dc | Transfer tensors between devices. | 2023-06-23 08:35:22 +01:00 |
| 836ad5f76c | Remove one level of indirection for the binary and unary ops. | 2023-06-22 15:20:51 +01:00 |
| a8b6c848e0 | Final updates. | 2023-06-22 12:39:33 +02:00 |
| 04cf14f35a | Moving to gemm and adding matmul backprop. Tentative `T` operator. | 2023-06-22 12:37:02 +02:00 |
| ce977b489e | Adding matmul? | 2023-06-22 12:25:58 +02:00 |
| 68f525f321 | Move more bits to the backend part. | 2023-06-21 10:34:51 +01:00 |
| eb52b9b343 | Move the cpu backend specific bits apart. | 2023-06-21 10:25:56 +01:00 |