ce5f8dd129
Check the bounds in the cuda indexing kernels. (#2908)
* Check the bounds in the cuda indexing kernels.
* Another check.
2025-04-18 20:08:17 +02:00
fc1fe5e45b
Support scatter/index_add with i64 indices for f16 (#1915)
2024-03-22 11:51:41 +01:00
8f7973958c
fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0 (#1037)
* fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0
* cargo fmt
2023-10-05 18:46:13 +01:00
5e1c595e00
Optimize the index-select cuda kernel. (#976)
2023-09-28 09:05:29 +01:00
9a5c7db91a
Add support for i64 (#563)
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
4b3bd79fbd
Remove the embedding ops in favor of index-select. (#299)
* Remove the embedding ops in favor of index-select.
* Also remove the cuda kernels.
2023-08-02 05:42:11 +01:00
944d70bd9a
Add a test for scatter add. (#238)
* Add a test for scatter add (segfaults on gpus for now).
* Bugfix for the scatter add cuda kernel.
2023-07-25 09:12:14 +01:00
74a6a769dd
Cuda kernels for IndexAdd/ScatterAdd. (#236)
* Skeleton methods for IndexAdd/ScatterAdd.
* Add a Map2InPlace trait.
* Add the glue code for the index-add/scatter-add kernels.
* Tweak the file name: embeddings -> indexing.
* Add the cuda kernel for indexadd.
* And add the scatter-add kernels.
2023-07-24 21:53:08 +01:00