Commit Graph

2339 Commits

SHA1 Message Date
0988706c88 Support wider shapes for llama. 2023-06-24 20:08:18 +01:00
6b2cd9c51c Add the broadcast operator. 2023-06-24 19:16:03 +01:00
96c098b6cd Remove the unnecessary features. 2023-06-24 18:15:44 +01:00
a7f80e258f Read and write npy files. 2023-06-24 18:12:10 +01:00
a6ca9baf3c Backprop for narrow. 2023-06-24 15:17:57 +01:00
fbbf3951dd More narrow testing. 2023-06-24 15:10:31 +01:00
0f34738831 Fix the cpu implementation for narrow. 2023-06-24 15:01:32 +01:00
1b5f892d73 Add a currently wrong test for narrow. 2023-06-24 08:50:37 +01:00
d6cb4f1c53 Add the source offset when copying the data around. 2023-06-24 08:35:49 +01:00
4db972781f Handle copying for the u32 type. 2023-06-24 08:24:06 +01:00
dd657397b2 Skeleton implementation for the narrow method and op. 2023-06-24 08:17:35 +01:00
3deacba5f9 Reshape can now return a view. 2023-06-24 07:14:09 +01:00
47f9c48e7c Avoid duplicating the storage by refcounting it. 2023-06-24 07:03:21 +01:00
b4653e41be Helper function to build 3d arrays. 2023-06-24 06:29:06 +01:00
ae5dc5fbc6 Softmax tests + fix. 2023-06-23 22:46:36 +01:00
d0a91db8fd Softmax cpu implementation. 2023-06-23 22:26:53 +01:00
8443963d4f Skeleton implementation for softmax. 2023-06-23 22:00:13 +01:00
5d44e76e3f Add the casting operation. 2023-06-23 21:22:07 +01:00
8ed350dc94 Add a couple unary ops. 2023-06-23 20:19:20 +01:00
fe75a01188 Cleanup the tensor creation code. 2023-06-23 19:52:21 +01:00
88187b784b Also optimize the contiguous case for the binary cuda kernels. 2023-06-23 19:04:13 +01:00
5ca309ecb0 Optimize the unary cuda kernels for the contiguous case. 2023-06-23 18:40:15 +01:00
4f9f14a06b Optimize the cpu backend for the contiguous cases. 2023-06-23 18:08:55 +01:00
132859df75 Add some transpose tests. 2023-06-23 17:49:53 +01:00
691f7d8e0f Cosmetic fix. 2023-06-23 16:43:45 +01:00
69f91b36f9 More backprop support for broadcasting ops. 2023-06-23 16:35:10 +01:00
d839d5d9fd Basic support for broadcasting backprop. 2023-06-23 16:31:44 +01:00
1936a1f0a3 Bugfix for the strided copy + add some assertions. 2023-06-23 16:28:18 +01:00
bcfbb1dca1 More efficient CPU broadcasting implementation. 2023-06-23 16:23:12 +01:00
10a5807dff Broadcast cpu implementation. 2023-06-23 16:16:52 +01:00
83e75b3af8 Optimize for the unstrided case. 2023-06-23 15:49:11 +01:00
4c8931d2e4 More u32 support. 2023-06-23 14:54:03 +01:00
08394f7924 Binary op for u32. 2023-06-23 14:50:52 +01:00
92da45879c Dummy broadcast placeholder functions. 2023-06-23 14:07:05 +01:00
f8848db001 Fix the gelu kernel for f16. 2023-06-23 13:38:54 +01:00
db5526d51a Merge pull request #8 from LaurentMazare/fix_cuda: Backport. 2023-06-23 14:27:01 +02:00
8add5a5f49 Backport. 2023-06-23 14:17:39 +02:00
7c1625f6a5 Merge pull request #6 from LaurentMazare/add_embedding: Adding embedding op (not generic gather, no select). 2023-06-23 13:49:13 +02:00
2fb87edda5 Address comments. 2023-06-23 13:43:18 +02:00
52c503ba8f Handle the contiguous case in an optimized way when copying cpu memory. 2023-06-23 12:20:16 +01:00
d4054ab500 Merge pull request #5 from LaurentMazare/add_gelu: Creating Gelu op (no backward). 2023-06-23 13:17:37 +02:00
96289bce08 Rebase. 2023-06-23 13:17:21 +02:00
5e54f37fe1 Adding embedding op (not generic gather, no select). 2023-06-23 13:13:26 +02:00
09b7731b8d Fix unary op. 2023-06-23 13:10:26 +02:00
56ae71dd4c Address comments. 2023-06-23 13:08:04 +02:00
fd21c708ab Creating Gelu op (no backward). 2023-06-23 13:07:39 +02:00
4ffdeb4e23 Optimize for the contiguous case. 2023-06-23 11:23:49 +01:00
1a90f9d3a6 Cuda implementation for copying data around. 2023-06-23 11:18:29 +01:00
79e4b29c2f Add the reshape method and operation (without grad for now). 2023-06-23 10:51:05 +01:00
c4c6167949 Add the contiguous method. 2023-06-23 10:45:20 +01:00
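
Several of the commits above revolve around one idea: tensors as strided views over refcounted storage, so that operations like reshape and narrow can return a view instead of copying ("Avoid duplicating the storage by refcounting it", "Reshape can now return a view", "Add the source offset when copying the data around"). The following is a minimal self-contained sketch of that idea in plain Rust; the type and method names are hypothetical illustrations, not candle's actual implementation:

```rust
use std::rc::Rc;

/// Hypothetical minimal tensor: shape + strides + start offset over
/// refcounted storage. Cloning a view never duplicates the buffer.
#[derive(Clone)]
struct Tensor {
    storage: Rc<Vec<f32>>, // shared, refcounted buffer
    shape: Vec<usize>,
    stride: Vec<usize>,    // row-major strides, in elements
    offset: usize,         // start offset into the storage
}

impl Tensor {
    fn new(data: Vec<f32>, shape: Vec<usize>) -> Self {
        // Row-major (C) strides: the last dimension is contiguous.
        let mut stride = vec![1; shape.len()];
        for i in (0..shape.len().saturating_sub(1)).rev() {
            stride[i] = stride[i + 1] * shape[i + 1];
        }
        Tensor { storage: Rc::new(data), shape, stride, offset: 0 }
    }

    /// Restrict dimension `dim` to `[start, start + len)` without copying:
    /// only the shape and the start offset change, the storage is shared.
    fn narrow(&self, dim: usize, start: usize, len: usize) -> Self {
        assert!(start + len <= self.shape[dim], "narrow out of bounds");
        let mut view = self.clone(); // Rc clone: cheap, no data copy
        view.shape[dim] = len;
        view.offset += start * self.stride[dim];
        view
    }

    fn get(&self, idx: &[usize]) -> f32 {
        let pos: usize = idx.iter().zip(&self.stride).map(|(i, s)| i * s).sum();
        self.storage[self.offset + pos]
    }
}

fn main() {
    // A 3x4 matrix holding 0..12; narrow rows to [1, 3) -> a 2x4 view.
    let t = Tensor::new((0..12).map(|v| v as f32).collect(), vec![3, 4]);
    let n = t.narrow(0, 1, 2);
    assert_eq!(n.shape, vec![2, 4]);
    assert_eq!(n.get(&[0, 0]), 4.0); // row 0 of the view is row 1 of t
    assert_eq!(n.get(&[1, 3]), 11.0);
    println!("narrowed view shape: {:?}", n.shape);
}
```

Because a view is just (shape, stride, offset) over shared storage, copying data out of it later has to honor the source offset, which is exactly the bug class the "Add the source offset when copying the data around" commit addresses.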