Commit Graph

517 Commits

SHA1 Message Date
ed4d0959d3 Add a const to easily tweak the dtype used for llama internal computations. 2023-06-30 15:01:39 +01:00
fbc329ed85 Add the verbose cpu cast operations. 2023-06-30 10:33:29 +01:00
8ad47907f3 Add the kernels. 2023-06-30 10:26:56 +01:00
19cbbc5212 Improve how we check that the dims are in bounds. 2023-06-30 09:11:00 +01:00
f6152e74b6 Tweak the kv-cache flag. 2023-06-29 22:16:40 +01:00
ae3f202f3b Add a flag. 2023-06-29 22:12:15 +01:00
23389b1bd7 Enable the KV cache after fixing the caching length and the rope bits. 2023-06-29 22:00:57 +01:00
b50bd880ce Only narrow when needed + deactivate the kv cache. 2023-06-29 19:07:52 +01:00
3232df9458 Add some KV cache to llama. 2023-06-29 15:29:40 +01:00
889f7e0971 Merge pull request #39 from LaurentMazare/anyhow-backtrace (Add backtraces.) 2023-06-29 13:17:53 +01:00
e27ee98d3f Add backtraces. 2023-06-29 13:17:20 +01:00
78ec40b077 Typo. 2023-06-29 12:09:53 +00:00
de48e6fd59 Putting back main. 2023-06-29 12:08:35 +00:00
0958c588f6 Putting back seed. 2023-06-29 12:07:21 +00:00
c5e8f788be Revert some changes. 2023-06-29 12:05:53 +00:00
e63ed6aaa3 Remove unwrap. 2023-06-29 12:04:25 +00:00
2fe1d3e36d Moving llama to f16. 2023-06-29 12:00:16 +00:00
b4dc9f6108 Add a seed parameter to llama. 2023-06-29 12:47:19 +01:00
53628db3a9 Merge pull request #36 from LaurentMazare/fix_example (Simple example fix.) 2023-06-29 13:36:05 +02:00
1913512f42 Simple example fix. 2023-06-29 11:10:57 +00:00
2741b39ad3 Use broadcasted scalars for const tensors. 2023-06-29 11:56:40 +01:00
3872dc4751 Merge pull request #19 from LaurentMazare/llama_safetensors (Llama safetensors) 2023-06-29 12:49:26 +02:00
b4aab7b95f Put more requirements on the WithDType trait. 2023-06-29 11:37:42 +01:00
c9c468e1aa Use Map2 for binary ops. 2023-06-29 10:09:15 +01:00
83c7d660ca Add Map2. 2023-06-29 10:05:06 +01:00
367170da45 Also use Map1 for embedding. 2023-06-29 09:45:27 +01:00
8ad03a5fb6 Use Map1 on unary ops. 2023-06-29 09:37:38 +01:00
fff13dbb4e Factorize the kernel naming scheme. 2023-06-29 09:29:59 +01:00
d3c7b0d168 Use Map1 for sum. 2023-06-29 09:27:07 +01:00
122e334d0c Simplify the pattern matching logic in the cuda backend. 2023-06-29 09:21:11 +01:00
eaa3ce359e Cosmetic change. 2023-06-28 22:02:23 +01:00
1328b5cb20 Factor some code out. 2023-06-28 21:56:44 +01:00
c583ee0f2c Add map2. 2023-06-28 21:38:01 +01:00
46c07b924c Tweak some comments. 2023-06-28 21:10:54 +01:00
2ae368e98e Switch from a macro to a trait to make things more generic. 2023-06-28 21:06:56 +01:00
ece3ec6167 Final updates -> moving to deterministic for easier comparison. 2023-06-28 14:56:39 +00:00
926fffa0b7 Ok. 2023-06-28 14:56:39 +00:00
e29dae044d Tmp. 2023-06-28 14:56:38 +00:00
6c9e6b5a99 Get the cuda tests to pass. 2023-06-28 15:53:23 +01:00
3f0d9fbb25 Adapt the cuda bits. 2023-06-28 15:43:03 +01:00
cca699be6c Fix some cpu issue. 2023-06-28 15:09:15 +01:00
1c755c0e5b Remove some todos. 2023-06-28 14:33:06 +01:00
caafef6cc1 Get the cpu tests to run. 2023-06-28 14:32:02 +01:00
14449ff80c Get the cpu backend to compile. 2023-06-28 14:12:38 +01:00
54a6c40f27 Propagate the changes on the cpu backend. 2023-06-28 14:00:49 +01:00
303b853098 Propagate the layout refactoring. 2023-06-28 13:42:23 +01:00
30b355ccd2 Simplify the narrow implementation. 2023-06-28 13:09:59 +01:00
c1bbbf94f6 Start refactoring the stride. 2023-06-28 12:57:30 +01:00
7938d2b848 Add the grad for narrow. 2023-06-28 10:46:00 +01:00
615196e7be Add more gradients. 2023-06-28 09:59:52 +01:00