Commit Graph

321 Commits

SHA1 Message Date
ebb0fedf14 Very simple pyo3 bindings for candle. 2023-07-01 20:36:44 +01:00
dd879f5b67 Merge pull request #51 from LaurentMazare/custom-prompt: Add a flag for custom prompt. 2023-07-01 06:40:36 +01:00
7c65e2d187 Add a flag for custom prompt. 2023-07-01 06:36:22 +01:00
2c04bff12f Merge pull request #50 from LaurentMazare/rayon1: Do not use rayon for a single thread 2023-06-30 18:56:26 +01:00
bbe0c5fbaa Do not use rayon for a single thread (bis). 2023-06-30 18:47:22 +01:00
6b67d25d9f Do not use rayon for a single thread. 2023-06-30 18:46:32 +01:00
b8b175c01e Merge pull request #49 from LaurentMazare/llama-dtype: Early conversion for the llama weights. 2023-06-30 16:43:56 +01:00
679b6987b6 Early conversion for the llama weights. 2023-06-30 16:42:53 +01:00
dbd7d5b3fd Merge pull request #47 from LaurentMazare/llama-f32: Add a const to easily tweak the dtype used by llama 2023-06-30 15:04:33 +01:00
ed4d0959d3 Add a const to easily tweak the dtype used for llama internal computations. 2023-06-30 15:01:39 +01:00
a243504f53 Merge pull request #46 from LaurentMazare/bugfix-cuda-u8-bf16: Bugfix: remove the u8/bf16 conversion kernel as it is ambiguous. 2023-06-30 10:45:48 +01:00
313fa022a5 Bugfix: remove the u8/bf16 conversion kernel as it is ambiguous. 2023-06-30 10:43:32 +01:00
d2ab4f86bf Merge pull request #45 from LaurentMazare/u8: Add support for u8 2023-06-30 10:35:51 +01:00
fbc329ed85 Add the verbose cpu cast operations. 2023-06-30 10:33:29 +01:00
8ad47907f3 Add the kernels. 2023-06-30 10:26:56 +01:00
a7b16cbb98 Merge pull request #44 from LaurentMazare/check-dim: Improve how we check that the dims are in bounds. 2023-06-30 09:14:45 +01:00
19cbbc5212 Improve how we check that the dims are in bounds. 2023-06-30 09:11:00 +01:00
00476d37f8 Merge pull request #43 from LaurentMazare/bf16: Support for bf16 in cuda kernels 2023-06-30 05:48:58 +01:00
6486a6d7b2 Avoid some cast kernels. 2023-06-29 23:23:44 +01:00
ec79fc43f2 Add the bf16 cuda kernels. 2023-06-29 23:12:02 +01:00
018e017e7e Merge pull request #42 from LaurentMazare/kv-cache-enable: Enable the KV cache 2023-06-29 22:22:11 +01:00
f6152e74b6 Tweak the kv-cache flag. 2023-06-29 22:16:40 +01:00
ae3f202f3b Add a flag. 2023-06-29 22:12:15 +01:00
23389b1bd7 Enable the KV cache after fixing the caching length and the rope bits. 2023-06-29 22:00:57 +01:00
e87a99d16e Merge pull request #41 from LaurentMazare/kv-cache: Kv cache 2023-06-29 19:11:52 +01:00
af66f0829e Revert the new profile. 2023-06-29 19:08:50 +01:00
b50bd880ce Only narrow when needed + deactivate the kv cache. 2023-06-29 19:07:52 +01:00
4b148b5414 Merge pull request #40 from LaurentMazare/fix_kernel_cache: Fixing kernel cache (a bit brutal for now, but if build triggers, rebuild ALL kernels). 2023-06-29 18:02:06 +02:00
1ea08a19cb Rerun on new files. 2023-06-29 15:59:58 +00:00
b5bdbef53a Fixing kernel cache (a bit brutal for now, but if build triggers, rebuild ALL kernels). 2023-06-29 15:51:08 +00:00
3232df9458 Add some KV cache to llama. 2023-06-29 15:29:40 +01:00
889f7e0971 Merge pull request #39 from LaurentMazare/anyhow-backtrace: Add backtraces. 2023-06-29 13:17:53 +01:00
e27ee98d3f Add backtraces. 2023-06-29 13:17:20 +01:00
e90f4aad26 Merge pull request #38 from LaurentMazare/llama_f16: Moving llama to f16. 2023-06-29 14:12:31 +02:00
78ec40b077 Typo. 2023-06-29 12:09:53 +00:00
de48e6fd59 Putting back main. 2023-06-29 12:08:35 +00:00
0958c588f6 Putting back seed. 2023-06-29 12:07:21 +00:00
c5e8f788be Revert some changes. 2023-06-29 12:05:53 +00:00
e63ed6aaa3 Remove unwrap. 2023-06-29 12:04:25 +00:00
2fe1d3e36d Moving llama to f16. 2023-06-29 12:00:16 +00:00
31396a3b9f Merge pull request #37 from LaurentMazare/llama-seed: Add a seed parameter to llama. 2023-06-29 12:51:45 +01:00
b4dc9f6108 Add a seed parameter to llama. 2023-06-29 12:47:19 +01:00
53628db3a9 Merge pull request #36 from LaurentMazare/fix_example: Simple example fix. 2023-06-29 13:36:05 +02:00
1913512f42 Simple example fix. 2023-06-29 11:10:57 +00:00
c0719b7781 Merge pull request #35 from LaurentMazare/const-scalar: Use broadcasted scalars for const tensors. 2023-06-29 12:10:19 +01:00
2741b39ad3 Use broadcasted scalars for const tensors. 2023-06-29 11:56:40 +01:00
3872dc4751 Merge pull request #19 from LaurentMazare/llama_safetensors: Llama safetensors 2023-06-29 12:49:26 +02:00
5930168457 Merge pull request #34 from LaurentMazare/simpler-dtype-trait: Put more requirements on the withdtype trait. 2023-06-29 11:41:17 +01:00
b4aab7b95f Put more requirements on the withdtype trait. 2023-06-29 11:37:42 +01:00
c8fc9da737 Merge pull request #33 from LaurentMazare/cuda-map: Simplify the dtype matchings in the cuda backend 2023-06-29 10:14:12 +01:00