candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	1f26042693	Move some shared functions to the nn module. (#221)	2023-07-22 13:25:11 +01:00
Laurent Mazare	43c7223292	Rename the .r functions to .dims so as to be a bit more explicit. (#220)	2023-07-22 10:39:27 +01:00
Laurent Mazare	52c5d8c087	Add the gather op. (#219) * Start adding gather. * Gather cpu implementation + use in simple training. * Add scatter_add for the gradient of gather. * Simple cpu implementation of scatter_add. * Use gather in the simple-training backprop.	2023-07-22 07:21:28 +01:00
Laurent Mazare	410654525f	Refactor the reduce ops in order to introduce argmin/argmax. (#212) * Refactor the reduce ops in order to introduce argmin/argmax. * Clippy fixes. * Use the newly introduced argmax. * Fix the strided case. * Handle the non-contiguous case.	2023-07-21 11:41:08 +01:00
Laurent Mazare	c60831aad4	Add more gradient tests + bugfixes. (#211) * Add more gradient tests + bugfixes. * More tests and fixes. * More tests.	2023-07-21 06:52:39 +01:00
Laurent Mazare	4845d5cc64	More realistic training setup. (#210) * More realistic training setup. * Compute the model accuracy. * Very inefficient backprop for index select. * More backprop. * Fix some backprop issues. * Backprop fix. * Another broadcasting backprop fix. * Better backprop for reducing ops. * Training again. * Add some gradient tests. * Get the training to work.	2023-07-20 18:25:41 +01:00
Laurent Mazare	12d6dc018d	Support for MQA for llama v2. (#205) * Support for MQA for llama v2. * More llama-v2. * Move the rotary embedding precomputation in the cache. * Add a v2 flag. * Use the hf model.	2023-07-20 06:39:04 +01:00
Nicolas Patry	9515e8ea6c	Merge branch 'main' into remove_wrapper	2023-07-19 18:53:55 +02:00
Nicolas Patry	e6584476c4	Merge pull request #200 from LaurentMazare/removing_candle_hub Removing `candle-hub` internal to extract into `hf-hub` standalone.	2023-07-19 17:27:55 +02:00
Laurent Mazare	cb687b4897	Add some more developed training examples. (#199) * Use contiguous tensors for variables. * Sketch the mnist example. * Start adding the reduce ops. * Renaming. * Refactor the reduce operations. * Bugfix for the broadcasting vectorization.	2023-07-19 15:37:52 +01:00
Nicolas Patry	dfd624dbd3	[Proposal] Remove SafeTensor wrapper (allows finer control for users).	2023-07-19 16:25:44 +02:00
Nicolas Patry	439321745a	Removing `candle-hub` internal to extract into `hf-hub` standalone.	2023-07-19 15:04:38 +02:00
Laurent Mazare	ff61a42ad7	Use mkl to accelerate binary ops. (#190) * Vectorized binary ops with mkl. * Improve the binary op mkl support. * Push the support for mkl binary ops. * Proper vectorization of binary ops. * Proper mkl'isation when broadcasting binary ops.	2023-07-18 12:04:39 +01:00
Laurent Mazare	b706f32839	Add Shape try into (#189) * Add the TryInto trait for shapes. * Use the vectorized operations in block mode too.	2023-07-18 10:52:16 +01:00
Laurent Mazare	d6313d2447	Add more tracing details to bert. (#188)	2023-07-18 08:11:05 +01:00
Laurent Mazare	f0cccd08f0	Bert tracing (#184) * Add some tracing to bert. * More tracing. * Add a flag for tracing.	2023-07-17 19:40:42 +01:00
Laurent Mazare	66750f9827	Add some 'cuda-if-available' helper function. (#172)	2023-07-15 08:25:15 +01:00
Nicolas Patry	4ed56d7861	Removing cuda default. Seems very important for a lot of exploring users usually on laptop without GPUs. Adding more README instructions in a follow up.	2023-07-14 16:52:15 +02:00
Laurent Mazare	a2f72edc0d	Simplify the parameters used by sum and sum_keepdim. (#165)	2023-07-14 08:22:08 +01:00
Laurent Mazare	2bfa791336	Use the same default as pytorch for sum. (#164)	2023-07-13 21:32:32 +01:00
Laurent Mazare	3c02ea56b0	Add a cli argument to easily switch the dtype. (#161)	2023-07-13 19:18:49 +01:00
Laurent Mazare	50b0946a2d	Tensor mutability (#154) * Working towards tensor mutability. * Use a ref-cell to provide tensor mutability.	2023-07-13 11:04:40 +01:00
Laurent Mazare	a3663ce2f2	Encodec forward pass (#153) * Sketch the forward pass for encodec. * Forward pass for the encodec resnet block. * Encodec decoding.	2023-07-13 08:18:39 +01:00
Laurent Mazare	6c75a98ad2	Add the forward pass for the T5 model. (#152) * Add the forward pass for the T5 model. * More t5 forward pass.	2023-07-12 22:02:40 +01:00
Laurent Mazare	ba35d895e7	Sketch the candle-transformers crate. (#147) * Sketch the candle-transformers crate. * Format the empty files.	2023-07-12 13:49:31 +01:00
Laurent Mazare	eae646d322	Use arange in the examples. (#146)	2023-07-12 12:12:34 +01:00
Laurent Mazare	20599172ac	Add from_iter and arange, use it in the doctests. (#145)	2023-07-12 12:03:01 +01:00
Laurent Mazare	b3b39cca92	Llama batch (#144) * Add a batch dimension to llama. * Bugfixes.	2023-07-12 11:38:19 +01:00
Laurent Mazare	fa760759e5	Allow for lazy loading of npz files, use it in llama to reduce memory usage in the cpu version. (#141)	2023-07-11 20:22:34 +01:00
Laurent Mazare	37cad85869	Resurrect the llama npy support. (#140)	2023-07-11 19:32:10 +01:00
Laurent Mazare	760f1d7055	Refactor the llama example to make it more in sync with the other ones. (#139) * Refactor the llama example to make it more in sync with the other ones. * Make clippy happy. * Properly load the safetensor weights. * Get llama back to a working state for the safetensors case.	2023-07-11 17:20:55 +01:00
Laurent Mazare	674eb35e10	Remove some dead-code pragmas. (#137)	2023-07-11 09:33:59 +01:00
Laurent Mazare	0e9d3afd77	Simplify the var-builder layer setup. (#133)	2023-07-10 23:22:58 +01:00
Laurent Mazare	6fc1ab4f0d	MusicGen var-store path cleanup. (#132)	2023-07-10 23:13:11 +01:00
Laurent Mazare	b46c28a2ac	VarBuilder path creation (#131) * Use a struct for the safetensor+routing. * Group the path and the var-builder together. * Fix for the empty path case.	2023-07-10 22:37:34 +01:00
Laurent Mazare	1aa7fbbc33	Move the var-builder in a central place. (#130)	2023-07-10 20:49:50 +01:00
Laurent Mazare	89a5b602a6	Move the conv1d layer to candle_nn. (#117)	2023-07-10 11:02:06 +01:00
Laurent Mazare	b06e1a7e54	[nn] Move the Embedding and Activation parts. (#116) * Share the Embedding and Activation parts. * Tweak some activations.	2023-07-10 10:24:52 +01:00
Laurent Mazare	9ce0f1c010	Sketch the candle-nn crate. (#115) * Sketch the candle-nn crate. * Tweak the cuda dependencies. * More cuda tweaks.	2023-07-10 08:50:09 +01:00
Laurent Mazare	ea5dfa69bc	Sketching the musicgen model. (#66) * Skeleton files for musicgen. * Add a musicgen model module. * Sketch the model loading. * Start adding the forward pass. * More forward pass. * Positional embeddings. * Forward for the decoder layers. * Add an empty function. * Fix the musicgen weight names. * More musicgen modeling. * Add the T5 loading bits. * Add the encodec config. * Add the encodec module hierarchy. * More Encodec modeling. * Encodec modeling. * Encodec modeling. * Add more to the encodec modeling. * Load the weights. * Populate the resnet blocks. * Also load the conv transpose weights. * Split musicgen in multiple files.	2023-07-09 19:53:35 +01:00
Laurent Mazare	c187f347bf	Make it easier to use whisper samples from the repo. (#112) * Make it easier to use samples from the repo. * Use f32 for accumulation in the f16/bf16 kernels.	2023-07-08 18:48:27 +01:00
Laurent Mazare	f35cfc5e97	Sample with temperature. (#106)	2023-07-07 18:12:25 +01:00
Laurent Mazare	03dffe9ecc	Use F32 for the reduce ops. (#105)	2023-07-07 17:55:21 +01:00
Laurent Mazare	e923b3adc2	Add a KV cache to falcon. (#104)	2023-07-07 17:24:38 +01:00
Laurent Mazare	05ff1cff66	Add some caching to the causal mask. (#103)	2023-07-07 12:56:44 +01:00
Nicolas Patry	2df044f9a1	Clippy after rebase.	2023-07-07 09:22:09 +02:00
Nicolas Patry	1ec221a749	Fixing falcon example.	2023-07-07 09:13:55 +02:00
Laurent Mazare	d38a926c14	Convert the logits to f32 before extracting them. (#102)	2023-07-07 08:07:57 +01:00
Laurent Mazare	bac4ef40f3	Add some text generation pipeline for falcon. (#98)	2023-07-07 06:34:22 +01:00
Laurent Mazare	2b8e8c9f14	Bugfixes. (#97)	2023-07-06 23:26:11 +01:00

119 Commits