candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-19 11:56:45 +00:00

Author	SHA1	Message	Date
Laurent Mazare	01545f7303	Add a slice_set op. (#2193 ) * Add a slice_set op. * Add some testing. * Add the dedicated kv-cache module. * Derive debug and clone. * Expose more kv-cache functions. * Return the current data when appending. * Use the new cache in the quantized phi3 model.	2024-05-18 15:58:18 +02:00
Laurent Mazare	1b98f84a2b	Fast kernels for rotary embeddings. (#1928 ) * Fast kernels for rotary embeddings. * Add a test for the fast CPU kernel. * Rope cuda bindings. * Cuda kernel. * Metal kernel (part 1). * Cuda kernels. * Finish the metal kernel. * Use the new kernels in the quantized example. * Fix warning.	2024-03-24 22:48:52 +01:00
Laurent Mazare	c753f72c85	Support for attention bias in gemma + refactor things a bit. (#1744 ) * Support for attention bias in gemma + refactor things a bit. * Fix the cuda tests.	2024-02-22 09:35:28 +01:00
Laurent Mazare	41416d2376	Expose more conv1d functions/structs. (#1726 )	2024-02-17 18:50:55 +01:00
Ryan Tate	41614b4a9b	Add one-hot/cold encoding (#1489 ) * add one-hot encoding * one_hot: improve error handling, use generic to_vecN::<D> Bails if the index value is equal to or greater than the depth value, which would result in an out-of-bounds error. A redundant check is added to ensure the index value does not exceed the length of the one-hot matrix size, which would also result in an out-of-bounds error. Bails if the index value is less than -1. If the index value is -1, then it ignores the setting of the on_value for the index value. Only values that are less than -1 are considered errors. * one-hot: use two generics, one_hot::<I, O>, for input and output data types Separating the input and output data types allows the input tensor indices to be a different data type than the output encoded tensor data type. For example, one_hot::<i64, u8>(...) will take an input tensor of i64 values and encode the output tensor using u8 values. The generic I::DTYPE must match the data type of the input indices, otherwise the method will bail. Additionally, this method adds an `allow_f64` option to enable the input indices data type to be f64 values. f64 values are disabled by default. TODO: indices data type and the generic I data type are currently not compile-time checked. * one_hot: remove input generic, use indices dtype matching This commit removes the to_f64() type cast and explicitly matches the DType from the input tensor. Currently, only U8, U32 and I64 is supported for input tensors. The match arms on the dtype is verbose. It would be nice to use a generic type with the WithDtype traitbound to pass to the to_vecN method and then return an inner value. Open to suggestions for better approaches here to reduce the match arm verbosity. * one_hot: use flat_map iterator over dims instead of nested for loop This commit replaces the nested for loops with an flat map iter over the dimensions of the input tensor. This commit also adds a test for a rank 3 input tensor. * one_hot: use mandatory on/off-values, remove const msgs This commit also updates doc tests, comments and test cases. * Small cleanups. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-01-01 11:18:40 +01:00
Laurent Mazare	b5c283e86f	Add the prelu layer. (#1402 )	2023-12-03 16:06:09 +00:00
Laurent Mazare	55bc3382cf	Allow for different behavior between training and eval (#1213 ) * Forward with training. * Do not use dropout on vgg evaluation.	2023-10-29 07:53:09 +01:00
Laurent Mazare	99cf13e8e2	Add the sequential layer. (#1136 )	2023-10-20 16:08:50 +01:00
Laurent Mazare	000fa00e31	Expose the conv2d-transpose layers. (#761 )	2023-09-07 06:04:52 +01:00
Laurent Mazare	7529531056	Add the optimizer trait. (#702 )	2023-09-01 12:55:39 +01:00
Laurent Mazare	db59816087	Add a GRU layer. (#688 ) * Add a GRU layer. * Fix the n gate computation.	2023-08-31 08:43:10 +01:00
Laurent Mazare	3159982a89	Add a Dropout layer (#676 ) * Add a dropout layer. * Add an actual layer.	2023-08-30 16:19:28 +01:00
Laurent Mazare	f35b9f6baa	Add some recurrent neural networks (#674 ) * Add the rnn module. * More LSTM. * Implement the RNN forward pass. * More forward pass for LSTM.	2023-08-30 13:27:09 +01:00
Laurent Mazare	2d3fcad267	Simplify usage of the pool functions. (#662 ) * Simplify usage of the pool functions. * Small tweak. * Attempt at using apply to simplify the convnet definition.	2023-08-29 19:12:16 +01:00
Laurent Mazare	e3d2786ffb	Add a couple functions required for yolo. (#527 )	2023-08-20 17:02:05 +01:00
Laurent Mazare	d2622a8160	Move the VarMap to a separate file (#525 ) * Move the var-map struct in a separate file. * Fix some typos.	2023-08-20 14:25:07 +01:00
Laurent Mazare	42e1cc8062	Add a batch normalization layer (#508 ) * Add BatchNormalization. * More batch-norm. * Add some validation of the inputs. * More validation.	2023-08-18 20:05:56 +01:00
Laurent Mazare	c78ce76501	Add a simple Module trait and implement it for the various nn layers (#500 ) * Start adding the module trait. * Use the module trait. * Implement module for qmatmul.	2023-08-18 09:38:22 +01:00
Laurent Mazare	13401df4d1	Add an abstract type for RmsNorm. (#499 )	2023-08-18 08:52:14 +01:00
Laurent Mazare	d32e8199cd	Layer norm tweaks (#482 ) * Add some options to make layer-norm more configurable. * Add the rms-norm variant. * Replace the RmsNorm with the shared bits.	2023-08-17 10:07:13 +01:00
Laurent Mazare	d34039e352	Add a stable diffusion example (#328 ) * Start adding a stable-diffusion example. * Proper computation of the causal mask. * Add the chunk operation. * Work in progress: port the attention module. * Add some dummy modules for conv2d and group-norm, get the attention module to compile. * Re-enable the 2d convolution. * Add the embeddings module. * Add the resnet module. * Add the unet blocks. * Add the unet. * And add the variational auto-encoder. * Use the pad function from utils.	2023-08-06 17:49:43 +01:00
Laurent Mazare	620f83cf66	Add the candle-datasets crate (#322 ) * Move the vision datasets to a separate crate. * Move the batcher bits. * Update the readme. * Move the tiny-stories bits. --------- Co-authored-by: Jane Doe <jane.doe@example.org>	2023-08-05 08:56:50 +01:00
Laurent Mazare	0902846f25	Add the AdamW optimizer. (#307 ) * Add the AdamW optimizer. * Add some AdamW test validated against PyTorch.	2023-08-02 14:03:49 +01:00
Laurent Mazare	ff876c2103	Llama more training (#297 ) * Rework the var-builder to handle initializations. * Add some helper functions for layer creation. * Improve the layer initializations. * Get initialized variables. * Precompute the rot embeddings when training lamas.	2023-08-01 19:53:41 +01:00
Laurent Mazare	e1e8127f15	Add the batcher. (#293 )	2023-08-01 09:16:10 +01:00
Laurent Mazare	16c33383eb	Improve the mnist training example. (#276 ) * Improve the mnist training example. * Add some initialization routine that can be used for nn. * Proper initialization in the mnist example.	2023-07-29 16:28:22 +01:00
Laurent Mazare	1f26042693	Move some shared functions to the nn module. (#221 )	2023-07-22 13:25:11 +01:00
Laurent Mazare	2a74019ec6	Vision dataset (#179 ) * Add some readers for the mnist dataset. * Import the cifar and mnist dataset.	2023-07-16 23:43:55 +01:00
Laurent Mazare	ded93a1169	Add the SGD optimizer (#160 ) * Add the nn::optim and some conversion traits. * Add the backward_step function for SGD. * Get the SGD optimizer to work and add a test. * Make the test slighly simpler.	2023-07-13 19:05:44 +01:00
Laurent Mazare	b31a3bbdcb	Sketch the tensor initialization module. (#134 )	2023-07-11 07:41:46 +01:00
Laurent Mazare	1aa7fbbc33	Move the var-builder in a central place. (#130 )	2023-07-10 20:49:50 +01:00
Laurent Mazare	89a5b602a6	Move the conv1d layer to candle_nn. (#117 )	2023-07-10 11:02:06 +01:00
Laurent Mazare	b06e1a7e54	[nn] Move the Embedding and Activation parts. (#116 ) * Share the Embedding and Activation parts. * Tweak some activations.	2023-07-10 10:24:52 +01:00
Laurent Mazare	9ce0f1c010	Sketch the candle-nn crate. (#115 ) * Sketch the candle-nn crate. * Tweak the cuda dependencies. * More cuda tweaks.	2023-07-10 08:50:09 +01:00

34 Commits