candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	ff876c2103	Llama more training (#297) * Rework the var-builder to handle initializations. * Add some helper functions for layer creation. * Improve the layer initializations. * Get initialized variables. * Precompute the rot embeddings when training lamas.	2023-08-01 19:53:41 +01:00
Laurent Mazare	1064b9b031	Add the cross-entropy loss. (#287)	2023-07-31 14:26:36 +01:00
Laurent Mazare	ffeafbfc43	Make the nll op closer to the pytorch version + add a test. (#286)	2023-07-31 14:14:01 +01:00
Laurent Mazare	3eb2bc6d07	Softmax numerical stability. (#267) * Softmax numerical stability. * Fix the flash-attn test.	2023-07-28 13:13:01 +01:00
Laurent Mazare	a2f72edc0d	Simplify the parameters used by sum and sum_keepdim. (#165)	2023-07-14 08:22:08 +01:00
Laurent Mazare	2bfa791336	Use the same default as pytorch for sum. (#164)	2023-07-13 21:32:32 +01:00
Laurent Mazare	57be3638d8	Add the pytorch version of the linear regression as a comment. (#163) * Add the pytorch version of the linear regression. * Typo.	2023-07-13 21:05:57 +01:00
Laurent Mazare	23e105cd94	Add the gradient for reduce-sum. (#162) * Add the gradient for reduce-sum. * And add the gradient for the broadcast ops. * Add some backprop tests. * Add some linear regression example.	2023-07-13 20:14:10 +01:00
Laurent Mazare	ded93a1169	Add the SGD optimizer (#160) * Add the nn::optim and some conversion traits. * Add the backward_step function for SGD. * Get the SGD optimizer to work and add a test. * Make the test slighly simpler.	2023-07-13 19:05:44 +01:00
Laurent Mazare	71cd3745a9	Add some layer-norm tests. (#121)	2023-07-10 14:43:04 +01:00

10 Commits