55bc3382cf
Allow for different behavior between training and eval (#1213)
* Forward with training.
* Do not use dropout on vgg evaluation.
2023-10-29 07:53:09 +01:00
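
The train/eval split above is what a ModuleT-style forward enables. A minimal sketch, assuming the ModuleT trait and Dropout layer as candle-nn exposed them around this time (names recalled from the crate, not verified against this exact commit):

    use candle_core::{Result, Tensor};
    use candle_nn::{Dropout, Linear, Module, ModuleT};

    struct Mlp {
        fc: Linear,
        dropout: Dropout,
    }

    impl ModuleT for Mlp {
        fn forward_t(&self, xs: &Tensor, train: bool) -> Result<Tensor> {
            let xs = self.fc.forward(xs)?;
            // Dropout only masks activations when `train` is true; at
            // eval time it is the identity, which is the point of #1213.
            self.dropout.forward_t(&xs, train)
        }
    }
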
7529531056
Add the optimizer trait. (#702)
2023-09-01 12:55:39 +01:00
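
A sketch of how the trait is meant to be consumed, assuming the candle-nn SGD type and its backward_step method (treat the exact signatures as an approximation):

    use candle_core::{Result, Tensor};
    use candle_nn::{Optimizer, VarMap, SGD};

    fn train_loop(varmap: &VarMap, batch_losses: Vec<Tensor>) -> Result<()> {
        // Any optimizer implementing the trait can be swapped in here.
        let mut opt = SGD::new(varmap.all_vars(), 0.01)?;
        for loss in batch_losses {
            // backward_step runs backprop and applies the update in one call.
            opt.backward_step(&loss)?;
        }
        Ok(())
    }
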
7d753d3acd
Mnist training dropout (#677)
* Use dropout in the mnist training.
* Fix.
2023-08-30 16:41:01 +01:00
618f4e4c78
Add some documentation. (#673)
* Add some documentation.
* Bump the crate version.
2023-08-30 11:54:00 +01:00
2d3fcad267
Simplify usage of the pool functions. (#662)
* Simplify usage of the pool functions.
* Small tweak.
* Attempt at using apply to simplify the convnet definition.
2023-08-29 19:12:16 +01:00
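
To illustrate what the simplification buys, here is a rough convnet forward in the post-change style; it assumes max_pool2d takes a single size for a square window and that Tensor::apply threads a tensor through any Module (both recalled from the crate, so details may differ at this commit):

    use candle_core::{Result, Tensor};
    use candle_nn::{Conv2d, Linear};

    struct ConvNet {
        conv1: Conv2d,
        conv2: Conv2d,
        fc: Linear,
    }

    impl ConvNet {
        fn forward(&self, xs: &Tensor) -> Result<Tensor> {
            // `apply` chains layers without intermediate bindings, and the
            // pool helper takes one usize for a square kernel and stride.
            xs.apply(&self.conv1)?
                .max_pool2d(2)?
                .apply(&self.conv2)?
                .max_pool2d(2)?
                .flatten_from(1)?
                .apply(&self.fc)
        }
    }
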
b31d41e26a
Add a convnet training example. (#661)
* Add a convnet example.
* Dataset fix.
* Randomize batches.
2023-08-29 18:23:01 +01:00
d726484a6d
Re-enable local dir for mnist.
2023-08-28 15:15:27 +02:00
d7a273be51
Training:
- Removed a lot of API surface (SerializedFileReader ownership is really painful).
- Moved example + vision to hf.co version.
- Removed feature gate.
2023-08-28 15:15:01 +02:00
4c338b0cd9
VarBuilder cleanup (#627)
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
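
For readers following along, the cleaned-up entry points look roughly like this; VarBuilder::from_varmap, pp-style prefixes, and the candle_nn::linear helper are assumed from the crate as it stood later, so they may differ slightly from this exact commit:

    use candle_core::{DType, Device, Result};
    use candle_nn::{Linear, VarBuilder, VarMap};

    fn build_proj() -> Result<Linear> {
        let varmap = VarMap::new();
        let vb = VarBuilder::from_varmap(&varmap, DType::F32, &Device::Cpu);
        // `pp` pushes a path prefix, so the weights get registered as
        // "encoder.proj.weight" and "encoder.proj.bias".
        candle_nn::linear(128, 64, vb.pp("encoder").pp("proj"))
    }
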
a1812f934f
Add a yolo-v3 example. (#528)
* Add a couple of functions required for yolo.
* Add the yolo-v3 example.
* Add minimum and maximum.
* Use the newly introduced maximum.
* Cuda support for min/max + add some testing.
* Allow for more tests to work with accelerate.
* Fix a typo.
2023-08-20 18:19:37 +01:00
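
The element-wise min/max added here are the kind of primitive yolo's box-intersection math needs. A quick usage sketch (API assumed from later candle versions):

    use candle_core::{Device, Result, Tensor};

    fn main() -> Result<()> {
        let dev = Device::Cpu;
        let a = Tensor::new(&[1f32, 5., 3.], &dev)?;
        let b = Tensor::new(&[2f32, 2., 2.], &dev)?;
        // Element-wise minimum/maximum, e.g. for intersecting bounding boxes.
        let lo = a.minimum(&b)?; // [1., 2., 2.]
        let hi = a.maximum(&b)?; // [2., 5., 3.]
        println!("{lo}\n{hi}");
        Ok(())
    }
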
c78ce76501
Add a simple Module trait and implement it for the various nn layers (#500)
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
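
The trait itself is tiny; redefining it locally shows its shape (the real one lives in the candle crates, and the signature here is assumed rather than copied from the commit):

    use candle_core::{Result, Tensor};

    // A single forward method lets layers, activations and whole networks
    // be treated uniformly (and is what `Tensor::apply` builds on).
    trait Module {
        fn forward(&self, xs: &Tensor) -> Result<Tensor>;
    }

    // Toy layer: multiplies its input by a constant.
    struct Scale(f64);

    impl Module for Scale {
        fn forward(&self, xs: &Tensor) -> Result<Tensor> {
            xs.affine(self.0, 0.0)
        }
    }
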
620f83cf66
Add the candle-datasets crate (#322)
* Move the vision datasets to a separate crate.
* Move the batcher bits.
* Update the readme.
* Move the tiny-stories bits.
---------
Co-authored-by: Jane Doe <jane.doe@example.org>
2023-08-05 08:56:50 +01:00
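
With the crate split out, loading a vision dataset becomes a one-liner; this sketch assumes the candle_datasets::vision::mnist::load entry point and Dataset field names from later versions of the crate:

    use candle_core::Result;

    fn main() -> Result<()> {
        // Fetches MNIST (from the hf.co hub) and returns a struct bundling
        // the train/test images and labels as tensors.
        let m = candle_datasets::vision::mnist::load()?;
        println!("train images: {:?}", m.train_images.shape());
        println!("test images:  {:?}", m.test_images.shape());
        Ok(())
    }
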
ff876c2103
Llama more training (#297)
* Rework the var-builder to handle initializations.
* Add some helper functions for layer creation.
* Improve the layer initializations.
* Get initialized variables.
* Precompute the rotary embeddings when training llamas.
2023-08-01 19:53:41 +01:00
ffeafbfc43
Make the nll op closer to the PyTorch version + add a test. (#286)
2023-07-31 14:14:01 +01:00
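
Like its PyTorch namesake, the op expects log-probabilities rather than raw logits. A hedged sketch of the intended usage, assuming candle_nn::loss::nll and ops::log_softmax as they exist in the crate:

    use candle_core::{Result, Tensor, D};
    use candle_nn::{loss, ops};

    fn cross_entropy(logits: &Tensor, targets: &Tensor) -> Result<Tensor> {
        // nll expects log-probabilities (as in PyTorch), so take a
        // log_softmax over the class dimension first.
        let log_sm = ops::log_softmax(logits, D::Minus1)?;
        loss::nll(&log_sm, targets)
    }
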
62a9b03715
Add a flag to set the number of epochs in the mnist training (#283)
* Add a flag to change the number of epochs for the mnist training.
* Increase the learning rate for the MLP.
2023-07-31 10:32:14 +01:00
a8d8f9f206
Load a trained checkpoint in the mnist example. (#280)
2023-07-30 17:01:45 +01:00
38ff693af0
Add a flag to save the trained weights. (#279)
2023-07-30 15:41:42 +01:00
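
These two commits round-trip: save after training, load before eval. A minimal sketch, assuming the VarMap::save / VarMap::load pair and an illustrative file name:

    use candle_core::Result;
    use candle_nn::VarMap;

    fn checkpoint_roundtrip(varmap: &mut VarMap) -> Result<()> {
        // Persist every trainable variable as safetensors...
        varmap.save("mnist-trained.safetensors")?;
        // ...and later overwrite freshly initialized vars in place.
        varmap.load("mnist-trained.safetensors")
    }
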