Commit Graph

17 Commits

Author SHA1 Message Date
55bc3382cf Allow for different behavior between training and eval (#1213)
* Forward with training.

* Do not use dropout on vgg evaluation.
2023-10-29 07:53:09 +01:00
7529531056 Add the optimizer trait. (#702) 2023-09-01 12:55:39 +01:00
7d753d3acd Mnist training dropout (#677)
* Use dropout in the mnist training.

* Fix.
2023-08-30 16:41:01 +01:00
618f4e4c78 Add some documentation. (#673)
* Add some documentation.

* Bump the crate version.
2023-08-30 11:54:00 +01:00
2d3fcad267 Simplify usage of the pool functions. (#662)
* Simplify usage of the pool functions.

* Small tweak.

* Attempt at using apply to simplify the convnet definition.
2023-08-29 19:12:16 +01:00
b31d41e26a Add a convnet training example. (#661)
* Add a convnet example.

* Dataset fix.

* Randomize batches.
2023-08-29 18:23:01 +01:00
d726484a6d Re-enable local dir for mnist. 2023-08-28 15:15:27 +02:00
d7a273be51 Training:
- Removed a lot of surface (SerializedFileReader ownership is really
  painful).
- Moved example + vision to hf.co version.
- Removed feature gate.
2023-08-28 15:15:01 +02:00
4c338b0cd9 VarBuilder cleanup (#627)
* VarBuilder cleanup.

* Implement the basic varbuilders.

* Add the sharded code.

* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
a1812f934f Add a yolo-v3 example. (#528)
* Add a couple functions required for yolo.

* Add the yolo-v3 example.

* Add minimum and maximum.

* Use the newly introduced maximum.

* Cuda support for min/max + add some testing.

* Allow for more tests to work with accelerate.

* Fix a typo.
2023-08-20 18:19:37 +01:00
c78ce76501 Add a simple Module trait and implement it for the various nn layers (#500)
* Start adding the module trait.

* Use the module trait.

* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
620f83cf66 Add the candle-datasets crate (#322)
* Move the vision datasets to a separate crate.

* Move the batcher bits.

* Update the readme.

* Move the tiny-stories bits.

---------

Co-authored-by: Jane Doe <jane.doe@example.org>
2023-08-05 08:56:50 +01:00
ff876c2103 Llama more training (#297)
* Rework the var-builder to handle initializations.

* Add some helper functions for layer creation.

* Improve the layer initializations.

* Get initialized variables.

* Precompute the rot embeddings when training lamas.
2023-08-01 19:53:41 +01:00
ffeafbfc43 Make the nll op closer to the pytorch version + add a test. (#286) 2023-07-31 14:14:01 +01:00
62a9b03715 Add a flag to set the number of epochs in the mnist training (#283)
* Add a flag to change the number of epochs for the mnist training.

* Increase the learning rate for the MLP.
2023-07-31 10:32:14 +01:00
a8d8f9f206 Load a trained checkpoint in the mnist example. (#280) 2023-07-30 17:01:45 +01:00
38ff693af0 Add a flag to save the trained weights. (#279) 2023-07-30 15:41:42 +01:00