7b1ddcff47
Add clone to various nn layers. (#910)
2023-09-20 11:33:51 +01:00
59e63d690c
Add weight, bias, and hidden_size methods (#816)
* Add weight, bias methods to Conv(1|2)
* Add hidden_size method to Embedding
* Expose hidden_size
2023-09-11 16:01:11 +01:00
4c338b0cd9
VarBuilder cleanup (#627)
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
c78ce76501
Add a simple Module trait and implement it for the various nn layers (#500)
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
cc76c63202
Use index-select for the embeddings as it supports backprop. (#298)
2023-08-01 20:44:43 +01:00
ff876c2103
Llama more training (#297)
* Rework the var-builder to handle initializations.
* Add some helper functions for layer creation.
* Improve the layer initializations.
* Get initialized variables.
* Precompute the rot embeddings when training llamas.
2023-08-01 19:53:41 +01:00
465fc8c0c5
Add some documentation and test to the linear layer. (#151)
* Add some documentation and test to the linear layer.
* Layer norm doc.
* Minor tweaks.
2023-07-12 20:24:23 +01:00
b06e1a7e54
[nn] Move the Embedding and Activation parts. (#116)
* Share the Embedding and Activation parts.
* Tweak some activations.
2023-07-10 10:24:52 +01:00