* Rework the var-builder to handle initializations (sketch below).
* Add some helper functions for layer creation.
* Improve the layer initializations.
* Get initialized variables.
* Precompute the rotary embeddings when training llamas (sketch below).
* Rework the commands and run inference by default.
* Add the training module and load the training dataset.
* Random dataset iterator (sketch below).
* Proper validation-loss computation.
* Compute the evaluation loss.
* Add more substance to the training loop (sketch below).
* Start adding llama2.c.
* Model loading.
* Add the llama-v2 model.
* Start converting the weights.
* Rotary embedding tweaks.
* Get the model to generate some tokens (sketch below).
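
The initialization items above can be pictured with a small framework-agnostic sketch: an `Init` enum describing the strategy and a helper that produces the raw values a var-builder would hand back for a freshly created weight. All names are illustrative (this is not the actual var-builder API), and it assumes the `rand` crate (0.8-style `thread_rng`/`gen_range`).

```rust
use rand::Rng;

/// Illustrative initialization strategies for a freshly created weight tensor.
enum Init {
    /// Every element set to the same constant (e.g. zeros for biases).
    Const(f32),
    /// Uniform in [lo, hi).
    Uniform { lo: f32, hi: f32 },
    /// Kaiming/He-style uniform, scaled by the fan-in of the layer.
    KaimingUniform { fan_in: usize },
}

impl Init {
    /// Produce `len` initialized values according to the chosen strategy.
    fn sample(&self, len: usize) -> Vec<f32> {
        let mut rng = rand::thread_rng();
        match self {
            Init::Const(v) => vec![*v; len],
            Init::Uniform { lo, hi } => (0..len).map(|_| rng.gen_range(*lo..*hi)).collect(),
            Init::KaimingUniform { fan_in } => {
                // Bound from the Kaiming uniform scheme: sqrt(6 / fan_in).
                let bound = (6.0f32 / *fan_in as f32).sqrt();
                (0..len).map(|_| rng.gen_range(-bound..bound)).collect()
            }
        }
    }
}

fn main() {
    // A 2x4 linear-layer weight initialized with Kaiming uniform (fan_in = 4).
    let w = Init::KaimingUniform { fan_in: 4 }.sample(2 * 4);
    let b = Init::Const(0.0).sample(2);
    println!("w = {w:?}, b = {b:?}");
}
```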
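
The rotary-embedding precomputation boils down to building the cos/sin tables once per (position, frequency) pair and reusing them at every training step instead of rebuilding them in each forward pass. A minimal std-only sketch, assuming the standard RoPE frequencies with base theta = 10000; the function name and flattened layout are made up for illustration.

```rust
/// Precompute rotary-embedding cos/sin tables, returned as flattened
/// `[seq_len, head_dim / 2]` buffers in row-major order.
fn precompute_rope(seq_len: usize, head_dim: usize, theta: f32) -> (Vec<f32>, Vec<f32>) {
    let half = head_dim / 2;
    let mut cos = Vec::with_capacity(seq_len * half);
    let mut sin = Vec::with_capacity(seq_len * half);
    for pos in 0..seq_len {
        for i in 0..half {
            // Standard RoPE frequency: theta^(-2i / head_dim).
            let freq = 1.0 / theta.powf(2.0 * i as f32 / head_dim as f32);
            let angle = pos as f32 * freq;
            cos.push(angle.cos());
            sin.push(angle.sin());
        }
    }
    (cos, sin)
}

fn main() {
    let (cos, sin) = precompute_rope(8, 4, 10_000.0);
    assert_eq!(cos.len(), 8 * 2);
    println!("cos[0..2] = {:?}, sin[0..2] = {:?}", &cos[..2], &sin[..2]);
}
```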
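
The random dataset iterator samples a start offset into the flat token buffer and yields the window together with its one-token-shifted targets, the usual setup for next-token prediction. A sketch under those assumptions; the struct name is hypothetical and the `rand` dependency is again assumed.

```rust
use rand::Rng;

/// Yields randomly positioned (input, target) windows from a flat token buffer.
struct RandomBatchIter<'a> {
    tokens: &'a [u32],
    seq_len: usize,
}

impl<'a> Iterator for RandomBatchIter<'a> {
    type Item = (Vec<u32>, Vec<u32>);

    fn next(&mut self) -> Option<Self::Item> {
        if self.tokens.len() < self.seq_len + 1 {
            return None;
        }
        // Pick a random start so consecutive batches are decorrelated.
        let start = rand::thread_rng().gen_range(0..self.tokens.len() - self.seq_len);
        let input = self.tokens[start..start + self.seq_len].to_vec();
        // Targets are the same window shifted one token to the right.
        let target = self.tokens[start + 1..start + self.seq_len + 1].to_vec();
        Some((input, target))
    }
}

fn main() {
    let tokens: Vec<u32> = (0..100).collect();
    let mut it = RandomBatchIter { tokens: &tokens, seq_len: 8 };
    let (x, y) = it.next().unwrap();
    assert_eq!(y[0], x[1]);
}
```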
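
The validation/evaluation loss used by the training loop reduces to averaging per-token cross-entropy over held-out batches, with no gradient updates. A framework-agnostic sketch over plain `Vec<f32>` logits, with hypothetical names.

```rust
/// Cross-entropy (negative log-likelihood) of one position, given unnormalized
/// logits over the vocabulary and the index of the correct next token.
fn cross_entropy(logits: &[f32], target: usize) -> f32 {
    // log-sum-exp with the usual max-subtraction for numerical stability.
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let log_sum_exp = logits.iter().map(|&l| (l - max).exp()).sum::<f32>().ln() + max;
    log_sum_exp - logits[target]
}

/// Validation loss: average the per-token cross-entropy over every position of
/// every held-out batch, without taking optimizer steps.
fn validation_loss(batches: &[(Vec<Vec<f32>>, Vec<usize>)]) -> f32 {
    let mut total = 0.0f32;
    let mut count = 0usize;
    for (logits_per_pos, targets) in batches {
        for (logits, &target) in logits_per_pos.iter().zip(targets) {
            total += cross_entropy(logits, target);
            count += 1;
        }
    }
    total / count as f32
}

fn main() {
    // Two "positions" with a 3-token vocabulary.
    let batch = (vec![vec![2.0, 0.5, 0.1], vec![0.0, 1.0, 0.0]], vec![0usize, 1]);
    println!("valid loss: {:.4}", validation_loss(&[batch]));
}
```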
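
Finally, token generation is a decode loop: run the forward pass on the tokens produced so far and append a sampled next token. A greedy (argmax) sketch with a stand-in closure for the model; a real sampler would add temperature or top-p on top.

```rust
/// Greedy decoding loop. The `model` closure stands in for the actual forward
/// pass (tokens-so-far -> logits over the vocabulary) and is purely illustrative.
fn generate(
    prompt: &[u32],
    max_new_tokens: usize,
    eos_token: u32,
    mut model: impl FnMut(&[u32]) -> Vec<f32>,
) -> Vec<u32> {
    let mut tokens = prompt.to_vec();
    for _ in 0..max_new_tokens {
        let logits = model(&tokens);
        // Argmax sampling: pick the most likely next token.
        let next = logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i as u32)
            .unwrap();
        if next == eos_token {
            break;
        }
        tokens.push(next);
    }
    tokens
}

fn main() {
    // Toy "model" that always predicts token 2 from a 4-token vocabulary.
    let out = generate(&[1], 3, 0, |_| vec![0.1, 0.2, 0.9, 0.3]);
    assert_eq!(out, vec![1, 2, 2, 2]);
}
```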