33c23c19b6
Preliminary support for SDXL. ( #647 )
...
* Preliminary support for SDXL.
* More SDXL support.
* More SDXL.
* Use the proper clip config.
* Querying for existing tensors.
* More robust test.
2023-08-29 09:00:04 +01:00
4c338b0cd9
VarBuilder cleanup ( #627 )
...
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
d2622a8160
Move the VarMap to a separate file ( #525 )
...
* Move the var-map struct in a separate file.
* Fix some typos.
2023-08-20 14:25:07 +01:00
55e428c8ae
Expose the varmap inner data. ( #411 )
2023-08-11 16:58:56 +01:00
ff876c2103
Llama more training ( #297 )
...
* Rework the var-builder to handle initializations.
* Add some helper functions for layer creation.
* Improve the layer initializations.
* Get initialized variables.
* Precompute the rot embeddings when training lamas.
2023-08-01 19:53:41 +01:00
07eb899729
More mnist training. ( #275 )
2023-07-29 13:29:31 +01:00
8435a99edd
Added comment about offsets.
2023-07-27 20:11:57 +02:00
952eca6b54
Fixing slice errors + comments.
2023-07-27 16:59:32 +02:00
7c7e6ba201
Removing inner dependency on safetensors.
2023-07-27 09:58:47 +02:00
1735e4831e
TP sharding v2
2023-07-27 09:58:14 +02:00
dfd624dbd3
[Proposal] Remove SafeTensor wrapper (allows finer control for users).
2023-07-19 16:25:44 +02:00
d88b6cdca9
Add backtrace information to errors where relevant. ( #166 )
...
* Add backtrace information to errors where relevant.
* More backtrace information.
* Add to the FAQ.
2023-07-14 09:31:25 +01:00
a76ec797da
Cleanup the main crate error and add a couple dedicated ones ( #142 )
...
* Cosmetic cleanups to the error enum.
* More error cleanup.
* Proper error handling rather than panicing.
* Add some conv1d dedicated error.
2023-07-12 09:17:08 +01:00
fa760759e5
Allow for lazy loading of npz files, use it in llama to reduce memory usage in the cpu version. ( #141 )
2023-07-11 20:22:34 +01:00
37cad85869
Resurrect the llama npy support. ( #140 )
2023-07-11 19:32:10 +01:00
b46c28a2ac
VarBuilder path creation ( #131 )
...
* Use a struct for the safetensor+routing.
* Group the path and the var-builder together.
* Fix for the empty path case.
2023-07-10 22:37:34 +01:00
1aa7fbbc33
Move the var-builder in a central place. ( #130 )
2023-07-10 20:49:50 +01:00