3ceca9901a
Enable the new layer-norm. ( #2213 )
...
* Enable the new layer-norm.
* Shape fixes.
2024-05-24 16:48:21 +02:00
b2e816752b
Use the faster rms-norm kernel for llama. ( #2107 )
...
* Use the faster rms-norm kernel for llama.
* Use the fast variant by default.
2024-04-22 18:52:00 +02:00
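For context, rms-norm skips the mean subtraction that a full layer-norm performs, which is what makes a faster fused kernel possible. A minimal reference sketch of the computation using candle's tensor ops (the helper name `rms_norm_ref` is hypothetical, for illustration only, not the kernel this commit wires up):

```rust
use candle_core::{DType, Device, Result, Tensor};

// Reference rms-norm: x * weight / sqrt(mean(x^2) + eps), reduced over the
// last dimension. Unlike a full layer-norm, no mean is subtracted.
fn rms_norm_ref(xs: &Tensor, weight: &Tensor, eps: f64) -> Result<Tensor> {
    let last = xs.rank() - 1;
    let norm = (xs.sqr()?.mean_keepdim(last)? + eps)?.sqrt()?;
    xs.broadcast_div(&norm)?.broadcast_mul(weight)
}

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let xs = Tensor::randn(0f32, 1f32, (2, 4), &dev)?;
    let weight = Tensor::ones(4, DType::F32, &dev)?;
    println!("{}", rms_norm_ref(&xs, &weight, 1e-5)?);
    Ok(())
}
```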
e6697471bb
Add weight and bias functions to LayerNorm ( #1306 )
2023-11-09 16:09:01 +01:00
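A hedged usage sketch of the accessors this commit adds, assuming they return references to the underlying parameter tensors:

```rust
use candle_core::{DType, Device, Result, Tensor};
use candle_nn::LayerNorm;

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let weight = Tensor::ones(4, DType::F32, &dev)?;
    let bias = Tensor::zeros(4, DType::F32, &dev)?;
    let ln = LayerNorm::new(weight, bias, 1e-5);
    // The new accessors expose the learned parameters, e.g. for
    // inspection or serialization.
    println!("weight shape: {:?}", ln.weight().shape());
    Ok(())
}
```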
7b1ddcff47
Add clone to various nn layers. ( #910 )
2023-09-20 11:33:51 +01:00
7396b8ed1a
Segment Anything - process images ( #766 )
...
* Start processing images.
* Add LayerNorm2d.
* Properly use LayerNorm2d.
* Tweak eps.
* Use LayerNorm on inputs with a rank different from 3.
* Window partitioning.
* Fix a couple todos.
* More todos.
* Hard-code the einsums.
* More padding support.
* Some sizes tweaks.
* Use the hub to get the weights.
* Use a batch matmul.
* Tweaks.
* More fixes.
* Get some predictions to be generated.
2023-09-07 19:22:45 +01:00
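The LayerNorm2d added above normalizes over the channel dimension of an NCHW tensor rather than over the trailing dimension. A sketch of that computation built from plain tensor ops (a reimplementation for illustration, not the code from the Segment Anything example):

```rust
use candle_core::{Result, Tensor};

// Channel-wise layer-norm for (batch, channels, height, width) tensors:
// subtract the per-pixel channel mean, divide by the channel std-dev,
// then apply a per-channel affine transform.
fn layer_norm_2d(xs: &Tensor, weight: &Tensor, bias: &Tensor, eps: f64) -> Result<Tensor> {
    let c = weight.dim(0)?;
    let u = xs.mean_keepdim(1)?;
    let xs = xs.broadcast_sub(&u)?;
    let s = xs.sqr()?.mean_keepdim(1)?;
    let xs = xs.broadcast_div(&(s + eps)?.sqrt()?)?;
    xs.broadcast_mul(&weight.reshape((1, c, 1, 1))?)?
        .broadcast_add(&bias.reshape((1, c, 1, 1))?)
}
```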
2047d34b7c
More robust tests (so that they pass on accelerate). ( #679 )
2023-08-30 18:10:10 +01:00
4c338b0cd9
VarBuilder cleanup ( #627 )
...
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
c78ce76501
Add a simple Module trait and implement it for the various nn layers ( #500 )
...
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
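The trait boils down to a single forward method, so layers compose uniformly. A minimal sketch of implementing it for a custom layer (the `Scale` type is made up for illustration):

```rust
use candle_core::{Result, Tensor};
use candle_nn::Module;

// A toy layer: multiply the input by a constant factor.
struct Scale(f64);

impl Module for Scale {
    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        xs * self.0
    }
}
```

With this in place, `Scale(2.0).forward(&xs)?` can be used anywhere the built-in layers are.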
13401df4d1
Add an abstract type for RmsNorm. ( #499 )
2023-08-18 08:52:14 +01:00
d32e8199cd
Layer norm tweaks ( #482 )
...
* Add some options to make layer-norm more configurable.
* Add the rms-norm variant.
* Replace the RmsNorm with the shared bits.
2023-08-17 10:07:13 +01:00
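After this change the two variants share one code path: rms-norm is layer-norm with mean removal turned off. A sketch of the configurable constructors, assuming the `LayerNormConfig` fields shown below (a zero-initialized VarBuilder keeps the example self-contained):

```rust
use candle_core::{DType, Device, Result};
use candle_nn::{layer_norm, rms_norm, LayerNormConfig, VarBuilder};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let vb = VarBuilder::zeros(DType::F32, &dev);
    // Full layer-norm: remove the mean, then scale and shift.
    let cfg = LayerNormConfig { eps: 1e-5, remove_mean: true, affine: true };
    let _ln = layer_norm(8, cfg, vb.pp("ln"))?;
    // The rms-norm variant skips the mean removal.
    let _rms = rms_norm(8, 1e-5, vb.pp("rms"))?;
    Ok(())
}
```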
ff876c2103
Llama more training ( #297 )
...
* Rework the var-builder to handle initializations.
* Add some helper functions for layer creation.
* Improve the layer initializations.
* Get initialized variables.
* Precompute the rotary embeddings when training llamas.
2023-08-01 19:53:41 +01:00
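A sketch of fetching an initialized variable through the var-builder, along the lines this rework enables (the names, shapes, and VarMap-backed setup are illustrative):

```rust
use candle_core::{DType, Device, Result};
use candle_nn::init::Init;
use candle_nn::{VarBuilder, VarMap};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    // A VarMap-backed builder creates trainable variables on first access.
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &dev);
    // The initialization hint is applied when the variable is created.
    let init = Init::Randn { mean: 0., stdev: 0.02 };
    let w = vb.get_with_hints((16, 8), "weight", init)?;
    println!("{:?}", w.shape());
    Ok(())
}
```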
43c7223292
Rename the .r functions to .dims so as to be a bit more explicit. ( #220 )
2023-07-22 10:39:27 +01:00
a2f72edc0d
Simplify the parameters used by sum and sum_keepdim. ( #165 )
2023-07-14 08:22:08 +01:00
2bfa791336
Use the same default as pytorch for sum. ( #164 )
2023-07-13 21:32:32 +01:00
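Taken together, these API changes give reductions their current shape: `.dims()` for shapes, `sum` over explicit dimensions (removing them, as in PyTorch), and `sum_keepdim` to preserve rank. A small sketch:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let t = Tensor::new(&[[1f32, 2.], [3., 4.]], &dev)?;
    println!("{:?}", t.dims()); // [2, 2]
    // Summing over dimension 1 removes it...
    println!("{:?}", t.sum(1)?.dims()); // [2]
    // ...while sum_keepdim keeps it as a size-1 dimension.
    println!("{:?}", t.sum_keepdim(1)?.dims()); // [2, 1]
    Ok(())
}
```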
465fc8c0c5
Add some documentation and test to the linear layer. ( #151 )
...
* Add some documentation and test to the linear layer.
* Layer norm doc.
* Minor tweaks.
2023-07-12 20:24:23 +01:00
9ce0f1c010
Sketch the candle-nn crate. ( #115 )
...
* Sketch the candle-nn crate.
* Tweak the cuda dependencies.
* More cuda tweaks.
2023-07-10 08:50:09 +01:00