3071ea6c3e
Use the new hub helper function. ( #1484 )
2023-12-26 09:44:30 +01:00
bcb0ed8f1c
Self-contained safetensors for the multiprocess llama example. ( #950 )
2023-09-24 06:54:49 +01:00
098dd0d1e9
fix: add missing top_p in llama_multiprocess ( #905 )
2023-09-20 08:54:56 +01:00
4c338b0cd9
VarBuilder cleanup ( #627 )
...
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
c105550405
s/panic/bail/
2023-08-25 18:05:07 +02:00
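The `s/panic/bail/` change swaps process-aborting panics for recoverable error returns. A stdlib-only sketch of the pattern (the function name and message are illustrative; candle's actual code uses a `bail!`-style macro from its error-handling machinery):

```rust
// Before: panic!(...) aborts the whole process on bad input.
// After: return an Err the caller can handle or propagate with `?`.
// `head_dim` is a hypothetical example, not a real candle function.
fn head_dim(hidden: usize, n_heads: usize) -> Result<usize, String> {
    if n_heads == 0 || hidden % n_heads != 0 {
        return Err(format!(
            "hidden size {hidden} not divisible into {n_heads} heads"
        ));
    }
    Ok(hidden / n_heads)
}

fn main() {
    assert_eq!(head_dim(4096, 32), Ok(128));
    // Invalid configuration now yields an Err instead of a panic.
    assert!(head_dim(4096, 33).is_err());
}
```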
4826a4212e
Adding support for codellama in examples.
...
Codellama requires bf16 for now (error when converting from bf16 to f16).
The multiprocess demo is not functional for it because flash-attn only
supports f16 for now.
2023-08-25 09:56:11 +00:00
c78ce76501
Add a simple Module trait and implement it for the various nn layers ( #500 )
...
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
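The Module trait above gives every nn layer a single forward entry point so layers can be composed behind one interface. A minimal plain-Rust sketch of the idea (the types here are illustrative slices, not candle's real Tensor-based API):

```rust
// A Module-style trait: one forward method shared by all layers.
trait Module {
    fn forward(&self, xs: &[f32]) -> Vec<f32>;
}

// A trivial layer implementing the trait (hypothetical, for illustration).
struct Scale(f32);
impl Module for Scale {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        xs.iter().map(|x| x * self.0).collect()
    }
}

// Because every layer shares the trait, they compose behind `dyn Module`.
struct Sequential(Vec<Box<dyn Module>>);
impl Module for Sequential {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        // Thread the activations through each layer in order.
        self.0.iter().fold(xs.to_vec(), |acc, m| m.forward(&acc))
    }
}

fn main() {
    let net = Sequential(vec![Box::new(Scale(2.0)), Box::new(Scale(3.0))]);
    assert_eq!(net.forward(&[1.0, 2.0]), vec![6.0, 12.0]);
}
```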
13401df4d1
Add an abstract type for RmsNorm. ( #499 )
2023-08-18 08:52:14 +01:00
03be33eea4
Relax the requirements on CustomOp. ( #486 )
...
* Relax the requirements on CustomOp.
* Simplify the custom-ops when no backward is required.
2023-08-17 11:12:05 +01:00
d32e8199cd
Layer norm tweaks ( #482 )
...
* Add some options to make layer-norm more configurable.
* Add the rms-norm variant.
* Replace the RmsNorm with the shared bits.
2023-08-17 10:07:13 +01:00
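The rms-norm variant mentioned above normalizes by the root-mean-square of the activations instead of subtracting the mean as layer-norm does. A plain-Rust sketch of the math only, not candle's implementation:

```rust
// RMS norm: x * w / sqrt(mean(x^2) + eps).
// `eps` guards against division by zero for all-zero inputs.
fn rms_norm(xs: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = xs.iter().map(|x| x * x).sum::<f32>() / xs.len() as f32;
    let scale = 1.0 / (mean_sq + eps).sqrt();
    xs.iter().zip(weight).map(|(x, w)| x * scale * w).collect()
}

fn main() {
    // With unit weights, the output has root-mean-square ~1.
    let out = rms_norm(&[1.0, 2.0, 3.0, 4.0], &[1.0; 4], 1e-5);
    let out_mean_sq: f32 = out.iter().map(|x| x * x).sum::<f32>() / 4.0;
    assert!((out_mean_sq - 1.0).abs() < 1e-4);
}
```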
906c0f3eb5
Remove the checkpoint conversion script. ( #405 )
...
* Remove the checkpoint conversion script.
* Remove references to the script.
2023-08-11 05:59:48 +01:00
51e51da896
Rename the candle crate to candle-core ( #301 )
...
* Rename to candle-core.
* More candle-core renaming.
2023-08-02 08:20:22 +01:00
97d8712ba5
Remove single function.
2023-07-28 23:31:25 +02:00
97181a77c0
Making multiprocess require flash-attn.
2023-07-28 23:31:24 +02:00
7513a5e005
Line-up the llama implementation with the python-transformers one. ( #271 )
...
* Line-up the llama implementation with the python-transformers one.
* Also lineup the multiprocess version.
2023-07-28 18:31:28 +01:00
3eb2bc6d07
Softmax numerical stability. ( #267 )
...
* Softmax numerical stability.
* Fix the flash-attn test.
2023-07-28 13:13:01 +01:00
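The softmax stability fix above refers to the standard max-shift trick: subtracting the row maximum before exponentiating keeps `exp()` from overflowing on large logits. A plain-Rust sketch of the technique (not candle's tensor code):

```rust
// Numerically stable softmax: shift by the maximum so the largest
// exponent is exp(0) = 1; the result is mathematically unchanged.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

fn main() {
    // Without the shift, exp(1000.0) overflows to infinity and the
    // division yields NaN; with it, the output stays finite.
    let probs = softmax(&[1000.0, 1000.0]);
    assert!(probs.iter().all(|p| p.is_finite()));
    assert!((probs[0] - 0.5).abs() < 1e-6);
}
```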
25a2086e8f
Putting back Send + Sync
2023-07-27 09:58:47 +02:00
7c7e6ba201
Removing inner dependency on safetensors.
2023-07-27 09:58:47 +02:00
ed58de7551
Fixed TP sharded version.
2023-07-27 09:58:46 +02:00
1735e4831e
TP sharding v2
2023-07-27 09:58:14 +02:00