69fdcfe96a
Apply rustfmt. ( #2421 )
2024-08-16 18:57:14 +02:00
2b75dd9551
Fix build issue in EOS Token in llama-multiprocess ( #2420 )
2024-08-16 18:46:31 +02:00
587ee3bb6f
Small cleanups to the llama multi-process example. ( #2098 )
2024-04-20 22:19:46 +02:00
8b390ddd29
Only download the weights in the main process (and not in the child processes). ( #2093 )
2024-04-20 13:01:23 +02:00
c97d639fa0
Multiprocess/multi-GPU support for llama 3. ( #2092 )
...
* Multiprocess/multi-GPU support for llama 3.
* Modernize the mp example a bit.
2024-04-20 12:49:21 +02:00
3071ea6c3e
Use the new hub helper function. ( #1484 )
2023-12-26 09:44:30 +01:00
bcb0ed8f1c
Self-contained safetensors for the multiprocess llama example. ( #950 )
2023-09-24 06:54:49 +01:00
098dd0d1e9
fix: add missing top_p in llama_multiprocess ( #905 )
2023-09-20 08:54:56 +01:00
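For context, top_p (nucleus) sampling keeps only the smallest set of highest-probability tokens whose cumulative mass reaches top_p, then renormalizes before sampling. A minimal standalone Rust sketch of that filtering step (independent of the example's actual sampling code) could look like this:

```rust
// Nucleus (top_p) filtering: sort probabilities in descending order and
// keep the smallest prefix whose cumulative mass reaches `top_p`.
// Returns (token_index, renormalized_probability) pairs to sample from.
fn top_p_filter(probs: &[f32], top_p: f32) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = probs.iter().copied().enumerate().collect();
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    let mut cumulative = 0.0;
    let mut kept = Vec::new();
    for (idx, p) in indexed {
        kept.push((idx, p));
        cumulative += p;
        if cumulative >= top_p {
            break;
        }
    }
    // Renormalize the kept mass so the probabilities sum to 1 again.
    let total: f32 = kept.iter().map(|&(_, p)| p).sum();
    kept.into_iter().map(|(i, p)| (i, p / total)).collect()
}
```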
4c338b0cd9
VarBuilder cleanup ( #627 )
...
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
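Tensor sharding here means each rank loads only its slice of a weight matrix (for example a contiguous block of rows), so the full tensor never has to live on a single device. A minimal row-sharding sketch over a flat row-major buffer (not the actual VarBuilder, which shards safetensors views) is:

```rust
// Select the contiguous block of rows owned by `rank` out of a row-major
// (rows x cols) weight stored as a flat buffer. Assumes rows divides evenly
// across ranks, as is typical for attention/MLP weights split across GPUs.
fn shard_rows(weight: &[f32], rows: usize, cols: usize, rank: usize, world_size: usize) -> Vec<f32> {
    assert_eq!(weight.len(), rows * cols);
    assert_eq!(rows % world_size, 0, "rows must divide evenly across ranks");
    let rows_per_rank = rows / world_size;
    let start = rank * rows_per_rank * cols;
    let end = start + rows_per_rank * cols;
    weight[start..end].to_vec()
}
```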
c105550405
s/panic/bail/
2023-08-25 18:05:07 +02:00
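The s/panic/bail/ change replaces process-aborting panics with recoverable errors returned through `Result`. A hedged sketch of the pattern using anyhow's `bail!` macro (the exact error type used in the example may differ):

```rust
use anyhow::{bail, Result};

// Before: panic!("unsupported dtype ...") would abort the whole process.
// After: bail! returns an Err that callers can handle or report cleanly.
fn check_dtype(dtype: &str) -> Result<()> {
    if dtype != "f16" && dtype != "bf16" {
        bail!("unsupported dtype {dtype}");
    }
    Ok(())
}
```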
4826a4212e
Adding support for codellama in examples.
...
Codellama requires bf16 for now (converting from bf16 to f16 raises an error).
The multiprocess demo is not functional for it because flash-attn only supports
f16 for now.
2023-08-25 09:56:11 +00:00
c78ce76501
Add a simple Module trait and implement it for the various nn layers ( #500 )
...
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
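The Module trait boils down to a single forward method that every layer implements, so layers can be composed and called uniformly. A minimal sketch of the idea with placeholder types (the real trait lives in candle and operates on candle tensors) is:

```rust
// Placeholder types standing in for candle's Tensor and error handling.
type Tensor = Vec<f32>;
type Result<T> = std::result::Result<T, Box<dyn std::error::Error>>;

// A single-method trait: every layer just maps a tensor to a tensor.
trait Module {
    fn forward(&self, xs: &Tensor) -> Result<Tensor>;
}

// Example implementation: an element-wise scaling "layer".
struct Scale(f32);

impl Module for Scale {
    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        Ok(xs.iter().map(|&x| x * self.0).collect())
    }
}

// Layers implementing the same trait can be chained generically.
fn run_sequential(layers: &[&dyn Module], xs: &Tensor) -> Result<Tensor> {
    let mut out = xs.clone();
    for layer in layers {
        out = layer.forward(&out)?;
    }
    Ok(out)
}
```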
13401df4d1
Add an abstract type for RmsNorm. ( #499 )
2023-08-18 08:52:14 +01:00
03be33eea4
Relax the requirements on CustomOp. ( #486 )
...
* Relax the requirements on CustomOp.
* Simplify the custom-ops when no backward is required.
2023-08-17 11:12:05 +01:00
d32e8199cd
Layer norm tweaks ( #482 )
...
* Add some options to make layer-norm more configurable.
* Add the rms-norm variant.
* Replace the RmsNorm with the shared bits.
2023-08-17 10:07:13 +01:00
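The rms-norm variant mentioned above normalizes by the root mean square of the activations instead of subtracting the mean, and has no bias term. A minimal sketch of the formula in plain Rust (not the candle-nn layer itself), assuming a per-channel weight:

```rust
// RMSNorm: y_i = x_i / sqrt(mean(x^2) + eps) * w_i
// Unlike layer-norm it does not center the activations.
fn rms_norm(x: &[f32], weight: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = x.iter().map(|&v| v * v).sum::<f32>() / x.len() as f32;
    let scale = 1.0 / (mean_sq + eps).sqrt();
    x.iter().zip(weight).map(|(&v, &w)| v * scale * w).collect()
}
```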
906c0f3eb5
Remove the checkpoint conversion script. ( #405 )
...
* Remove the checkpoint conversion script.
* Remove references to the script.
2023-08-11 05:59:48 +01:00
51e51da896
Rename the candle crate to candle-core ( #301 )
...
* Rename to candle-core.
* More candle-core renaming.
2023-08-02 08:20:22 +01:00
97d8712ba5
Remove single function.
2023-07-28 23:31:25 +02:00
97181a77c0
Making multiprocess require flash-attn.
2023-07-28 23:31:24 +02:00
7513a5e005
Line-up the llama implementation with the python-transformers one. ( #271 )
...
* Line-up the llama implementation with the python-transformers one.
* Also lineup the multiprocess version.
2023-07-28 18:31:28 +01:00
3eb2bc6d07
Softmax numerical stability. ( #267 )
...
* Softmax numerical stability.
* Fix the flash-attn test.
2023-07-28 13:13:01 +01:00
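The standard trick behind softmax numerical stability is subtracting the row maximum before exponentiating, which prevents overflow without changing the result. A minimal standalone Rust sketch (not candle's actual implementation):

```rust
// Numerically stable softmax over a slice: subtracting the maximum before
// exponentiating keeps exp() from overflowing for large logits; the shared
// factor cancels out in the normalization, so the output is unchanged.
fn softmax_stable(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    // Without the max subtraction, exp(1000.0) would overflow to infinity.
    let probs = softmax_stable(&[1000.0, 999.0, 998.0]);
    println!("{probs:?}");
}
```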
25a2086e8f
Putting back Send + Sync
2023-07-27 09:58:47 +02:00
7c7e6ba201
Removing inner dependency on safetensors.
2023-07-27 09:58:47 +02:00
ed58de7551
Fixed TP sharded version.
2023-07-27 09:58:46 +02:00
1735e4831e
TP sharding v2
2023-07-27 09:58:14 +02:00