03348e2e6f
Update mistral README.md ( #995 )
2023-09-29 12:24:32 +01:00
49fa184a35
Mistral readme ( #994 )
...
* Mistral: print the generated text.
* Add mistral to the readmes.
2023-09-29 11:50:50 +01:00
6f17ef82be
Mistral: print the generated text. ( #992 )
2023-09-29 10:56:11 +01:00
ada8851a23
Add the mistral example. ( #984 )
...
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00
2dd43d6cdd
add eos token to phi example ( #965 )
...
* add eos token to phi example
* rustfmt + get the token directly.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-26 09:21:22 +01:00
c78a294323
Add some repeat penalty to the phi example. ( #961 )
2023-09-25 20:53:30 +01:00
a36d883254
Use a single flag for the point argument. ( #958 )
2023-09-25 12:53:24 +01:00
7f2bbcf746
[segment-anything] Support multi-point as the prompt input ( #945 )
...
* [sam] Support multi-point prompts
* [segment-anything] Pass points by reference
* [segment-anything] Update example code and image
* Fix clippy lint.
---------
Co-authored-by: Yun Ding <yunding@nvidia.com>
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-25 12:14:10 +01:00
1ce7fe2543
Add more examples to the phi readme. ( #956 )
2023-09-24 18:19:05 +01:00
f5069dd354
Use the repo for the quantized phi model. ( #954 )
2023-09-24 16:30:26 +01:00
0007ae9c11
Add the quantized mixformer model. ( #953 )
...
* Add the quantized mixformer model.
* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
bcb0ed8f1c
Self-contained safetensors for the multiprocess llama example. ( #950 )
2023-09-24 06:54:49 +01:00
bb3471ea31
Adapt more examples to the updated safetensor api. ( #947 )
...
* Simplify the safetensor usage.
* Convert more examples.
* Move more examples.
* Adapt stable-diffusion.
2023-09-23 21:26:03 +01:00
890d069092
Self-contained safetensor wrappers ( #946 )
...
* Self-contained safetensor wrappers.
* Use the new safetensor container in varbuilders.
2023-09-23 20:39:52 +01:00
5dbe46b389
Add tracing. ( #943 )
2023-09-23 16:55:46 +01:00
b54acfa3d0
Tracing for the phi model ( #936 )
...
* Add some tracing bits to mixformers.
* Add the missing file.
* Add the conv2d layer to with-tracing.
* Improve the tracing usage.
2023-09-23 09:19:34 +01:00
912a3d63b0
Use the proper block size for quantizing models. ( #933 )
...
* Use the proper block size for quantizing models.
* Use the proper dimension.
2023-09-22 21:36:56 +01:00
df6f5240ba
Complete the mixformer implementation. ( #930 )
...
* Complete the mixformers implementation.
* Tweak the attention.
* Add the phi-1.5 example.
* Improve the phi example.
* Bugfix.
* Get the phi example to work.
2023-09-22 20:03:16 +01:00
aa8ec06fd2
Add the t5-xxl version. ( #924 )
2023-09-21 14:48:13 +01:00
b43ca493f6
Add more quantized flan t5 variants ( #923 )
...
* Add the quantized flan-t5-large variant.
* Add more sizes.
2023-09-21 13:23:30 +01:00
3b557765e8
T5 quantized example ( #922 )
...
* Load gguf files for the quantized t5.
* Add the quantized t5 example.
* Allow for loading local files.
* Add some support for quantizing safetensor files.
* Transpose before quantizing.
* Quantized t5.
* Retrieve the weights from the hub.
2023-09-21 12:33:15 +01:00
9b24d89d2d
Tracing mode for T5. ( #913 )
...
* Tracing mode for T5.
* Tracing for the linear layer.
2023-09-20 15:03:35 +01:00
fb1c2ac535
Add flash-attn support. ( #912 )
...
* Add flash-attn support.
* Add the use-flash-attn flag.
* Re-enable flash-attn.
2023-09-20 14:07:55 +01:00
728e167334
Add details on wuerstchen. ( #911 )
2023-09-20 13:09:35 +01:00
c0b49d5a50
Wuerstchen parameter tweaks. ( #907 )
2023-09-20 09:26:24 +01:00
098dd0d1e9
fix: add missing top_p in llama_multiprocess ( #905 )
2023-09-20 08:54:56 +01:00
67a486d18d
Line-up the wuerstchen model with the python implementation. ( #901 )
...
* Line-up the wuerstchen model with the python implementation.
* Missing cos.
* Fix the picture denormalization.
2023-09-19 21:59:44 +01:00
8696f64bae
Fix T5 kv cache ( #899 )
...
* Fix T5 kv cache
* Add argument for decoder prompt
* Fix range
2023-09-19 20:36:15 +01:00
4f91c8e109
Improve the error message on shape mismatch for cat. ( #897 )
...
* Improve the error message on shape mismatch for cat.
* Cosmetic tweak.
2023-09-19 15:09:47 +01:00
06e46d7c3b
Only use classifier free guidance for the prior. ( #896 )
...
* Only use classifier free guidance for the prior.
* Add another specific layer-norm structure.
* Tweaks.
* Fix the latent shape.
* Print the prior shape.
* More shape fixes.
* Remove some debugging continue.
2023-09-19 14:13:05 +01:00
aaa9d4ed6c
W decoding. ( #893 )
...
* W decoding.
* Add the diffusion loop.
* Use the appropriate config.
2023-09-19 07:13:44 +01:00
1542e92629
T5: Add option to override use_cache from config ( #892 )
...
* Add option to override use_cache from config
* Disable cache by default and cleanup code
2023-09-18 20:20:21 +01:00
82a98f6da0
Prior denoising. ( #889 )
2023-09-18 16:51:38 +01:00
5082954c52
Fix the W clip embeddings. ( #887 )
...
* Fix the W clip embeddings.
* Add the specialized ddpm scheduler.
2023-09-18 14:50:14 +01:00
7dd8e12472
Bump the crate versions to v0.2.3. ( #886 )
...
* Bump the crate version.
* Also update the python bindings.
2023-09-18 12:14:03 +01:00
c2b866172a
More Wuerstchen fixes. ( #882 )
...
* More Wuerstchen fixes.
* More shape fixes.
* Add more of the prior specific bits.
* Broadcast add.
* Fix the clip config.
* Add some masking options to the clip model.
2023-09-17 22:08:11 +01:00
5f83c13f17
Add the DDPM scheduler. ( #877 )
...
* Add the DDPM scheduler.
* Minor tweaks.
2023-09-17 15:03:01 +01:00
db3e9dae04
Wuerstchen main ( #876 )
...
* Wuerstchen main.
* More of the wuerstchen cli example.
* Paella creation.
* Build the prior model.
* Fix the weight file names.
2023-09-17 12:46:38 +01:00
7f65af1f0d
Avoid re-encoding the input in the T5 example. ( #875 )
2023-09-17 10:25:54 +01:00
eeb54716dd
Tweaks for the T5 example. ( #874 )
2023-09-17 10:05:15 +01:00
1a276b5da7
Add a KV cache to T5. ( #873 )
...
* Add a KV cache to T5.
* Suggest using release mode.
* Use the kv cache in decoding.
* Add a comment.
2023-09-17 08:00:45 +01:00
3e49f8fce5
Implement T5 decoding ( #864 )
...
* Load t5 decoder
* Run enc, dec, and lm head, but no cross attn
* Cross-attention over key_value_states
* New arg for decoder input ids
* Add mask, don't forward position biases through decoder
* Update t5 examples
* Clippy + rustfmt
2023-09-15 22:05:12 +02:00
31ab2ddaeb
Remove the padding. ( #838 )
2023-09-13 13:00:59 +01:00
3e94324012
Add some sentence similarity part to the t5 example. ( #835 )
...
* Add some sentence similarity part to the t5 example.
* Clippy fix.
2023-09-13 10:44:02 +01:00
e6f040d6e3
Readme gallery ( #834 )
...
* More readme tweaks.
* Update README.md
2023-09-13 09:05:47 +01:00
cbd36157ac
Add a gif to the quantized readme. ( #833 )
...
* Add a gif to the quantized readme.
* gif update.
2023-09-13 08:43:52 +01:00
e4553fb355
T5 tweaks ( #831 )
...
* Use default values rather than options.
* Avoid exposing the device field.
* More tweaks.
2023-09-13 07:37:04 +01:00
9daa6dbe87
Extract T5 module and add main function to use it ( #829 )
...
* Extract t5 out of musicgen
* Add main for t5 module
2023-09-13 07:14:05 +01:00
e82fcf1c59
Add more example readmes. ( #828 )
...
* Add more readmes.
* Add a readme for dinov2.
* Add some skeleton files for a couple more examples.
* More whisper details.
2023-09-12 17:21:24 +01:00
805bf9ffa7
Implement top_p / nucleus sampling ( #819 )
...
* Implement top_p / nucleus sampling
* Update changelog
* rustfmt
* Add tests
* Fix clippy warning
* Fix another clippy error
2023-09-12 18:10:16 +02:00