c05a348e36
Add the Mistral 7b model ( #983 )
...
* Start sketching the mistral 7b model.
* Add the kv cache.
* Add the decoder layer.
* Add the mistral model.
* Rotary embeddings.
* Add the attention mask.
2023-09-28 14:29:41 +01:00
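The bullet points above list the usual pieces of a decoder-only model. As a hedged illustration of the rotary-embeddings step (a standalone sketch, not candle's actual implementation, which may use a different pairing convention), each pair of channels is rotated by a position-dependent angle:

```rust
// Minimal rotary position embedding sketch: pair (x[2i], x[2i+1]) is
// rotated by angle position * theta_i, with theta_i = base^(-2i/dim).
// This interleaved-pair convention is an assumption for illustration.
fn rope(x: &mut [f32], position: usize, base: f32) {
    let dim = x.len();
    for i in 0..dim / 2 {
        let theta = base.powf(-2.0 * i as f32 / dim as f32);
        let (sin, cos) = (position as f32 * theta).sin_cos();
        let (x0, x1) = (x[2 * i], x[2 * i + 1]);
        x[2 * i] = x0 * cos - x1 * sin;
        x[2 * i + 1] = x0 * sin + x1 * cos;
    }
}

fn main() {
    let mut q = vec![1.0f32, 0.0, 0.0, 1.0];
    // position 0 applies a zero rotation, leaving the vector unchanged
    rope(&mut q, 0, 10000.0);
    assert_eq!(q, vec![1.0, 0.0, 0.0, 1.0]);
    // any rotation preserves the norm of each channel pair
    rope(&mut q, 3, 10000.0);
    assert!((q[0] * q[0] + q[1] * q[1] - 1.0).abs() < 1e-5);
}
```

Because the rotation depends only on the absolute position, cached keys from earlier steps stay valid as decoding advances.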
25657804ef
Simd128 q2k vecdot ( #982 )
...
* Sketch the simd128 version of q2k vecdot.
* Use a single accumulator.
2023-09-28 12:16:35 +01:00
5e1c595e00
Optimize the index-select cuda kernel. ( #976 )
2023-09-28 09:05:29 +01:00
8a49e01b9d
Add the remaining quantized tests to the wasm suite. ( #980 )
2023-09-28 08:42:56 +01:00
9cb110c44c
Sketch a simd128 optimized q4k vecdot. ( #977 )
...
* Sketch a simd128 optimized q4k vecdot.
* Simdify.
* More quantization optimizations.
* Again more simdification.
* Simdify the splitting loop.
2023-09-27 20:19:38 +01:00
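The q4k work above operates on 4-bit quantized blocks. As a hedged sketch of one ingredient (the nibble layout, simplified from the real super-block format, and assuming low-nibble-first order), two 4-bit values share each byte:

```rust
// Pack pairs of 4-bit quantized values (each in 0..=15) into bytes,
// low nibble first; assumes an even-length input slice.
fn pack_nibbles(qs: &[u8]) -> Vec<u8> {
    qs.chunks(2).map(|c| c[0] | (c[1] << 4)).collect()
}

fn unpack_nibbles(packed: &[u8]) -> Vec<u8> {
    packed.iter().flat_map(|&b| [b & 0x0f, b >> 4]).collect()
}

fn main() {
    let qs: Vec<u8> = (0..16).collect();
    let packed = pack_nibbles(&qs);
    // 16 quantized values fit in 8 bytes and round-trip losslessly
    assert_eq!(packed.len(), 8);
    assert_eq!(unpack_nibbles(&packed), qs);
}
```

A simd128 kernel gets its speedup by unpacking and multiply-accumulating many such nibbles per vector instruction instead of one byte at a time.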
667f01c173
Simd128 vec-dot for q4_0. ( #974 )
...
* Simd128 vec-dot for q4_0.
* Bugfix.
* Add wasm tests.
* Bugfix for the q40 vecdot.
* More quantization tests.
2023-09-27 14:15:30 +01:00
e59784e353
simd128 optimized q8_0 vecdot ( #972 )
...
* wasm/simd128 version of the quantized q8_0 vecdot.
* Add the missing conversion.
2023-09-27 11:03:20 +01:00
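For context on what the q8_0 vecdot computes, here is a hedged scalar reference (a simplification: the real ggml-style format stores f16 scales and uses fixed blocks of 32; the struct and function names are illustrative only):

```rust
// Simplified q8_0-style block: 32 int8 quants plus one f32 scale.
const BLOCK: usize = 32;

struct BlockQ8 {
    scale: f32,
    qs: [i8; BLOCK],
}

fn quantize_q8(xs: &[f32; BLOCK]) -> BlockQ8 {
    let amax = xs.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = amax / 127.0;
    let inv = if scale == 0.0 { 0.0 } else { 1.0 / scale };
    let mut qs = [0i8; BLOCK];
    for (q, &x) in qs.iter_mut().zip(xs) {
        *q = (x * inv).round() as i8;
    }
    BlockQ8 { scale, qs }
}

fn vec_dot(a: &BlockQ8, b: &BlockQ8) -> f32 {
    // accumulate integer products first, then apply both scales once
    let isum: i32 = a.qs.iter().zip(&b.qs).map(|(&x, &y)| x as i32 * y as i32).sum();
    isum as f32 * a.scale * b.scale
}

fn main() {
    let xs = [0.5f32; BLOCK];
    let a = quantize_q8(&xs);
    // dot of a constant 0.5 vector with itself over 32 elements is 8.0
    assert!((vec_dot(&a, &a) - 8.0).abs() < 0.05);
}
```

The simd128 version vectorizes the integer multiply-accumulate inside each block; the per-block scaling stays scalar.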
29bd6b2979
Phi 1.5 wasm module ( #966 )
...
* add phi wasm module
* replace input with textarea
* trim input prompt
* stop on <|endoftext|>
* formatting
* clean up
* add blurb, and syntax highlighting
* add phi-v1.5 wasm
* add note
* hide Options on details
* add first token to generated text
* whitespaces for new line
* fix: abort -> aborted
2023-09-27 06:07:11 +01:00
9571b200c9
fix firstToken, minor ui changes ( #971 )
2023-09-27 06:01:59 +01:00
ce0a4e3a85
Use the gelu-erf activation. ( #969 )
2023-09-26 22:30:21 +01:00
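Unlike the common tanh-based approximation, gelu-erf uses the exact Gaussian CDF, gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2))). A standalone sketch (std Rust has no `erf`, so this substitutes the Abramowitz-Stegun polynomial approximation, accurate to about 1.5e-7):

```rust
// erf via the Abramowitz-Stegun 7.1.26 polynomial approximation,
// extended to negative inputs by oddness.
fn erf(x: f32) -> f32 {
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = t
        * (0.254829592
            + t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - poly * (-x * x).exp())
}

fn gelu_erf(x: f32) -> f32 {
    0.5 * x * (1.0 + erf(x / std::f32::consts::SQRT_2))
}

fn main() {
    assert_eq!(gelu_erf(0.0), 0.0);
    // large positive inputs pass through, large negative ones vanish
    assert!((gelu_erf(10.0) - 10.0).abs() < 1e-4);
    assert!(gelu_erf(-10.0).abs() < 1e-4);
    // gelu(1) is roughly 0.8413
    assert!((gelu_erf(1.0) - 0.8413).abs() < 1e-3);
}
```

The erf and tanh variants differ by at most a few thousandths, but matching the activation used at training time avoids small but systematic output drift.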
4abc1ea34d
Avoid some overflows on wasm32. ( #968 )
2023-09-26 11:15:38 +01:00
2dd43d6cdd
add eos token to phi example ( #965 )
...
* add eos token to phi example
* rustfmt + get the token directly.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com >
2023-09-26 09:21:22 +01:00
1fcac4afed
Expose a function to clear the KV cache on mixformers. ( #964 )
2023-09-26 05:41:07 +01:00
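A KV-cache clear of the kind exposed here resets the decoder state between prompts. A hedged minimal sketch (names and layout are hypothetical, not the mixformer code):

```rust
// Minimal single-head KV cache: keys/values accumulate across decoding
// steps, stored flat as seq_len * head_dim floats.
#[derive(Default)]
struct KvCache {
    k: Vec<f32>,
    v: Vec<f32>,
}

impl KvCache {
    fn append(&mut self, k: &[f32], v: &[f32]) {
        self.k.extend_from_slice(k);
        self.v.extend_from_slice(v);
    }
    // Without this reset, a second prompt would attend over the
    // first prompt's stale keys and values.
    fn clear(&mut self) {
        self.k.clear();
        self.v.clear();
    }
    fn seq_len(&self, head_dim: usize) -> usize {
        self.k.len() / head_dim
    }
}

fn main() {
    let mut cache = KvCache::default();
    cache.append(&[1.0, 2.0], &[3.0, 4.0]);
    cache.append(&[5.0, 6.0], &[7.0, 8.0]);
    assert_eq!(cache.seq_len(2), 2);
    cache.clear();
    assert_eq!(cache.seq_len(2), 0);
}
```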
a084f65f9a
fix rep penalty min value ( #963 )
2023-09-26 05:23:50 +01:00
c798184c2b
Configurable layer idx for the lstm layer. ( #962 )
2023-09-25 21:31:14 +01:00
c78a294323
Add some repeat penalty to the phi example. ( #961 )
2023-09-25 20:53:30 +01:00
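The repeat penalty added here (and whose minimum value is fixed in #963) is conventionally the CTRL-style scheme: logits of already-generated tokens are pushed toward zero so they are less likely to be re-sampled. A hedged sketch of that scheme, not the phi example's exact code:

```rust
// Scale down the logits of previously generated tokens: positive
// logits are divided by the penalty, negative ones multiplied, so the
// adjustment always moves away from re-selecting the token.
// Note: tokens repeated in `prev_tokens` are penalized repeatedly;
// a real implementation would typically deduplicate first.
fn apply_repeat_penalty(logits: &mut [f32], penalty: f32, prev_tokens: &[usize]) {
    for &t in prev_tokens {
        if let Some(l) = logits.get_mut(t) {
            *l = if *l >= 0.0 { *l / penalty } else { *l * penalty };
        }
    }
}

fn main() {
    let mut logits = vec![2.0f32, -2.0, 1.0];
    apply_repeat_penalty(&mut logits, 2.0, &[0, 1]);
    // token 2 was never generated, so its logit is untouched
    assert_eq!(logits, vec![1.0, -4.0, 1.0]);
}
```

A penalty of 1.0 is a no-op; values below 1.0 would *encourage* repetition, which is presumably what the minimum-value fix guards against.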
a36d883254
Use a single flag for the point argument. ( #958 )
2023-09-25 12:53:24 +01:00
7f2bbcf746
[segment-anything] Support multi-point as the prompt input ( #945 )
...
* [sam] Support multi-point prompts
* [segment-anything] Pass points by reference
* [segment-anything] Update example code and image
* Fix clippy lint.
---------
Co-authored-by: Yun Ding <yunding@nvidia.com >
Co-authored-by: laurent <laurent.mazare@gmail.com >
2023-09-25 12:14:10 +01:00
dc47224ab9
Override the default cudnn heuristics. ( #957 )
2023-09-25 10:31:53 +01:00
1ce7fe2543
Add more examples to the phi readme. ( #956 )
2023-09-24 18:19:05 +01:00
402ddcfcb4
Add the missing kernel. ( #955 )
2023-09-24 17:21:37 +01:00
f5069dd354
Use the repo for the quantized phi model. ( #954 )
2023-09-24 16:30:26 +01:00
0007ae9c11
Add the quantized mixformer model. ( #953 )
...
* Add the quantized mixformer model.
* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
e15862cfdb
Shared the quantized var-builder code. ( #952 )
...
* Shared the quantized var-builder code.
* Fix compilation.
2023-09-24 12:55:07 +01:00
4aeb449017
Deprecate the VarBuilder::from_safetensors function. ( #951 )
2023-09-24 11:18:17 +01:00
bcb0ed8f1c
Self-contained safetensors for the multiprocess llama example. ( #950 )
2023-09-24 06:54:49 +01:00
7edd755756
Pass directly the buffer ownership. ( #949 )
2023-09-24 06:34:44 +01:00
e32c89d90c
Add the buffered safetensor wrapper. ( #948 )
2023-09-23 22:57:42 +01:00
bb3471ea31
Adapt more examples to the updated safetensor api. ( #947 )
...
* Simplify the safetensor usage.
* Convert more examples.
* Move more examples.
* Adapt stable-diffusion.
2023-09-23 21:26:03 +01:00
890d069092
Self-contained safetensor wrappers ( #946 )
...
* Self-contained safetensor wrappers.
* Use the new safetensor container in varbuilders.
2023-09-23 20:39:52 +01:00
5dbe46b389
Add tracing. ( #943 )
2023-09-23 16:55:46 +01:00
ccf352f3d1
Use yoke to provide a self-referential container for mmaped safetenso… ( #939 )
...
* Use yoke to provide a self-referential container for mmaped safetensor files.
* Add the new self-owned type for safetensor files without removing the previous version.
* Add routing.
* Add an initializer for the case of multiple files.
2023-09-23 15:43:11 +01:00
402d207f0f
VarMap setter functions ( #938 )
...
* Add some setter helper functions for varmap.
* Add more comments.
2023-09-23 10:27:51 +01:00
7582937a32
Add the causal mask in mixformer. ( #937 )
2023-09-23 09:50:26 +01:00
b54acfa3d0
Tracing for the phi model ( #936 )
...
* Add some tracing bits to mixformers.
* Add the missing file.
* Add the conv2d layer to with-tracing.
* Improve the tracing usage.
2023-09-23 09:19:34 +01:00
cda1786eed
smaller t5 models quantized ( #934 )
2023-09-22 22:31:23 +01:00
912a3d63b0
Use the proper block size for quantizing models. ( #933 )
...
* Use the proper block size for quantizing models.
* Use the proper dimension.
2023-09-22 21:36:56 +01:00
3ef328c53d
Mention the new phi model in the readme. ( #932 )
2023-09-22 21:24:51 +01:00
0c8e983514
update link to t5 ( #931 )
2023-09-22 20:30:01 +01:00
df6f5240ba
Complete the mixformer implementation. ( #930 )
...
* Complete the mixformers implementation.
* Tweak the attention.
* Add the phi-1.5 example.
* Improve the phi example.
* Bugfix.
* Get the phi example to work.
2023-09-22 20:03:16 +01:00
a46b1b4657
Mixformer ( #929 )
...
* Sketch the mixformer model.
* More modeling code.
* More mixformers.
* MixFormer creation.
* More mixformers.
2023-09-22 16:17:14 +01:00
19e52e5007
T5 Wasm ( #918 )
...
* init t5 wasm model
* split workers for each model
* clean up
* add some ui
* readme
* index
* typo
* remove cache param, clear_kv_cache
* add max_length as param
* add model tasks option to ui
* add method to load quantized gguf from buffer
* Add quantized wasm module
* add quantized models to UI, dynamic import wasms
* link to quantized
* fix copy
* fix ModelEncoder
* fix README.md
2023-09-22 15:31:10 +01:00
8601537e31
Add slice-scatter. ( #927 )
...
* Add slice-scatter.
* Add the op.
* Make transpose be a no-op when the dimensions are identical.
* Add the backprop.
* And add some gradient test.
2023-09-22 12:18:16 +01:00
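Slice-scatter writes a source tensor into a slice of a destination, leaving the rest untouched. A hedged sketch on a flattened 2-D layout (illustrative only; the real op works on arbitrary dims and is differentiable, per the backprop bullet above):

```rust
// Scatter `src` (src_rows x cols) into `dst` (rows x cols) starting at
// row `start` along dimension 0; other rows are left as-is.
fn slice_scatter(dst: &mut [f32], src: &[f32], cols: usize, start: usize) {
    let offset = start * cols;
    dst[offset..offset + src.len()].copy_from_slice(src);
}

fn main() {
    // 3x2 destination of zeros, scatter one 1x2 row at row index 1
    let mut dst = vec![0.0f32; 6];
    slice_scatter(&mut dst, &[1.0, 2.0], 2, 1);
    assert_eq!(dst, vec![0.0, 0.0, 1.0, 2.0, 0.0, 0.0]);
}
```

The gradient structure is simple: the destination's gradient flows through everywhere except the overwritten slice, whose gradient flows to the source instead.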
4ac6039a42
Merge branch 'main' into book-trainin-simplified
2023-09-22 11:01:23 +06:00
52a60ca3ad
https://github.com/huggingface/candle/issues/637
2023-09-22 10:57:11 +06:00
a96878f235
cuda cast i64 ( #925 )
2023-09-21 19:52:39 +01:00
aa8ec06fd2
Add the t5-xxl version. ( #924 )
2023-09-21 14:48:13 +01:00
b43ca493f6
Add more quantized flan t5 variants ( #923 )
...
* Add the quantized flan-t5-large variant.
* Add more sizes.
2023-09-21 13:23:30 +01:00
3b557765e8
T5 quantized example ( #922 )
...
* Load gguf files for the quantized t5.
* Add the quantized t5 example.
* Allow for loading local files.
* Add some support for quantizing safetensor files.
* Transpose before quantizing.
* Quantized t5.
* Retrieve the weights from the hub.
2023-09-21 12:33:15 +01:00
2619c4307f
Add a quantized version of the t5 model. ( #921 )
2023-09-21 11:13:39 +01:00