904bbdae65
Make the Python Wrapper more Hackable and simplify Quantization ( #1010 )
...
* Some first `Module` implementations
* Add `state_dict` and `load_state_dict` functionality
* Move modules around and create `candle.nn.Linear`
* Add `nn.Embedding` and `nn.LayerNorm`
* Add BERT implementation
* Batch q-matmul
* Automatically dequantize `QTensors` if a `Tensor` is expected
* Add Module `.to()`, `.cuda()`, `.cpu()` and `.type()` functionality
* Unit tests for `Module`, `Tensor` and `candle.utils`
* Add `pytorch` like slicing to `Tensor`
* Cleanup and BERT fixes
* `black` formatting + unit-test for `nn.Linear`
* Refactor slicing implementation
2023-10-06 19:01:07 +01:00
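The wrapper entry above mirrors PyTorch's `Module` / `state_dict` conventions. As a rough illustration of that pattern only (plain Python, not candle's actual bindings; the class and method bodies here are assumptions), a module flattens its parameters into a dict and can restore them from one:

```python
# Minimal sketch of the PyTorch-style Module pattern described above.
# This is NOT candle's actual API; names and layout are illustrative only.

class Module:
    def __init__(self):
        self._params = {}      # name -> tensor-like value
        self._children = {}    # name -> sub-module

    def state_dict(self, prefix=""):
        """Flatten the parameters of this module and its children into one dict."""
        out = {prefix + k: v for k, v in self._params.items()}
        for name, child in self._children.items():
            out.update(child.state_dict(prefix=f"{prefix}{name}."))
        return out

    def load_state_dict(self, state, prefix=""):
        """Restore parameters from a flat dict produced by state_dict()."""
        for k in self._params:
            self._params[k] = state[prefix + k]
        for name, child in self._children.items():
            child.load_state_dict(state, prefix=f"{prefix}{name}.")


class Linear(Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        # Real code would initialize tensors; zeros keep the sketch dependency-free.
        self._params["weight"] = [[0.0] * in_dim for _ in range(out_dim)]
        self._params["bias"] = [0.0] * out_dim


layer = Linear(4, 2)
sd = layer.state_dict()      # {'weight': ..., 'bias': ...}
layer.load_state_dict(sd)    # round-trips cleanly
print(sorted(sd.keys()))
```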
b0442eff8a
Sketch the stable-lm model. ( #1045 )
2023-10-06 18:19:06 +01:00
4631c48273
Remove some todos. ( #1042 )
2023-10-05 22:42:20 +01:00
716883e9b0
Add the clamping for stable-diffusion. ( #1041 )
2023-10-05 22:20:39 +01:00
47c25a567b
feat: [SAM] allow downloading the result as a PNG ( #1035 )
...
* feat: allow downloading the result as a PNG
* feat: update function and wording
2023-10-05 22:14:47 +01:00
7f7d95e2c3
Add the round-to function. ( #1039 )
2023-10-05 20:28:09 +01:00
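For reference, rounding to N decimal places is an elementwise operation: scale up, round, scale back down. A small NumPy sketch of that semantics (assuming this is what the new round-to function does; not candle's code):

```python
import numpy as np

x = np.array([1.2345, -0.6789, 2.5001])

# Rounding to d decimal places is round(x * 10**d) / 10**d, applied elementwise.
d = 2
manual = np.round(x * 10**d) / 10**d
builtin = np.round(x, decimals=d)

print(manual)   # [ 1.23 -0.68  2.5 ]
print(builtin)  # identical
```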
f47bd9bab5
Delete invalid comment ( #1038 )
2023-10-05 19:28:08 +01:00
8f7973958c
fix: index_select CUDA kernel for src/target dims different from the ids dims when selecting along dim > 0 ( #1037 )
...
* fix: index_select CUDA kernel for src/target dims different from the ids dims when selecting along dim > 0
* cargo fmt
2023-10-05 18:46:13 +01:00
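For context on the fix above: index_select picks slices along one axis, so when the selected dim is greater than 0 the output keeps the source's leading dims and only the selected axis takes the length of the ids. A NumPy sketch of the expected semantics (not the CUDA kernel itself):

```python
import numpy as np

src = np.arange(24).reshape(2, 3, 4)   # shape (2, 3, 4)
ids = np.array([2, 0])                 # indices into dim 1

# index_select along dim=1: output shape becomes (2, len(ids), 4).
out = np.take(src, ids, axis=1)

assert out.shape == (2, 2, 4)
assert (out[:, 0, :] == src[:, 2, :]).all()
assert (out[:, 1, :] == src[:, 0, :]).all()
```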
f0c619a4af
Use AsRef<str> for set_one. ( #1033 )
2023-10-05 06:05:44 +01:00
b86ac0c507
Quant t5: Add coedit model to wasm demo and readme ( #1031 )
2023-10-04 20:57:33 +01:00
27e70a5093
Whisper quantized wasm ( #1028 )
...
* [Whisper] Update to use quantized model
* [whisper] add language detection
* [whisper] change assets location
* [whisper] adapt js example with quantized models
* [whisper] better task parsing
* [whisper] minor fixes
2023-10-04 20:22:57 +01:00
c18a856e76
Add the rounding operators. ( #1030 )
...
* Add the rounding operators.
* Avoid tracking gradients for the rounding operations.
* Add some rounding tests.
2023-10-04 17:58:44 +01:00
3349c89252
Add quantized t5 args for weight and config ( #1029 )
2023-10-04 17:02:49 +01:00
11d3687cc6
Simd128 optimized q8k vecdot. ( #1026 )
2023-10-03 15:29:48 +01:00
dac73edb34
AVX optimized q8k vecdot. ( #1024 )
2023-10-03 12:10:58 +01:00
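The SIMD128 and AVX q8k vec-dot work above optimizes the same scalar computation: block-quantized vectors store int8 quants plus one scale per block, and the dot product is the per-block integer dot rescaled by both block scales. A dependency-free reference sketch (the block layout is simplified; real ggml-style q8k blocks are larger and also carry auxiliary sums):

```python
# Scalar reference for a block-quantized dot product.
# Each block: (scale: float, quants: list of int8). Simplified vs. the real q8k layout.

def vec_dot(blocks_a, blocks_b):
    total = 0.0
    for (da, qa), (db, qb) in zip(blocks_a, blocks_b):
        # Integer dot product within the block, then rescale once per block.
        int_dot = sum(x * y for x, y in zip(qa, qb))
        total += da * db * int_dot
    return total

a = [(0.05, [3, -7, 12, 1]), (0.10, [0, 5, -2, 8])]
b = [(0.02, [1, 4, -3, 9]), (0.08, [7, -1, 2, 6])]
print(vec_dot(a, b))
```

The SIMD versions vectorize the inner integer dot product; the optimized-vs-unoptimized testing entry further down ( #1016 ) checks them against exactly this kind of scalar reference.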
b4da19d1be
Merge pull request #1023 from evgenyigumnov/simlified-book-polish
...
small misspelling and polish fix
2023-10-03 12:29:41 +02:00
ff513314fc
small misspelling and polish fix
2023-10-03 15:47:04 +06:00
043cc25766
Fix for the index-select cuda setup. ( #1022 )
...
* Fix for index-select.
* Better fix + add some testing.
2023-10-03 10:21:46 +01:00
7b06872f90
Merge pull request #926 from evgenyigumnov/book-trainin-simplified
...
Simplified book training example
2023-10-03 10:41:30 +02:00
65825e7240
[SAM] Add undo button and background point mode ( #1020 )
...
* [SAM] Add undo button and background point mode
* [SAM] remove pts on near clicks
* [SAM] check shiftKey toggle point mode
* [SAM] clear points when clearing image
2023-10-02 23:33:46 +01:00
7670fe7d1f
neon optimized q8k multiplication. ( #1021 )
...
* neon optimized q8k multiplication.
* Bugfixes.
* simdification.
2023-10-02 23:26:34 +01:00
cddfc3944c
Add the q8k vec-dot multiplication. ( #1019 )
2023-10-02 21:53:34 +01:00
089fc3b584
Improve the quantized whisper setup. ( #1018 )
...
* Improve the quantized whisper setup.
* Fix the config file paths.
* Use the standard matmul where possible.
2023-10-02 17:17:46 +01:00
e04c789230
Add a quantized variant of whisper ( #1017 )
...
* Add the quantized-whisper model.
* Quantized the whisper model.
* Adapt the whisper example to handle quantization.
* Add the quantized flag.
* Load the proper weights.
2023-10-02 14:59:53 +01:00
263a172202
Improve the testing of the optimized quantized vec-dot ops ( #1016 )
...
* Expose the unopt functions for testing.
* Better testing of the optimized quantized computations.
2023-10-02 09:50:43 +01:00
638ccf9f46
Fix include code.
2023-10-02 10:22:44 +02:00
0baf5a1e19
Fixed PR warnings.
2023-10-02 10:15:10 +02:00
5130a7da32
Simd128 version of q6k vec-dot. ( #1015 )
...
* Add a specific function for the simd128 q6k vec-dot.
* Simdification.
* More simdification.
2023-10-01 19:44:12 +01:00
41143db1af
[segment-anything] add multi point logic for demo site ( #1002 )
...
* [segment-anything] add multi point logic for demo site
* [segment-anything] remove libs and update functions
2023-10-01 18:25:22 +01:00
096dee7073
Bump the version to 0.3.0. ( #1014 )
...
* Bump the version to 0.3.0.
* Changelog update.
2023-10-01 13:51:57 +01:00
f6054e9d60
Fix the prompt for mistral when using instruct/interactive mode. ( #1013 )
2023-10-01 06:44:30 +01:00
328167ec04
Integrate TheBloke quantized mistral weights. ( #1012 )
2023-09-30 22:39:42 +01:00
4e55aaa51f
Simd128 version of the q2k-q8k vecdot product. ( #1011 )
...
* Sketch the simd128 version of q2k vecdot.
* Use a single accumulator.
* Simdify the q2k-q8k vecdot product.
* Cosmetic change.
2023-09-30 20:12:41 +01:00
deee7612da
Quantized version of mistral. ( #1009 )
...
* Quantized version of mistral.
* Integrate the quantized mistral variant.
* Use the quantized weight files.
* Tweak the quantization command.
* Fix the dtype when computing the rotary embeddings.
* Update the readme with the quantized version.
* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
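One bullet above touches rotary embeddings. For context, rotary position embeddings rotate pairs of feature channels by a position-dependent angle; the dtype fix presumably concerns the precision in which these trigonometric tables are computed. A generic NumPy sketch of one common RoPE variant (not candle's implementation):

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary embeddings to x of shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    half = dim // 2
    inv_freq = base ** (-np.arange(half) / half)            # per-channel frequency
    angles = np.arange(seq_len)[:, None] * inv_freq[None]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) channel pair by its position-dependent angle.
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)

x = np.random.randn(6, 8).astype(np.float32)
print(rope(x).shape)  # (6, 8)
```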
06207332bc
Streaming mode for reporting the generated tokens ( #1007 )
...
* Token streaming.
* Use the token output stream.
* Flush the output.
* Ensure that the last characters get reported.
2023-09-30 15:04:11 +01:00
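The streaming entry above reports tokens as they are generated. The usual pattern, sketched below in plain Python with a stand-in decoder (this is not the actual token-output-stream code): re-decode the accumulated tokens, print only the newly stabilized prefix, flush stdout so it appears immediately, and emit whatever remains once generation ends.

```python
import sys

def stream_tokens(token_iter, decode):
    """decode(tokens) -> str for the whole sequence; prints the text incrementally."""
    tokens, printed = [], 0
    for tok in token_iter:
        tokens.append(tok)
        text = decode(tokens)
        # Hold back a small tail in case the last token is part of an
        # unfinished multi-byte character or word piece.
        stable = text[:max(printed, len(text) - 8)]
        sys.stdout.write(stable[printed:])
        sys.stdout.flush()                 # "Flush the output."
        printed = len(stable)
    # "Ensure that the last characters get reported."
    text = decode(tokens)
    sys.stdout.write(text[printed:] + "\n")
    sys.stdout.flush()

# Toy usage with a decoder that just joins string "tokens".
stream_tokens(iter(["Hel", "lo", ", ", "wor", "ld", "!"]), lambda ts: "".join(ts))
```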
4021272875
Use flash-attn for mistral. ( #1004 )
2023-09-30 12:15:10 +01:00
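Flash attention produces the same output as standard scaled dot-product attention; it is a fused, memory-efficient kernel rather than a different formula. A plain NumPy reference of the computation it accelerates (single head, no mask):

```python
import numpy as np

def attention(q, k, v):
    """Reference scaled dot-product attention; flash-attn computes the same result."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                    # (seq_q, seq_k)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ v                               # (seq_q, d_v)

q = np.random.randn(5, 8)
k = np.random.randn(7, 8)
v = np.random.randn(7, 8)
print(attention(q, k, v).shape)  # (5, 8)
```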
87e3a4e175
Mistral: exit on eos token. ( #1001 )
...
* Mistral: exit on eos token.
* Print the proper stats.
* Also add a short flag.
2023-09-30 07:07:06 +01:00
6203ced495
Add negative prompts to segment-anything. ( #1000 )
2023-09-30 06:17:42 +01:00
34842fb234
[segment-anything] Print IOU values to help with debugging ( #999 )
2023-09-30 05:44:42 +01:00
d188d6a764
Fix the multiple points case for sam. ( #998 )
2023-09-29 22:39:43 +02:00
0ac2db577b
Add an entry about WSL slowness to the faq. ( #997 )
2023-09-29 17:04:52 +01:00
fc59bc31bf
fix: add missing gpu fill_* ( #996 )
2023-09-29 15:49:30 +01:00
03348e2e6f
Update mistral README.md ( #995 )
2023-09-29 12:24:32 +01:00
49fa184a35
Mistral readme ( #994 )
...
* Mistral: print the generated text.
* Add mistral to the readmes.
2023-09-29 11:50:50 +01:00
6f17ef82be
Mistral: print the generated text. ( #992 )
2023-09-29 10:56:11 +01:00
01b92cd959
fixes slice_scatter dim type ( #988 )
2023-09-29 07:54:45 +01:00
53510ce427
Use a silu activation in mistral. ( #991 )
2023-09-29 07:06:54 +01:00
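SiLU (also known as swish) is simply x · sigmoid(x); a one-liner for reference:

```python
import numpy as np

def silu(x):
    # silu(x) = x * sigmoid(x)
    return x * (1.0 / (1.0 + np.exp(-x)))

print(silu(np.array([-2.0, 0.0, 2.0])))  # ~[-0.238, 0.0, 1.762]
```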
23b3576c47
Add the sliding window. ( #986 )
2023-09-28 17:26:33 +01:00
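Sliding-window attention (Mistral's local attention) restricts each position to attend only to itself and the previous window-1 positions. A small NumPy sketch of one common formulation of the causal sliding-window mask:

```python
import numpy as np

def sliding_window_mask(seq_len, window):
    """True where position i may attend to position j: causal and within the window."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

print(sliding_window_mask(6, 3).astype(int))
# Row i has ones on columns i-2..i (clipped at 0): a banded lower-triangular mask.
```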
716ab2ccdc
Mistral gpu fix ( #985 )
...
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
* Fix when running on the gpu.
* More gpu fixes.
2023-09-28 16:38:13 +01:00
ada8851a23
Add the mistral example. ( #984 )
...
* Add the mistral example.
* Use the two model files.
* Adjust the dtype.
* Tweak the weight paths.
* Remove the end of text token.
* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00