Commit Graph

1417 Commits

Author SHA1 Message Date
ff513314fc small misspeling and polish fix 2023-10-03 15:47:04 +06:00
043cc25766 Fix for the index-select cuda setup. (#1022)
* Fix for index-select.

* Better fix + add some testing.
2023-10-03 10:21:46 +01:00
7b06872f90 Merge pull request #926 from evgenyigumnov/book-trainin-simplified
Book train simlified example
2023-10-03 10:41:30 +02:00
65825e7240 [SAM] Add undo button and background point mode (#1020)
* [SAM] Add undo button and background point mode

* [SAM] remove pts on near clicks

* [SAM] check shiftKey toggle point mode

* [SAM] clear points when clearing image
2023-10-02 23:33:46 +01:00
7670fe7d1f neon optimized q8k multiplication. (#1021)
* neon optimized q8k multiplication.

* Bugfixes.

* simdification.
2023-10-02 23:26:34 +01:00
cddfc3944c Add the q8k vec-dot multiplication. (#1019) 2023-10-02 21:53:34 +01:00
089fc3b584 Improve the quantized whisper setup. (#1018)
* Improve the quantized whisper setup.

* Fix the config file paths.

* Use the standard matmul where possible.
2023-10-02 17:17:46 +01:00
e04c789230 Add a quantized variant of whisper (#1017)
* Add the quantized-whisper model.

* Quantized the whisper model.

* Adapt the whisper example to handle quantization.

* Add the quantized flag.

* Load the proper weights.
2023-10-02 14:59:53 +01:00
263a172202 Improve the testing of the optimized quantized vec-dot ops (#1016)
* Expose the unopt functions for testing.

* Better testing of the optimized quantized computations.
2023-10-02 09:50:43 +01:00
638ccf9f46 Fix include code. 2023-10-02 10:22:44 +02:00
0baf5a1e19 Fixed PR warnings. 2023-10-02 10:15:10 +02:00
5130a7da32 Simd128 version of q6k vec-dot. (#1015)
* Add a specific function for the simd128 q6k vec-dot.

* Simdification.

* More simdification.
2023-10-01 19:44:12 +01:00
41143db1af [segment-anything] add multi point logic for demo site (#1002)
* [segment-anything] add multi point logic for demo site

* [segment-anything] remove libs and update functions
2023-10-01 18:25:22 +01:00
096dee7073 Bump the version to 0.3.0. (#1014)
* Bump the version to 0.3.0.

* Changelog update.
2023-10-01 13:51:57 +01:00
f6054e9d60 Fix the prompt for mistral when using instruct/interactive mode. (#1013) 2023-10-01 06:44:30 +01:00
328167ec04 Integrate TheBloke quantized mistral weights. (#1012) 2023-09-30 22:39:42 +01:00
4e55aaa51f Simd128 version of the q2k-q8k vecdot product. (#1011)
* Sketch the simd128 version of q2k vecdot.

* Use a single accumulator.

* Simdify the q2k-q8k vecdot product.

* Cosmetic change.
2023-09-30 20:12:41 +01:00
deee7612da Quantized version of mistral. (#1009)
* Quantized version of mistral.

* Integrate the quantized mistral variant.

* Use the quantized weight files.

* Tweak the quantization command.

* Fix the dtype when computing the rotary embeddings.

* Update the readme with the quantized version.

* Fix the decoding of the remaining tokens.
2023-09-30 18:25:47 +01:00
06207332bc Streaming mode for reporting the generated tokens (#1007)
* Token streaming.

* Use the token output stream.

* Flush the output.

* Ensure that the last characters get reported.
2023-09-30 15:04:11 +01:00
4021272875 Use flash-attn for mistral. (#1004) 2023-09-30 12:15:10 +01:00
87e3a4e175 Mistral: exit on eos token. (#1001)
* Mistral: exit on eos token.

* Print the proper stats.

* Also add a short flag.
2023-09-30 07:07:06 +01:00
6203ced495 Add negative prompts to segment-anything. (#1000) 2023-09-30 06:17:42 +01:00
34842fb234 [segment-anything] Print IOU values to help with debugging (#999) 2023-09-30 05:44:42 +01:00
d188d6a764 Fix the multiple points case for sam. (#998) 2023-09-29 22:39:43 +02:00
0ac2db577b Add an entry about WSL slowness to the faq. (#997) 2023-09-29 17:04:52 +01:00
fc59bc31bf fix: add missing gpu fill_* (#996) 2023-09-29 15:49:30 +01:00
03348e2e6f Update mistral README.md (#995) 2023-09-29 12:24:32 +01:00
49fa184a35 Mistral readme (#994)
* Mistral: print the generated text.

* Add mistral to the readmes.
2023-09-29 11:50:50 +01:00
6f17ef82be Mistral: print the generated text. (#992) 2023-09-29 10:56:11 +01:00
01b92cd959 fixes slice_scatter dim type (#988) 2023-09-29 07:54:45 +01:00
53510ce427 Use a silu activation in mistral. (#991) 2023-09-29 07:06:54 +01:00
23b3576c47 Add the sliding window. (#986) 2023-09-28 17:26:33 +01:00
716ab2ccdc Mistral gpu fix (#985)
* Add the mistral example.

* Use the two model files.

* Adjust the dtype.

* Tweak the weight paths.

* Remove the end of text token.

* Get the mistral model to generate some text.

* Fix when running on the gpu.

* More gpu fixes.
2023-09-28 16:38:13 +01:00
ada8851a23 Add the mistral example. (#984)
* Add the mistral example.

* Use the two model files.

* Adjust the dtype.

* Tweak the weight paths.

* Remove the end of text token.

* Get the mistral model to generate some text.
2023-09-28 16:19:18 +01:00
c05a348e36 Add the Mistral 7b model (#983)
* Start sketching the mistral 7b model.

* Add the kv cache.

* Add the decoder layer.

* Add the mistral model.

* Rotary embeddings.

* Add the attention mask.
2023-09-28 14:29:41 +01:00
25657804ef Simd128 q2k vecdot (#982)
* Sketch the simd128 version of q2k vecdot.

* Use a single accumulator.
2023-09-28 12:16:35 +01:00
5e1c595e00 Optimize the index-select cuda kernel. (#976) 2023-09-28 09:05:29 +01:00
8a49e01b9d Add the remaining quantized tests to the wasm suite. (#980) 2023-09-28 08:42:56 +01:00
9cb110c44c Sketch a simd128 optimized q4k vecdot. (#977)
* Sketch a simd128 optimized q4k vecdot.

* Simdify.

* More quantization optimizations.

* Again more simdification.

* Simdify the splitting loop.
2023-09-27 20:19:38 +01:00
667f01c173 Simd128 vec-dot for q4_0. (#974)
* Simd128 vec-dot for q4_0.

* Bugfix.

* Add wasm tests.

* Bugfix for the q40 vecdot.

* More quantization tests.
2023-09-27 14:15:30 +01:00
e59784e353 simd128 optimized q8_0 vecdot (#972)
* wasm/simd128 version of the quantized q8_0 vecdot.

* Add the missing conversion.
2023-09-27 11:03:20 +01:00
29bd6b2979 Phi 1.5 wasm module (#966)
* add phi wasm module

* replace input with textarea

* trim input prompt

* stop on <|endoftext|>

* formatting

* clean up

* add blurb, and syntax highlighting

* add phi-v1.5 wasm

* add note

* hide Options on details

* add first token to generated text

* whitespaces for new line

* fix: abort -> aborted
2023-09-27 06:07:11 +01:00
9571b200c9 fix firstToken, minor ui changes (#971) 2023-09-27 06:01:59 +01:00
ce0a4e3a85 Use the gelu-erf activation. (#969) 2023-09-26 22:30:21 +01:00
4abc1ea34d Avoid some overflows on wasm32. (#968) 2023-09-26 11:15:38 +01:00
2dd43d6cdd add eos token to phi example (#965)
* add eos token to phi example

* rustfmt + get the token directly.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-26 09:21:22 +01:00
1fcac4afed Expose a function to clear the KV cache on mixformers. (#964) 2023-09-26 05:41:07 +01:00
a084f65f9a fix rep penalty min value (#963) 2023-09-26 05:23:50 +01:00
c798184c2b Configurable layer idx for the lstm layer. (#962) 2023-09-25 21:31:14 +01:00
c78a294323 Add some repeat penalty to the phi example. (#961) 2023-09-25 20:53:30 +01:00