9c4b4f0da0
Metal quantized modifications proposal.
...
- Add a device param, wherever needed.
- Create new QMetal storage thing that implements QuantizedType.
- Update everywhere needed.
Fix Python.
Fixing examples.
Fix: fmt + clippy + stub.
Moving everything around.
Only missing the actual implems.
Fixing everything + adding dequantized kernels.
More work.
Fixing matmul.
Fmt + Clippy
Some clippy fixes.
Working state.
Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, new test catch it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented it seems
Q8K metal -> Never implemented in metal
Fixing Q2K bug (present in ggml).
2024-01-15 17:42:58 +01:00
1f1179913a
Update gloo requirement from 0.8 to 0.11 ( #1558 )
...
Updates the requirements on [gloo](https://github.com/rustwasm/gloo ) to permit the latest version.
- [Release notes](https://github.com/rustwasm/gloo/releases )
- [Changelog](https://github.com/rustwasm/gloo/blob/master/CHANGELOG.md )
- [Commits](https://github.com/rustwasm/gloo/commits )
---
updated-dependencies:
- dependency-name: gloo
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-10 16:27:20 +01:00
b4cb982e49
Simplifying our internal cargo dependencies. ( #1529 )
2024-01-07 12:04:14 +01:00
d35f0a1376
Bump the crate version to 0.3.3. ( #1490 )
2023-12-28 13:38:30 +01:00
94817dac56
Bump the crate version to 0.3.2. ( #1452 )
2023-12-17 05:34:53 -06:00
a209ce8ceb
Update for 0.3.1. ( #1324 )
2023-11-11 18:48:52 +00:00
b86ac0c507
Quant t5: Add coedit model to wasm demo and readme ( #1031 )
2023-10-04 20:57:33 +01:00
096dee7073
Bump the version to 0.3.0. ( #1014 )
...
* Bump the version to 0.3.0.
* Changelog update.
2023-10-01 13:51:57 +01:00
a084f65f9a
fix rep penalty min value ( #963 )
2023-09-26 05:23:50 +01:00
7edd755756
Pass directly the buffer ownership. ( #949 )
2023-09-24 06:34:44 +01:00
cda1786eed
smaller t5 models quantized ( #934 )
2023-09-22 22:31:23 +01:00
19e52e5007
T5 Wasm ( #918 )
...
* init t5 wasm model
* split workers for each model
* clean up
* add some ui
* readme
* index
* typo
* remove cache param, clear_kv_cache
* add max_length as param
* add model tasks option to ui
* add method to load quantized gguf from buffer
* Add quantized wasm module
* add quantized models to UI, dynamic import wasms
* link to quantized
* fix copy
* fix ModelEncoder
* fix README.md
2023-09-22 15:31:10 +01:00