mirror of
https://github.com/huggingface/candle.git
synced 2025-06-18 19:47:12 +00:00

Running Microsoft phi 1.5 Example
Here, we provide two examples of how to run a Rust implementation of Microsoft phi 1.5 using a Candle-compiled WASM binary and runtime.
Vanilla JS and WebWorkers
To build and test the UI made in Vanilla JS and WebWorkers, first we need to build the WASM library:
sh build-lib.sh
This will bundle the library under ./build, and we can import it inside our WebWorker like a normal JS module:
import init, { Model } from "./build/m.js";
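As a minimal sketch of how a worker might use this import, the helper below caches a one-time asynchronous setup so that repeated messages do not re-initialize the WASM module. `createLazyLoader` is a hypothetical name, not part of the example's code, and calling `init()` before touching `Model` is an assumption based on the usual shape of wasm-bindgen generated modules, so check ./build/m.js for the actual exports.

```javascript
// Hypothetical helper (not part of the Candle example): wraps an async
// setup function so that it runs at most once, however often it is requested.
function createLazyLoader(load) {
  let promise = null; // cached result of the first load() call
  return function getLoaded() {
    if (promise === null) {
      // Note: a rejected promise is cached as well; reset `promise`
      // in a catch handler if retries are needed.
      promise = load();
    }
    return promise;
  };
}

// Assumed usage inside the WebWorker, with the import shown above:
//
//   const getModel = createLazyLoader(async () => {
//     await init();   // assumption: loads and instantiates the .wasm file
//     return Model;   // the class exported by ./build/m.js
//   });
//
//   self.addEventListener("message", async (event) => {
//     const ModelClass = await getModel(); // initialized on first message only
//     // ... construct and run the model with event.data here
//   });
```

Caching the promise rather than the resolved value means that two messages arriving before initialization finishes still share a single load.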
The full example can be found under ./index.html. All needed assets are fetched from the web, so there is no need to download anything.
Finally, you can preview the example by running a local HTTP server. For example:
python -m http.server
Then open http://localhost:8000/index.html in your browser.