* added chatGLM readme
* changed wording in readme
* added readme for chinese-clip
* added readme for convmixer
* added readme for custom ops
* added readme for efficientnet
* added readme for llama
* added readme to mnist-training
* added readme to musicgen
* added readme to quantized-phi
* added readme to starcoder2
* added readme to whisper-microphone
* added readme to yi
* added readme to yolo-v3
* added space to example in glm4 readme
* fixed mamba example readme to run mamba instead of mamba-minimal
* removed slash escape character
* changed moondream image to yolo-v8 example image
* added procedure for making the reinforcement-learning example work with a virtual environment on my machine
* added simple one-line summaries to the example readmes without
* changed nonexistent image to yolo example's bike.jpg
* added backslash to sam command
* removed trailing - from siglip
* added SoX to silero-vad example readme
* replaced procedure for uv on mac with warning that uv isn't currently compatible with pyo3
* added example to falcon readme
* added --which arg to stella-en-v5 readme
* fixed image path in vgg readme
* fixed the image path in the vit readme
* Update README.md
---------
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
* Metal quantized modifications proposal.
- Add a device param, wherever needed.
- Create new QMetal storage thing that implements QuantizedType.
- Update everywhere needed.
Fix Python.
Fixing examples.
Fix: fmt + clippy + stub.
Moving everything around.
Only missing the actual implementations.
Fixing everything + adding dequantized kernels.
More work.
Fixing matmul.
Fmt + Clippy
Some clippy fixes.
Working state.
Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, new test catches it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented, it seems.
Q8K Metal -> Never implemented in Metal.
Fixing Q2K bug (present in GGML).
* Cleanup.
* Fix the rebase.
* Removing the fences speeds everything up and *is* correct this time...
* Cleanup the fence.
* After rebase.
* Bad code removal.
* Rebase after phi2 merge + fix replit default to CPU.
* Making the CI happy.
* More happy tests.
---------
Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>
* Load gguf files for the quantized t5.
* Add the quantized t5 example.
* Allow for loading local files.
* Add some support for quantizing safetensors files.
* Transpose before quantizing.
* Quantized t5.
* Retrieve the weights from the hub.