Commit Graph

14 Commits

SHA1 Message Date
3159f91b90 20241118 docs (#2629)
* module docs

* varbuilder gguf docs

* add a link to gguf files

* small additional mod doc titles

* safetensor docs

* more core docs

* more module docs in candle_core

* 2 more link fixes
2024-11-19 04:07:07 +01:00
e2b6b367fa Add some fast Metal MLX SDPA kernels (#2584)
* Add some fast Metal MLX SDPA kernels (#32)

* Sketch the sdpa kernel

* Add full sdpa kernel,

* Add test

* Add vectorized kernel for decoding

* Update tests

* Add some docs

* Fix sdpa_vector names

* Add softcapping for vectorized sdpa

* Add softcapping for full sdpa

* Add support for head dim 32, 96, 256

* Add support for head dim 32, 96, 256

* Update docs

* Add update notice

* Clippy and format

* Conditional compilation for bf16

* Use it in quantized llama

* Some review comments

* Use set_params!

* Remove unused

* Remove feature

* Fix metal sdpa for v stride

* Remove comma

* Add the dim method to layout and shape.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2024-11-05 09:28:00 +01:00
9634583781 Expose a couple layout methods. (#1816) 2024-03-08 10:52:22 +01:00
607ffb9f1e Retrieve more information from PyTorch checkpoints. (#515)
* Retrieve more information from PyTorch checkpoints.

* Add enough support to load dino-v2 backbone weights.
2023-08-19 15:05:34 +01:00
cb069d6063 Add the permute op (similar to pytorch). (#504)
* Add the permute op (similar to pytorch).

* Add the backprop for dimension permutation.
2023-08-18 16:30:53 +01:00
acb2f90469 Broadcasting performance optimization (cpu) (#182)
* Avoid recomputing the index from scratch each time.

* More performance optimisations.
2023-07-17 13:41:09 +01:00
18ea92d83b Iteration over strided blocks (#175)
* Introduce the strided blocks.

* Use the strided blocks to speed up the copy.

* Add more testing.
2023-07-15 21:30:35 +01:00
d88b6cdca9 Add backtrace information to errors where relevant. (#166)
* Add backtrace information to errors where relevant.

* More backtrace information.

* Add to the FAQ.
2023-07-14 09:31:25 +01:00
a76ec797da Cleanup the main crate error and add a couple dedicated ones (#142)
* Cosmetic cleanups to the error enum.

* More error cleanup.

* Proper error handling rather than panicking.

* Add some conv1d dedicated error.
2023-07-12 09:17:08 +01:00
14449ff80c Get the cpu backend to compile. 2023-06-28 14:12:38 +01:00
54a6c40f27 Propagate the changes on the cpu backend. 2023-06-28 14:00:49 +01:00
303b853098 Propagate the layout refactoring. 2023-06-28 13:42:23 +01:00
30b355ccd2 Simplify the narrow implementation. 2023-06-28 13:09:59 +01:00
c1bbbf94f6 Start refactoring the stride. 2023-06-28 12:57:30 +01:00