c093b03d51
Generic implementation of vecdot for q80. ( #596 )
...
* Generic implementation of vecdot for q80.
* Add support for code-llama 7b.
* Support more code-llama.
2023-08-25 09:04:05 +01:00
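A generic (non-SIMD) vecdot over q8_0-style blocks can be sketched as below. This is an illustrative sketch, not the crate's code: the real ggml-style q8_0 block stores its per-block scale as an f16 alongside 32 signed 8-bit quants, while this version uses an f32 scale for clarity.

```rust
// Sketch of a q8_0-style block and its scalar vec-dot. The real format packs
// an f16 scale plus 32 i8 quants per block; f32 is used here for simplicity.
const BLOCK_SIZE: usize = 32;

struct BlockQ8 {
    d: f32,                // per-block scale
    qs: [i8; BLOCK_SIZE],  // quantized values
}

// Dot product of two quantized vectors: accumulate the integer dot product
// within each block, then scale by the product of the two block scales.
fn vec_dot_q8(xs: &[BlockQ8], ys: &[BlockQ8]) -> f32 {
    xs.iter()
        .zip(ys.iter())
        .map(|(x, y)| {
            let isum: i32 = x
                .qs
                .iter()
                .zip(y.qs.iter())
                .map(|(&a, &b)| a as i32 * b as i32)
                .sum();
            isum as f32 * x.d * y.d
        })
        .sum()
}
```

The integer accumulation is the part that SIMD backends (AVX, Neon) later specialize; the generic path above is the fallback shape.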
d8ba0452dc
Fail on bf16. ( #594 )
2023-08-25 06:10:38 +01:00
2cde0cb74b
More pickle support. ( #588 )
...
* More pickle support.
* Be more verbose.
2023-08-24 18:45:10 +01:00
e21c686cdc
Fixes for clippy 1.72. ( #587 )
2023-08-24 17:46:17 +01:00
c265ac50fa
Add a function to write gguf files. ( #585 )
...
* Add a function to write gguf files.
* More GGUF file writing.
* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
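The fixed part of a GGUF file is small; a minimal header writer can be sketched as follows, assuming the v2-style layout (magic `GGUF`, a u32 version, then 64-bit tensor and metadata counts, all little-endian). The metadata key-values and tensor infos that a full writer emits afterwards are omitted here.

```rust
use std::io::{self, Write};

// Minimal sketch of the fixed-size GGUF header: the b"GGUF" magic, a format
// version, and the tensor / metadata-kv counts, all little-endian. Metadata
// key-values, tensor infos and the tensor data would follow in a full writer.
fn write_gguf_header<W: Write>(
    w: &mut W,
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
) -> io::Result<()> {
    w.write_all(b"GGUF")?;
    w.write_all(&version.to_le_bytes())?;
    w.write_all(&tensor_count.to_le_bytes())?;
    w.write_all(&metadata_kv_count.to_le_bytes())?;
    Ok(())
}
```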
afd965f77c
More non-square testing ( #582 )
...
* Add more non-square testing.
* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086
Reference implementations of q2k and q3k vec-dot functions ( #580 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
ca318a6ec7
Add to the cuda example a reproduction of the issue. ( #579 )
...
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899
Add a test for conv2d with padding + bugfix the random number generation on cuda. ( #578 )
...
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
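The dimension arithmetic such a padded conv2d test exercises is the standard one: with input size `n`, kernel `k`, padding `p` and stride `s`, the output size is `(n + 2p - k) / s + 1` (integer division). A tiny helper makes the expected shapes explicit:

```rust
// Standard conv output-size formula: (n + 2p - k) / s + 1, per spatial dim.
fn conv_out_dim(n: usize, k: usize, p: usize, s: usize) -> usize {
    (n + 2 * p - k) / s + 1
}
```

With a 3x3 kernel, padding 1 and stride 1, the spatial size is preserved, which is what makes the padded case easy to check against a reference.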
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
7478dda255
Cosmetic tweaks. ( #570 )
2023-08-23 15:45:40 +01:00
329f661d9b
Trace softmax ( #568 )
...
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
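The max/sum reductions mentioned above are exactly what a numerically stable softmax is built from; a scalar sketch of that shape (not the traced op itself) looks like this:

```rust
// Numerically stable softmax: subtract the max before exponentiating so the
// exps cannot overflow, then normalize by the (inlined) sum.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}
```

Tracing the op lets a backend fuse the max, exp and sum passes instead of materializing each intermediate.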
075b505480
Mirror GGML's unit tests ( #569 )
...
* Add ggml unit tests
* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797
Add some group parameter to convolutions. ( #566 )
...
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
2023-08-23 12:58:55 +01:00
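Grouped convolutions split the channels into `groups` independent chunks: an output channel only convolves with the input channels of its own group. The bookkeeping can be sketched as a small helper (illustrative only, not the crate's API):

```rust
// Input-channel range seen by output channel `o` in a grouped convolution.
// Both channel counts must be divisible by `groups`.
fn group_in_range(c_in: usize, c_out: usize, groups: usize, o: usize) -> std::ops::Range<usize> {
    assert!(c_in % groups == 0 && c_out % groups == 0);
    let g = o / (c_out / groups); // which group this output channel is in
    let per_group = c_in / groups; // input channels per group
    g * per_group..(g + 1) * per_group
}
```

`groups == 1` recovers a standard convolution; `groups == c_in` (with matching weights) is a depthwise convolution, which is what EfficientNet-style models rely on.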
9a5c7db91a
Add support for i64 ( #563 )
...
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
508d34daf2
GGUF support in the quantized model. ( #559 )
...
* GGUF support in the quantized model.
* Get the GGUF support to work on llama.
2023-08-23 09:20:57 +01:00
0764741cc4
Handle GGUF files in tensor-tools. ( #558 )
2023-08-23 06:32:07 +01:00
6a30ecefad
Preliminary GGUF support. ( #557 )
...
* Preliminary GGUF support.
* Tensor reading.
2023-08-23 00:14:10 +01:00
07067b01dc
Avoid some mutable variables (take 2). ( #554 )
...
* Avoid some mutable variables (take 2).
* Fix.
2023-08-22 18:51:20 +01:00
ec665acad7
Revert "Avoid some mut in quantized functions. ( #550 )" ( #552 )
...
This reverts commit cf27b9b636.
2023-08-22 15:57:46 +01:00
cf27b9b636
Avoid some mut in quantized functions. ( #550 )
...
* Avoid a couple more 'let mut'.
* Tweaks.
2023-08-22 15:44:26 +01:00
352383cbc3
Add quantization support for q2k, q3k, q4k and q5k ( #524 )
...
* first q2 implementation
* First Q4K and Q5K implementations
* fix `q2k` and `q5k`
* Some first cleanups
* run `clippy` on tests
* finally implement `q3k`
* deactivate `q3k` test on macos
* also disable the test on linux
* Fix floating bits in `q3k` dequantization
* Refactoring pass + reorder quants in file
* `fmt`
* Re-add `src` asserts and redefine `dst`
2023-08-22 15:04:55 +01:00
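The common idea behind these block formats can be sketched with a symmetric 4-bit quantizer: one scale per block, values rounded into a small signed integer range. This is a simplification for illustration; the real q2k..q5k layouts bit-pack the quants and carry extra per-sub-block scales and minimums.

```rust
// Symmetric block quantization sketch: scale = amax / 7 so the largest value
// maps near the top of the signed 4-bit range [-8, 7].
fn quantize_block(xs: &[f32]) -> (f32, Vec<i8>) {
    let amax = xs.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = if amax == 0.0 { 1.0 } else { amax / 7.0 };
    let qs = xs
        .iter()
        .map(|&x| (x / scale).round().clamp(-8.0, 7.0) as i8)
        .collect();
    (scale, qs)
}

fn dequantize_block(scale: f32, qs: &[i8]) -> Vec<f32> {
    qs.iter().map(|&q| q as f32 * scale).collect()
}
```

A quantize/dequantize roundtrip with a bounded error is the natural unit test for each such format, which is what the `q2k`/`q3k`/`q5k` fixes above were chasing.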
d70cffdab6
Fix the minimum/maximum gradient computations. ( #534 )
2023-08-21 08:28:41 +01:00
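The subtlety with min/max gradients is that for `z = max(x, y)` the upstream gradient flows only to the operand that attained the maximum, not to both. A scalar sketch (splitting ties evenly, one common convention, not necessarily the crate's):

```rust
// Gradients of z = max(x, y) w.r.t. x and y: route grad_z to the argmax,
// splitting it on ties.
fn max_grads(x: f32, y: f32, grad_z: f32) -> (f32, f32) {
    if x > y {
        (grad_z, 0.0)
    } else if y > x {
        (0.0, grad_z)
    } else {
        (grad_z / 2.0, grad_z / 2.0)
    }
}
```

The min case is symmetric, routing the gradient to the smaller operand.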
8c232d706b
Small tweaks to the pickle handling to be able to use libtorch files. ( #530 )
...
* Small tweaks to the pickle handling to be able to use libtorch files.
* Move the pytorch specific bits in a different function.
2023-08-20 23:25:34 +01:00
11c7e7bd67
Some fixes for yolo-v3. ( #529 )
...
* Some fixes for yolo-v3.
* Use the running stats for inference in the batch-norm layer.
* Get some proper predictions for yolo.
* Avoid the quadratic insertion.
2023-08-20 23:19:15 +01:00
a1812f934f
Add a yolo-v3 example. ( #528 )
...
* Add a couple functions required for yolo.
* Add the yolo-v3 example.
* Add minimum and maximum.
* Use the newly introduced maximum.
* Cuda support for min/max + add some testing.
* Allow for more tests to work with accelerate.
* Fix a typo.
2023-08-20 18:19:37 +01:00
e3d2786ffb
Add a couple functions required for yolo. ( #527 )
2023-08-20 17:02:05 +01:00
2fcb386f17
Add a broadcast variant to matmul. ( #523 )
...
* Add a broadcast variant to matmul.
* Get the test to pass.
2023-08-20 13:20:42 +01:00
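The shape rule a broadcasting matmul follows can be sketched independently of any tensor type: the trailing two dims multiply as `(m, k) x (k, n) -> (m, n)`, and the leading batch dims broadcast like elementwise ops, with size-1 dims stretching to match. This helper is illustrative, not the crate's implementation:

```rust
// Result shape of a broadcasting matmul, or None if the shapes are
// incompatible (inner dims differ, or batch dims fail to broadcast).
fn broadcast_matmul_shape(lhs: &[usize], rhs: &[usize]) -> Option<Vec<usize>> {
    let (l, r) = (lhs.len(), rhs.len());
    if l < 2 || r < 2 || lhs[l - 1] != rhs[r - 2] {
        return None;
    }
    let (lb, rb) = (&lhs[..l - 2], &rhs[..r - 2]);
    let n = lb.len().max(rb.len());
    let mut out = Vec::with_capacity(n + 2);
    for i in 0..n {
        // Align batch dims from the right, padding the shorter side with 1s.
        let a = if i + lb.len() >= n { lb[i + lb.len() - n] } else { 1 };
        let b = if i + rb.len() >= n { rb[i + rb.len() - n] } else { 1 };
        out.push(match (a, b) {
            (a, b) if a == b => a,
            (1, b) => b,
            (a, 1) => a,
            _ => return None,
        });
    }
    out.push(lhs[l - 2]);
    out.push(rhs[r - 1]);
    Some(out)
}
```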
a8f61e66cc
Bump the crates version to 0.1.2. ( #522 )
2023-08-20 08:07:07 +01:00
82410995a2
Neon support for quantization. ( #519 )
...
* Skeleton files for neon support of quantization.
* SIMD version for q4 vecdot.
* Also simdify the q6k multiplication.
2023-08-19 22:07:29 +01:00
551409092e
Small tweaks to tensor-tools. ( #517 )
2023-08-19 16:50:26 +01:00
6431140250
Retrieve tensor data from PyTorch files. ( #516 )
2023-08-19 15:57:18 +01:00
607ffb9f1e
Retrieve more information from PyTorch checkpoints. ( #515 )
...
* Retrieve more information from PyTorch checkpoints.
* Add enough support to load dino-v2 backbone weights.
2023-08-19 15:05:34 +01:00
f861a9df6e
Add ggml support to tensor-tools ( #512 )
...
* Pickle work-in-progress.
* More unpickling.
* More pickling.
* Proper handling of setitems.
* Clippy.
* Again more pickling.
* Restore the example.
* Add enough pickle support to get the list of tensors.
* Read the data from zip files.
* Retrieve the tensor shape.
* Extract the size and dtype.
* More storage types.
* Improve the destructuring.
* Also support ggml files.
2023-08-19 11:45:22 +01:00
ad33715c61
Preliminary support for importing PyTorch weights. ( #511 )
...
* Pickle work-in-progress.
* More unpickling.
* More pickling.
* Proper handling of setitems.
* Clippy.
* Again more pickling.
* Restore the example.
* Add enough pickle support to get the list of tensors.
* Read the data from zip files.
* Retrieve the tensor shape.
* Extract the size and dtype.
* More storage types.
* Improve the destructuring.
2023-08-19 11:26:32 +01:00
90ff04e77e
Add the tensor-tools binary. ( #510 )
2023-08-19 09:06:44 +01:00
cb069d6063
Add the permute op (similar to pytorch). ( #504 )
...
* Add the permute op (similar to pytorch).
* Add the backprop for dimension permutation.
2023-08-18 16:30:53 +01:00
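As in PyTorch, permute is a view-level op: dimension `i` of the result maps to dimension `dims[i]` of the input, so only the shape and strides are rearranged and no data moves. A sketch of that bookkeeping (illustrative, not the crate's layout code):

```rust
// Permute a shape/stride pair: result dim i takes its size and stride from
// input dim dims[i]. The underlying buffer is untouched.
fn permute(shape: &[usize], strides: &[usize], dims: &[usize]) -> (Vec<usize>, Vec<usize>) {
    assert_eq!(shape.len(), dims.len());
    let new_shape = dims.iter().map(|&d| shape[d]).collect();
    let new_strides = dims.iter().map(|&d| strides[d]).collect();
    (new_shape, new_strides)
}
```

The backprop is simply the inverse permutation applied to the incoming gradient.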
95462c6a2e
Add a vision transformer example (dino-v2). ( #502 )
...
* Add a vision transformer example (dino-v2).
* Add some documentation + test.
* CI fix.
* Another fix (still unable to replicate the errors locally :( )
2023-08-18 11:58:06 +01:00
109e95b189
Basic qmatmul parallelization ( #492 )
...
* Basic `par_iter` parallelization
* Pass errors up
* Disable `avx` for x86 macs
2023-08-18 09:45:37 +01:00
c78ce76501
Add a simple Module trait and implement it for the various nn layers ( #500 )
...
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
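The Module idea is a single `forward` method that every layer implements so layers compose uniformly. The sketch below uses plain `Vec<f32>` as a stand-in for the tensor and result types, so it illustrates the shape of the trait rather than the crate's actual signature:

```rust
// Minimal Module sketch: one forward method, implemented per layer.
trait Module {
    fn forward(&self, xs: &[f32]) -> Vec<f32>;
}

// A toy layer: elementwise scaling.
struct Scale(f32);

impl Module for Scale {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        xs.iter().map(|&x| x * self.0).collect()
    }
}

// Composition falls out for free: a sequence of modules is itself a module.
struct Sequential(Vec<Box<dyn Module>>);

impl Module for Sequential {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        self.0.iter().fold(xs.to_vec(), |acc, m| m.forward(&acc))
    }
}
```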
a22b1bed7b
Tensor -> QTensor conversion ( #496 )
...
* Sketch some qmatmul test.
* Add the quantization function.
* More testing.
* Make the test smaller and faster.
* Add some shape checking.
2023-08-18 08:19:20 +01:00
557b2c28dd
Q6K quantization ( #495 )
...
* Print the detected arch options.
* Add the q6k quantization.
* Add a currently broken test.
* Bugfix.
* Bugfix.
* Another bugfix.
* Another bugfix + get the test to work.
2023-08-17 22:22:57 +01:00
fc81af1712
AVX version of the q6k vec-dot. ( #493 )
...
* AVX version of the q6k vec-dot.
* Use the avx sum.
2023-08-17 20:13:18 +01:00
03be33eea4
Relax the requirements on CustomOp. ( #486 )
...
* Relax the requirements on CustomOp.
* Simplify the custom-ops when no backward is required.
2023-08-17 11:12:05 +01:00
d99cac3ec3
Move the avx specific bits to a separate file. ( #481 )
2023-08-17 09:01:06 +01:00
306c8eee7a
AVX version of the vecdot for q4_0. ( #474 )
...
* AVX version of the vecdot for q4_0.
* Tweak the avx bits.
* Add a qmatmul benchmark.
* Fix the quantized test.
2023-08-17 07:03:32 +01:00
098909de40
Add vecdot for q6k-q8k. ( #476 )
...
* Add vecdot for q6k-q8k.
* Add some testing for q8k.
* Use QMatMul for the output layer.
2023-08-16 20:59:40 +01:00
3bedba1fce
Use a zipped iterator. ( #475 )
...
* Use a zipped iterator.
* Add to/from float for q8k.
2023-08-16 20:15:11 +01:00
575e88a999
Add a quantized test that uses negative values. ( #470 )
...
* Add a quantized test that uses negative values.
* Add a default tokenizer.
2023-08-16 16:32:58 +01:00
a9101700b6
Add a kv-cache to the quantized llama example. ( #466 )
...
* Add a kv-cache to the quantized llama example.
* Also print the prompt.
* Bugfix in q6k dequantizing.
* Another bugfix.
2023-08-16 14:28:42 +01:00
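The kv-cache idea is that each decoding step computes the key/value activations only for the new token and appends them, instead of recomputing the whole prefix. A flat-buffer sketch of that mechanism (illustrative, not the example's code):

```rust
// Per-layer key/value cache: one row of `dim` floats per past token,
// stored contiguously.
struct KvCache {
    k: Vec<f32>,
    v: Vec<f32>,
    dim: usize,
}

impl KvCache {
    fn new(dim: usize) -> Self {
        Self { k: Vec::new(), v: Vec::new(), dim }
    }

    // Append the new token's key/value rows and return the full sequences
    // that attention should now run over.
    fn append(&mut self, k: &[f32], v: &[f32]) -> (&[f32], &[f32]) {
        assert_eq!(k.len(), self.dim);
        assert_eq!(v.len(), self.dim);
        self.k.extend_from_slice(k);
        self.v.extend_from_slice(v);
        (&self.k, &self.v)
    }

    fn seq_len(&self) -> usize {
        self.k.len() / self.dim
    }
}
```

This turns per-token generation cost from quadratic in the sequence length into roughly linear, which is why the quantized llama example gains so much from it.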