c093b03d51
Generic implementation of vecdot for q80. ( #596 )
...
* Generic implementation of vecdot for q80.
* Add support for code-llama 7b.
* Support more code-llama.
2023-08-25 09:04:05 +01:00
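A generic (non-SIMD) vecdot over q8_0-style blocks can be sketched as below. This is an illustrative sketch, not the crate's code: the real ggml-style q8_0 block stores its per-block scale as an f16 alongside 32 signed 8-bit quants, while this version uses an f32 scale for clarity.

```rust
// Sketch of a q8_0-style block and its scalar vec-dot. The real format packs
// an f16 scale plus 32 i8 quants per block; f32 is used here for simplicity.
const BLOCK_SIZE: usize = 32;

struct BlockQ8 {
    d: f32,                // per-block scale
    qs: [i8; BLOCK_SIZE],  // quantized values
}

// Dot product of two quantized vectors: accumulate the integer dot product
// within each block, then scale by the product of the two block scales.
fn vec_dot_q8(xs: &[BlockQ8], ys: &[BlockQ8]) -> f32 {
    xs.iter()
        .zip(ys.iter())
        .map(|(x, y)| {
            let isum: i32 = x
                .qs
                .iter()
                .zip(y.qs.iter())
                .map(|(&a, &b)| a as i32 * b as i32)
                .sum();
            isum as f32 * x.d * y.d
        })
        .sum()
}
```

The integer accumulation is the part that SIMD backends (AVX, Neon) later specialize; the generic path above is the fallback shape.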
d8ba0452dc
Fail on bf16. ( #594 )
2023-08-25 06:10:38 +01:00
2cde0cb74b
More pickle support. ( #588 )
...
* More pickle support.
* Be more verbose.
2023-08-24 18:45:10 +01:00
e21c686cdc
Fixes for clippy 1.72. ( #587 )
2023-08-24 17:46:17 +01:00
c265ac50fa
Add a function to write gguf files. ( #585 )
...
* Add a function to write gguf files.
* More GGUF file writing.
* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
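The fixed part of a GGUF file is small; a minimal header writer can be sketched as follows, assuming the v2-style layout (magic `GGUF`, a u32 version, then 64-bit tensor and metadata counts, all little-endian). The metadata key-values and tensor infos that a full writer emits afterwards are omitted here.

```rust
use std::io::{self, Write};

// Minimal sketch of the fixed-size GGUF header: the b"GGUF" magic, a format
// version, and the tensor / metadata-kv counts, all little-endian. Metadata
// key-values, tensor infos and the tensor data would follow in a full writer.
fn write_gguf_header<W: Write>(
    w: &mut W,
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
) -> io::Result<()> {
    w.write_all(b"GGUF")?;
    w.write_all(&version.to_le_bytes())?;
    w.write_all(&tensor_count.to_le_bytes())?;
    w.write_all(&metadata_kv_count.to_le_bytes())?;
    Ok(())
}
```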
afd965f77c
More non-square testing ( #582 )
...
* Add more non-square testing.
* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086
Reference implementations of q2k and q3k vec-dot functions ( #580 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
ca318a6ec7
Add to the cuda example a reproduction of the issue. ( #579 )
...
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899
Add a test for conv2d with padding + bugfix the random number generation on cuda. ( #578 )
...
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
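The dimension arithmetic such a padded conv2d test exercises is the standard one: with input size `n`, kernel `k`, padding `p` and stride `s`, the output size is `(n + 2p - k) / s + 1` (integer division). A tiny helper makes the expected shapes explicit:

```rust
// Standard conv output-size formula: (n + 2p - k) / s + 1, per spatial dim.
fn conv_out_dim(n: usize, k: usize, p: usize, s: usize) -> usize {
    (n + 2 * p - k) / s + 1
}
```

With a 3x3 kernel, padding 1 and stride 1, the spatial size is preserved, which is what makes the padded case easy to check against a reference.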
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
7478dda255
Cosmetic tweaks. ( #570 )
2023-08-23 15:45:40 +01:00
329f661d9b
Trace softmax ( #568 )
...
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
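The max/sum reductions mentioned above are exactly what a numerically stable softmax is built from; a scalar sketch of that shape (not the traced op itself) looks like this:

```rust
// Numerically stable softmax: subtract the max before exponentiating so the
// exps cannot overflow, then normalize by the (inlined) sum.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}
```

Tracing the op lets a backend fuse the max, exp and sum passes instead of materializing each intermediate.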
075b505480
Mirror GGML's unit tests ( #569 )
...
* Add ggml unit tests
* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797
Add some group parameter to convolutions. ( #566 )
...
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
2023-08-23 12:58:55 +01:00
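Grouped convolutions split the channels into `groups` independent chunks: an output channel only convolves with the input channels of its own group. The bookkeeping can be sketched as a small helper (illustrative only, not the crate's API):

```rust
// Input-channel range seen by output channel `o` in a grouped convolution.
// Both channel counts must be divisible by `groups`.
fn group_in_range(c_in: usize, c_out: usize, groups: usize, o: usize) -> std::ops::Range<usize> {
    assert!(c_in % groups == 0 && c_out % groups == 0);
    let g = o / (c_out / groups); // which group this output channel is in
    let per_group = c_in / groups; // input channels per group
    g * per_group..(g + 1) * per_group
}
```

`groups == 1` recovers a standard convolution; `groups == c_in` (with matching weights) is a depthwise convolution, which is what EfficientNet-style models rely on.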
9a5c7db91a
Add support for i64 ( #563 )
...
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
508d34daf2
GGUF support in the quantized model. ( #559 )
...
* GGUF support in the quantized model.
* Get the GGUF support to work on llama.
2023-08-23 09:20:57 +01:00
0764741cc4
Handle GGUF files in tensor-tools. ( #558 )
2023-08-23 06:32:07 +01:00
6a30ecefad
Preliminary GGUF support. ( #557 )
...
* Preliminary GGUF support.
* Tensor reading.
2023-08-23 00:14:10 +01:00
07067b01dc
Avoid some mutable variables (take 2). ( #554 )
...
* Avoid some mutable variables (take 2).
* Fix.
2023-08-22 18:51:20 +01:00
ec665acad7
Revert "Avoid some mut in quantized functions. ( #550 )" ( #552 )
...
This reverts commit cf27b9b636.
2023-08-22 15:57:46 +01:00
cf27b9b636
Avoid some mut in quantized functions. ( #550 )
...
* Avoid a couple more 'let mut'.
* Tweaks.
2023-08-22 15:44:26 +01:00
352383cbc3
Add quantization support for q2k, q3k, q4k and q5k ( #524 )
...
* first q2 implementation
* First Q4K and Q5K implementations
* fix `q2k` and `q5k`
* Some first cleanups
* run `clippy` on tests
* finally implement `q3k`
* deactivate `q3k` test on macos
* also disable the test on linux
* Fix floating bits in `q3k` dequantization
* Refactoring pass + reorder quants in file
* `fmt`
* Re-add `src` asserts and redefine `dst`
2023-08-22 15:04:55 +01:00
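The common idea behind these block formats can be sketched with a symmetric 4-bit quantizer: one scale per block, values rounded into a small signed integer range. This is a simplification for illustration; the real q2k..q5k layouts bit-pack the quants and carry extra per-sub-block scales and minimums.

```rust
// Symmetric block quantization sketch: scale = amax / 7 so the largest value
// maps near the top of the signed 4-bit range [-8, 7].
fn quantize_block(xs: &[f32]) -> (f32, Vec<i8>) {
    let amax = xs.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = if amax == 0.0 { 1.0 } else { amax / 7.0 };
    let qs = xs
        .iter()
        .map(|&x| (x / scale).round().clamp(-8.0, 7.0) as i8)
        .collect();
    (scale, qs)
}

fn dequantize_block(scale: f32, qs: &[i8]) -> Vec<f32> {
    qs.iter().map(|&q| q as f32 * scale).collect()
}
```

A quantize/dequantize roundtrip with a bounded error is the natural unit test for each such format, which is what the `q2k`/`q3k`/`q5k` fixes above were chasing.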
d70cffdab6
Fix the minimum/maximum gradient computations. ( #534 )
2023-08-21 08:28:41 +01:00
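The subtlety with min/max gradients is that for `z = max(x, y)` the upstream gradient flows only to the operand that attained the maximum, not to both. A scalar sketch (splitting ties evenly, one common convention, not necessarily the crate's):

```rust
// Gradients of z = max(x, y) w.r.t. x and y: route grad_z to the argmax,
// splitting it on ties.
fn max_grads(x: f32, y: f32, grad_z: f32) -> (f32, f32) {
    if x > y {
        (grad_z, 0.0)
    } else if y > x {
        (0.0, grad_z)
    } else {
        (grad_z / 2.0, grad_z / 2.0)
    }
}
```

The min case is symmetric, routing the gradient to the smaller operand.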
8c232d706b
Small tweaks to the pickle handling to be able to use libtorch files. ( #530 )
...
* Small tweaks to the pickle handling to be able to use libtorch files.
* Move the pytorch specific bits in a different function.
2023-08-20 23:25:34 +01:00
11c7e7bd67
Some fixes for yolo-v3. ( #529 )
...
* Some fixes for yolo-v3.
* Use the running stats for inference in the batch-norm layer.
* Get some proper predictions for yolo.
* Avoid the quadratic insertion.
2023-08-20 23:19:15 +01:00
a1812f934f
Add a yolo-v3 example. ( #528 )
...
* Add a couple functions required for yolo.
* Add the yolo-v3 example.
* Add minimum and maximum.
* Use the newly introduced maximum.
* Cuda support for min/max + add some testing.
* Allow for more tests to work with accelerate.
* Fix a typo.
2023-08-20 18:19:37 +01:00
e3d2786ffb
Add a couple functions required for yolo. ( #527 )
2023-08-20 17:02:05 +01:00
2fcb386f17
Add a broadcast variant to matmul. ( #523 )
...
* Add a broadcast variant to matmul.
* Get the test to pass.
2023-08-20 13:20:42 +01:00
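The shape rule a broadcasting matmul follows can be sketched independently of any tensor type: the trailing two dims multiply as `(m, k) x (k, n) -> (m, n)`, and the leading batch dims broadcast like elementwise ops, with size-1 dims stretching to match. This helper is illustrative, not the crate's implementation:

```rust
// Result shape of a broadcasting matmul, or None if the shapes are
// incompatible (inner dims differ, or batch dims fail to broadcast).
fn broadcast_matmul_shape(lhs: &[usize], rhs: &[usize]) -> Option<Vec<usize>> {
    let (l, r) = (lhs.len(), rhs.len());
    if l < 2 || r < 2 || lhs[l - 1] != rhs[r - 2] {
        return None;
    }
    let (lb, rb) = (&lhs[..l - 2], &rhs[..r - 2]);
    let n = lb.len().max(rb.len());
    let mut out = Vec::with_capacity(n + 2);
    for i in 0..n {
        // Align batch dims from the right, padding the shorter side with 1s.
        let a = if i + lb.len() >= n { lb[i + lb.len() - n] } else { 1 };
        let b = if i + rb.len() >= n { rb[i + rb.len() - n] } else { 1 };
        out.push(match (a, b) {
            (a, b) if a == b => a,
            (1, b) => b,
            (a, 1) => a,
            _ => return None,
        });
    }
    out.push(lhs[l - 2]);
    out.push(rhs[r - 1]);
    Some(out)
}
```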
a8f61e66cc
Bump the crates version to 0.1.2. ( #522 )
2023-08-20 08:07:07 +01:00
82410995a2
Neon support for quantization. ( #519 )
...
* Skeleton files for neon support of quantization.
* SIMD version for q4 vecdot.
* Also simdify the q6k multiplication.
2023-08-19 22:07:29 +01:00
551409092e
Small tweaks to tensor-tools. ( #517 )
2023-08-19 16:50:26 +01:00
6431140250
Retrieve tensor data from PyTorch files. ( #516 )
2023-08-19 15:57:18 +01:00
607ffb9f1e
Retrieve more information from PyTorch checkpoints. ( #515 )
...
* Retrieve more information from PyTorch checkpoints.
* Add enough support to load dino-v2 backbone weights.
2023-08-19 15:05:34 +01:00
f861a9df6e
Add ggml support to tensor-tools ( #512 )
...
* Pickle work-in-progress.
* More unpickling.
* More pickling.
* Proper handling of setitems.
* Clippy.
* Again more pickling.
* Restore the example.
* Add enough pickle support to get the list of tensors.
* Read the data from zip files.
* Retrieve the tensor shape.
* Extract the size and dtype.
* More storage types.
* Improve the destructuring.
* Also support ggml files.
2023-08-19 11:45:22 +01:00
ad33715c61
Preliminary support for importing PyTorch weights. ( #511 )
...
* Pickle work-in-progress.
* More unpickling.
* More pickling.
* Proper handling of setitems.
* Clippy.
* Again more pickling.
* Restore the example.
* Add enough pickle support to get the list of tensors.
* Read the data from zip files.
* Retrieve the tensor shape.
* Extract the size and dtype.
* More storage types.
* Improve the destructuring.
2023-08-19 11:26:32 +01:00
90ff04e77e
Add the tensor-tools binary. ( #510 )
2023-08-19 09:06:44 +01:00
cb069d6063
Add the permute op (similar to pytorch). ( #504 )
...
* Add the permute op (similar to pytorch).
* Add the backprop for dimension permutation.
2023-08-18 16:30:53 +01:00
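As in PyTorch, permute is a view-level op: dimension `i` of the result maps to dimension `dims[i]` of the input, so only the shape and strides are rearranged and no data moves. A sketch of that bookkeeping (illustrative, not the crate's layout code):

```rust
// Permute a shape/stride pair: result dim i takes its size and stride from
// input dim dims[i]. The underlying buffer is untouched.
fn permute(shape: &[usize], strides: &[usize], dims: &[usize]) -> (Vec<usize>, Vec<usize>) {
    assert_eq!(shape.len(), dims.len());
    let new_shape = dims.iter().map(|&d| shape[d]).collect();
    let new_strides = dims.iter().map(|&d| strides[d]).collect();
    (new_shape, new_strides)
}
```

The backprop is simply the inverse permutation applied to the incoming gradient.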
95462c6a2e
Add a vision transformer example (dino-v2). ( #502 )
...
* Add a vision transformer example (dino-v2).
* Add some documentation + test.
* CI fix.
* Another fix (still unable to replicate the errors locally :( )
2023-08-18 11:58:06 +01:00
109e95b189
Basic qmatmul parallelization ( #492 )
...
* Basic `par_iter` parallelization
* Pass errors up
* Disable `avx` for x86 macs
2023-08-18 09:45:37 +01:00
c78ce76501
Add a simple Module trait and implement it for the various nn layers ( #500 )
...
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
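The Module idea is a single `forward` method that every layer implements so layers compose uniformly. The sketch below uses plain `Vec<f32>` as a stand-in for the tensor and result types, so it illustrates the shape of the trait rather than the crate's actual signature:

```rust
// Minimal Module sketch: one forward method, implemented per layer.
trait Module {
    fn forward(&self, xs: &[f32]) -> Vec<f32>;
}

// A toy layer: elementwise scaling.
struct Scale(f32);

impl Module for Scale {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        xs.iter().map(|&x| x * self.0).collect()
    }
}

// Composition falls out for free: a sequence of modules is itself a module.
struct Sequential(Vec<Box<dyn Module>>);

impl Module for Sequential {
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        self.0.iter().fold(xs.to_vec(), |acc, m| m.forward(&acc))
    }
}
```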
a22b1bed7b
Tensor -> QTensor conversion ( #496 )
...
* Sketch some qmatmul test.
* Add the quantization function.
* More testing.
* Make the test smaller and faster.
* Add some shape checking.
2023-08-18 08:19:20 +01:00
557b2c28dd
Q6K quantization ( #495 )
...
* Print the detected arch options.
* Add the q6k quantization.
* Add a currently broken test.
* Bugfix.
* Bugfix.
* Another bugfix.
* Another bugfix + get the test to work.
2023-08-17 22:22:57 +01:00
fc81af1712
AVX version of the q6k vec-dot. ( #493 )
...
* AVX version of the q6k vec-dot.
* Use the avx sum.
2023-08-17 20:13:18 +01:00
03be33eea4
Relax the requirements on CustomOp. ( #486 )
...
* Relax the requirements on CustomOp.
* Simplify the custom-ops when no backward is required.
2023-08-17 11:12:05 +01:00
d99cac3ec3
Move the avx specific bits to a separate file. ( #481 )
2023-08-17 09:01:06 +01:00
306c8eee7a
AVX version of the vecdot for q4_0. ( #474 )
...
* AVX version of the vecdot for q4_0.
* Tweak the avx bits.
* Add a qmatmul benchmark.
* Fix the quantized test.
2023-08-17 07:03:32 +01:00
098909de40
Add vecdot for q6k-q8k. ( #476 )
...
* Add vecdot for q6k-q8k.
* Add some testing for q8k.
* Use QMatMul for the output layer.
2023-08-16 20:59:40 +01:00
3bedba1fce
Use a zipped iterator. ( #475 )
...
* Use a zipped iterator.
* Add to/from float for q8k.
2023-08-16 20:15:11 +01:00
575e88a999
Add a quantized test that uses negative values. ( #470 )
...
* Add a quantized test that uses negative values.
* Add a default tokenizer.
2023-08-16 16:32:58 +01:00
a9101700b6
Add a kv-cache to the quantized llama example. ( #466 )
...
* Add a kv-cache to the quantized llama example.
* Also print the prompt.
* Bugfix in q6k dequantizing.
* Another bugfix.
2023-08-16 14:28:42 +01:00
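The kv-cache idea is that each decoding step computes the key/value activations only for the new token and appends them, instead of recomputing the whole prefix. A flat-buffer sketch of that mechanism (illustrative, not the example's code):

```rust
// Per-layer key/value cache: one row of `dim` floats per past token,
// stored contiguously.
struct KvCache {
    k: Vec<f32>,
    v: Vec<f32>,
    dim: usize,
}

impl KvCache {
    fn new(dim: usize) -> Self {
        Self { k: Vec::new(), v: Vec::new(), dim }
    }

    // Append the new token's key/value rows and return the full sequences
    // that attention should now run over.
    fn append(&mut self, k: &[f32], v: &[f32]) -> (&[f32], &[f32]) {
        assert_eq!(k.len(), self.dim);
        assert_eq!(v.len(), self.dim);
        self.k.extend_from_slice(k);
        self.v.extend_from_slice(v);
        (&self.k, &self.v)
    }

    fn seq_len(&self) -> usize {
        self.k.len() / self.dim
    }
}
```

This turns per-token generation cost from quadratic in the sequence length into roughly linear, which is why the quantized llama example gains so much from it.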