candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	d728e646c2	Use resolver 2 explicitly. (#597 )	2023-08-25 09:35:40 +01:00
Laurent Mazare	c093b03d51	Generic implementation of vecdot for q80. (#596 ) * Generic implementation of vecdot for q80. * Add support for code-llama 7b. * Support more code-llama.	2023-08-25 09:04:05 +01:00
Laurent Mazare	d8ba0452dc	Fail on bf16. (#594 )	2023-08-25 06:10:38 +01:00
Laurent Mazare	189442a0fa	Add the pose estimation head for yolo. (#589 ) * Add the pose estimation head for yolo. * Properly handle the added position dimensions. * Integrate the pose estimation head in the forward pass. * Renaming. * Fix for pose estimation.	2023-08-24 22:12:34 +01:00
Laurent Mazare	2cde0cb74b	More pickle support. (#588 ) * More pickle support. * Be more verbose.	2023-08-24 18:45:10 +01:00
Laurent Mazare	e21c686cdc	Fixes for clippy 1.72. (#587 )	2023-08-24 17:46:17 +01:00
Laurent Mazare	c265ac50fa	Add a function to write gguf files. (#585 ) * Add a function to write gguf files. * More GGUF file writing. * Write the tensor data in GGUF files.	2023-08-24 17:03:06 +01:00
Nicolas Patry	a87c6f7652	Merge pull request #561 from patrickvonplaten/add_installation Improve installation section and "get started"	2023-08-24 16:25:52 +02:00
Laurent Mazare	afd965f77c	More non square testing (#582 ) * Add more non square testing. * More testing.	2023-08-24 13:01:04 +01:00
Lukas Kreussel	d2f42ab086	Reference implementations of `q2k` and `q3k` vec-dot functions (#580 ) * add `q2k` vec-dot * `q3k` vec-dot + quantization bugfix	2023-08-24 12:35:54 +01:00
Laurent Mazare	ca318a6ec7	Add to the cuda example a reproduction of the issue. (#579 ) * Add to the cuda example a reproduction of the issue. * Tweak. * Add a test using non-square matrixes. * Fix the conv2d kernel. * Display the error. * And tweak the comment.	2023-08-24 12:07:31 +01:00
Laurent Mazare	dd64465899	Add a test for conv2d with padding + bugfix the random number generation on cuda. (#578 ) * Add a test for conv2d with padding. * Cosmetic changes. * Bugfix the rand function on the cuda backend.	2023-08-24 10:16:37 +01:00
Laurent Mazare	79916c2edb	Use the hub weights for efficientnet. (#573 )	2023-08-23 18:20:21 +01:00
Laurent Mazare	431051cc32	Add Efficientnet (#572 ) * EfficientNet. * Complete the efficientnet implementation. * Improve group handling. * Get the efficientnet to work.	2023-08-23 18:02:58 +01:00
Laurent Mazare	eedd85ffa7	Move the imagenet specific bits to a separate file. (#571 )	2023-08-23 16:42:09 +01:00
Laurent Mazare	7478dda255	Cosmetic tweaks. (#570 )	2023-08-23 15:45:40 +01:00
Laurent Mazare	329f661d9b	Trace softmax (#568 ) * Trace the softmax op. * Inline the sum. * Add min/max vec operations.	2023-08-23 15:25:50 +01:00
Lukas Kreussel	075b505480	Mirror GGML's unit tests (#569 ) * Add ggml unit tests * simplify random matmul test for other test cases	2023-08-23 15:25:17 +01:00
Laurent Mazare	aba1e90797	Add some group parameter to convolutions. (#566 ) * Add some group parameter to convolutions. * Avoid some unnecessary groups checks. * Move the tensor convolution bits. * Proper handling of groups. * Bump the crate version. * And add a changelog.	2023-08-23 12:58:55 +01:00
Patrick von Platen	1f58bdbb1d	Apply suggestions from code review	2023-08-23 13:33:45 +02:00
Patrick von Platen	c98d3cfd8b	Update candle-book/src/guide/installation.md	2023-08-23 13:31:54 +02:00
Patrick von Platen	c5e43ad0ab	Apply suggestions from code review	2023-08-23 13:27:29 +02:00
Patrick von Platen	2c280007e8	Apply suggestions from code review	2023-08-23 13:26:21 +02:00
Laurent Mazare	4ee1cf038a	Get the rms epsilon from GGUF. (#565 )	2023-08-23 11:40:20 +01:00
Laurent Mazare	0f4ff8a739	Fix the quantized example. (#564 )	2023-08-23 11:09:55 +01:00
cksac	89a00b56cc	add chat models in quantized example (#551 ) * add chat models in quantized example * cargo fmt	2023-08-23 11:05:33 +01:00
Laurent Mazare	9a5c7db91a	Add support for i64 (#563 ) * Add the i64 dtype. * Adapt the cuda kernels.	2023-08-23 10:42:19 +01:00
Patrick von Platen	649202024c	fix code snippets	2023-08-23 09:05:07 +00:00
Patrick von Platen	283f6c048d	fix code snippets	2023-08-23 09:04:36 +00:00
Patrick von Platen	c8211fc474	fix code snippets	2023-08-23 09:04:08 +00:00
Patrick von Platen	7732bf6238	correct	2023-08-23 08:54:48 +00:00
Patrick von Platen	7c0ca80d3a	move installation to book	2023-08-23 08:52:53 +00:00
Patrick von Platen	b558d08b85	improve	2023-08-23 08:42:47 +00:00
Patrick von Platen	34cb9f924f	improve	2023-08-23 08:40:23 +00:00
Patrick von Platen	d4968295a0	improve	2023-08-23 08:37:08 +00:00
Patrick von Platen	65e146c72d	Add installation section	2023-08-23 08:32:59 +00:00
Patrick von Platen	3743bed2d7	Fix the `?` operator cannot be applied to type `Device` of example (#560 ) According to the API: `inp = inp.to_device(&Device::Cuda(0)?)?;` cannot work as `Cuda(...)` expects a type `Device` not an integer. I'd recommend to instead use `new_cuda(...)`	2023-08-23 09:29:50 +01:00
Laurent Mazare	508d34daf2	GGUF support in the quantized model. (#559 ) * GGUF support in the quantized model. * Get the GGUF support to work on llama.	2023-08-23 09:20:57 +01:00
Laurent Mazare	0764741cc4	Handle GGUF files in tensor-tools. (#558 )	2023-08-23 06:32:07 +01:00
Laurent Mazare	6a30ecefad	Preliminary GGUF support. (#557 ) * Preliminary GGUF support. * Tensor reading.	2023-08-23 00:14:10 +01:00
Laurent Mazare	7687a0f453	Also fix the aspect ratio in the wasm example. (#556 ) * Also fix the aspect ratio in the wasm example. * Add the yolo lib. * Update the build script.	2023-08-22 22:20:08 +01:00
Laurent Mazare	f9ecc84477	GQA support in the quantized model. (#555 ) * GQA support in the quantized model. * Fix the reshaping. * Fix the main llama model. * Infer the proper gqa from the model kind.	2023-08-22 19:41:10 +01:00
Laurent Mazare	07067b01dc	Avoid some mutable variables (take 2). (#554 ) * Avoid some mutable variables (take 2). * Fix.	2023-08-22 18:51:20 +01:00
Laurent Mazare	cc22d4db20	Put the transcribe token before the language one. (#553 )	2023-08-22 16:46:34 +01:00
Laurent Mazare	ec665acad7	Revert "Avoid some mut in quantized functions. (#550 )" (#552 ) This reverts commit `cf27b9b636`.	2023-08-22 15:57:46 +01:00
Laurent Mazare	cf27b9b636	Avoid some mut in quantized functions. (#550 ) * Avoid a couple more 'let mut'. * Tweaks.	2023-08-22 15:44:26 +01:00
Lukas Kreussel	352383cbc3	Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` (#524 ) * first q2 implementation * First Q4K and Q5K implementations * fix `q2k` and `q5k` * Some first cleanups * run `clippy` on tests * finally implement `q3k` * deactivate `q3k` test on macos * also disable the test on linux * Fix floating bits in `q3k` dequantization * Refactoring pass + reorder quants in file * `fmt` * Re-add `src` asserts and redefine `dst`	2023-08-22 15:04:55 +01:00
Laurent Mazare	9bc811a247	Improve the aspect ratio handling on yolo-v8. (#549 ) * Fix the aspect ratio handling in yolo-v8. * Typo.	2023-08-22 14:55:33 +01:00
Laurent Mazare	bb69d89e28	Move the yolo shared bits to a common place. (#548 ) * Move the yolo shared bits to a common place. * Share more code. * Configurable thresholds.	2023-08-22 13:03:07 +01:00
Laurent Mazare	20ce3e9f39	Sketch the yolo wasm example. (#546 ) * Sketch the yolo wasm example. * Web ui. * Get the web ui to work. * UI tweaks. * More UI tweaks. * Use the natural width/height. * Add a link to the hf space in the readme.	2023-08-22 11:56:43 +01:00


971 Commits