candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 18:48:51 +00:00

Author	SHA1	Message	Date
Laurent Mazare	06b37ea7ad	Avoid using tmp values. (#609 )	2023-08-26 12:28:28 +01:00
Lukas Kreussel	c72eb3d75b	Add reference implementation for `q4k` and `q5k` (#586 ) * add `q2k` vec-dot * `q3k` vec-dot + quantization bugfix * `q4k` vec-dot * `q5k` vec-dot * Validate against GGML unit test results. * Remove some more `transmutes`	2023-08-26 12:07:54 +01:00
Radamés Ajna	864227edbf	[WIP] Improve Yolo WASM UI example (#591 ) * return detections with classes names * ignore .DS_Store * example how to load wasm module * add param to set model size * add param for model size * accept iou and confidence threshold on run * conf and iou thresholds * clamp only * remove images from branch * a couple of renamings, add readme with instructions * final design * minor font + border update	2023-08-26 11:40:41 +01:00
Nicolas Patry	b23b347b35	Merge pull request #601 from huggingface/repair_bf16_f16_cast Repairing cast bf16/f16	2023-08-26 12:34:41 +02:00
Patrick von Platen	71518caeee	Align tensor device print more with PyTorch (#590 ) * Improve tensor print * Use CudaDevice only if enabled with cuda feature * run rust fmt * up * improve * rustfmt	2023-08-26 11:20:22 +01:00
Laurent Mazare	6559eae72c	Avoid some transmutes. (#607 )	2023-08-25 18:21:37 +01:00
Laurent Mazare	46eb225ba5	Add some missing entries to the changelog. (#606 )	2023-08-25 18:01:38 +01:00
Nicolas Patry	aa67e5107d	Merge pull request #600 from huggingface/codellama_gpu_support Adding support for codellama in examples.	2023-08-25 18:25:26 +02:00
Nicolas Patry	c105550405	s/panic/bail/	2023-08-25 18:05:07 +02:00
Laurent Mazare	ca6c050b04	Cleanup the pose reporting code. (#605 )	2023-08-25 16:49:21 +01:00
Laurent Mazare	9c8d6dbc2a	Neon intrinsics for the q8_0 vecdot. (#604 ) * Neon intrinsics for the q8_0 vecdot. * Get the tests to run with accelerate (with some numerical error failures).	2023-08-25 14:42:18 +01:00
Laurent Mazare	0afbc435df	Add some configurable legend for yolo detection. (#603 ) * Add some configurable legend for yolo detection. * Clippyness.	2023-08-25 13:50:31 +01:00
Nicolas Patry	d4e75d5825	Let's keep the dirty code on its own.	2023-08-25 12:01:58 +00:00
Nicolas Patry	be371e827c	Intermediary float cast is necessary for cuda 11.8	2023-08-25 11:54:30 +00:00
Laurent Mazare	97909e5068	Move the yolo model bits in a separate file. (#602 ) * Move the yolo model bits in a separate file. * Improve the drawing. * Bugfix.	2023-08-25 12:47:55 +01:00
Nicolas Patry	1c1e34735e	`static_cast` ?	2023-08-25 11:40:36 +00:00
Nicolas Patry	db8bab8b7a	Different casting ?	2023-08-25 10:49:22 +00:00
Nicolas Patry	bc131b402b	Repairing cast bf16/f16	2023-08-25 10:38:19 +00:00
Laurent Mazare	8bc5fffa45	More support for pose estimation in yolo-v8. (#599 ) * More support for pose estimation in yolo-v8. * Support both object detection and pose-estimation in the yolo-v8 example.	2023-08-25 11:21:11 +01:00
Nicolas Patry	4826a4212e	Adding support for codellama in examples. Codellama requires bf16 for now (error to convert from bf16 to f16). Multiprocess demo not functional for it because flash-attn only supports f16 for now.	2023-08-25 09:56:11 +00:00
Laurent Mazare	afc10a3232	AVX version for the q8-0 multiplications. (#598 )	2023-08-25 10:14:49 +01:00
Laurent Mazare	d728e646c2	Use resolver 2 explicitely. (#597 )	2023-08-25 09:35:40 +01:00
Laurent Mazare	c093b03d51	Generic implementation of vecdot for q80. (#596 ) * Generic implementation of vecdot for q80. * Add support for code-llama 7b. * Support more code-llama.	2023-08-25 09:04:05 +01:00
Laurent Mazare	d8ba0452dc	Fail on bf16. (#594 )	2023-08-25 06:10:38 +01:00
Laurent Mazare	189442a0fa	Add the pose estimation head for yolo. (#589 ) * Add the pose estimation head for yolo. * Properly handle the added position dimensions. * Integrate the pose estimation head in the forward pass. * Renaming. * Fix for pose estimation.	2023-08-24 22:12:34 +01:00
Laurent Mazare	2cde0cb74b	More pickle support. (#588 ) * More pickle support. * Be more verbose.	2023-08-24 18:45:10 +01:00
Laurent Mazare	e21c686cdc	Fixes for clippy 1.72. (#587 )	2023-08-24 17:46:17 +01:00
Laurent Mazare	c265ac50fa	Add a function to write gguf files. (#585 ) * Add a function to write gguf files. * More GGUF file writing. * Write the tensor data in GGUF files.	2023-08-24 17:03:06 +01:00
Nicolas Patry	a87c6f7652	Merge pull request #561 from patrickvonplaten/add_installation Improve installation section and "get started"	2023-08-24 16:25:52 +02:00
Laurent Mazare	afd965f77c	More non square testing (#582 ) * Add more non square testing. * More testing.	2023-08-24 13:01:04 +01:00
Lukas Kreussel	d2f42ab086	Referenze implementations of `q2k` and `q3k` vec-dot functions (#580 ) * add `q2k` vec-dot * `q3k` vec-dot + quantization bugfix	2023-08-24 12:35:54 +01:00
Laurent Mazare	ca318a6ec7	Add to the cuda example a reproduction of the issue. (#579 ) * Add to the cuda example a reproduction of the issue. * Tweak. * Add a test using non-square matrixes. * Fix the conv2d kernel. * Display the error. * And tweak the comment.	2023-08-24 12:07:31 +01:00
Laurent Mazare	dd64465899	Add a test for conv2d with padding + bugfix the random number generation on cuda. (#578 ) * Add a test for conv2d with padding. * Cosmetic changes. * Bugfix the rand function on the cuda backend.	2023-08-24 10:16:37 +01:00
Laurent Mazare	79916c2edb	Use the hub weights for efficientnet. (#573 )	2023-08-23 18:20:21 +01:00
Laurent Mazare	431051cc32	Add Efficientnet (#572 ) * EfficientNet. * Complete the efficientnet implementation. * Improve group handling. * Get the efficientnet to work.	2023-08-23 18:02:58 +01:00
Laurent Mazare	eedd85ffa7	Move the imagenet specific bits to a separate file. (#571 )	2023-08-23 16:42:09 +01:00
Laurent Mazare	7478dda255	Cosmetic tweaks. (#570 )	2023-08-23 15:45:40 +01:00
Laurent Mazare	329f661d9b	Trace softmax (#568 ) * Trace the softmax op. * Inline the sum. * Add min/max vec operations.	2023-08-23 15:25:50 +01:00
Lukas Kreussel	075b505480	Mirror GGML's unit tests (#569 ) * Add ggml unit tests * simplify random matmul test for other test cases	2023-08-23 15:25:17 +01:00
Laurent Mazare	aba1e90797	Add some group parameter to convolutions. (#566 ) * Add some group parameter to convolutions. * Avoid some unnecessary groups checks. * Move the tensor convolution bits. * Properh handling of groups. * Bump the crate version. * And add a changelog.	2023-08-23 12:58:55 +01:00
Patrick von Platen	1f58bdbb1d	Apply suggestions from code review	2023-08-23 13:33:45 +02:00
Patrick von Platen	c98d3cfd8b	Update candle-book/src/guide/installation.md	2023-08-23 13:31:54 +02:00
Patrick von Platen	c5e43ad0ab	Apply suggestions from code review	2023-08-23 13:27:29 +02:00
Patrick von Platen	2c280007e8	Apply suggestions from code review	2023-08-23 13:26:21 +02:00
Laurent Mazare	4ee1cf038a	Get the rms epsilon from GGUF. (#565 )	2023-08-23 11:40:20 +01:00
Laurent Mazare	0f4ff8a739	Fix the quantized example. (#564 )	2023-08-23 11:09:55 +01:00
cksac	89a00b56cc	add chat models in quantized example (#551 ) * add chat models in quantized example * cargo fmt	2023-08-23 11:05:33 +01:00
Laurent Mazare	9a5c7db91a	Add support for i64 (#563 ) * Add the i64 dtype. * Adapt the cuda kernels.	2023-08-23 10:42:19 +01:00
Patrick von Platen	649202024c	fix code snippets	2023-08-23 09:05:07 +00:00
Patrick von Platen	283f6c048d	fix code snippets	2023-08-23 09:04:36 +00:00

992 Commits