e21c686cdc
Fixes for clippy 1.72. ( #587 )
2023-08-24 17:46:17 +01:00
c265ac50fa
Add a function to write gguf files. ( #585 )
...
* Add a function to write gguf files.
* More GGUF file writing.
* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
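The GGUF writing work in #585 above targets the GGUF container format. As a hedged, std-only sketch (not candle's actual writer, which also serializes metadata key/values and tensor descriptors), the fixed header at the start of every GGUF file can be emitted like this; the field layout follows the public GGUF spec, little-endian throughout:

```rust
use std::io::Write;

// Minimal GGUF header writer sketch: magic bytes, format version,
// tensor count, and metadata key/value count, all little-endian.
fn write_gguf_header(
    out: &mut impl Write,
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
) -> std::io::Result<()> {
    out.write_all(b"GGUF")?; // magic: the u32 0x46554747 read LE
    out.write_all(&version.to_le_bytes())?;
    out.write_all(&tensor_count.to_le_bytes())?;
    out.write_all(&metadata_kv_count.to_le_bytes())?;
    Ok(())
}

fn main() {
    let mut buf: Vec<u8> = Vec::new();
    write_gguf_header(&mut buf, 2, 1, 0).unwrap();
    assert_eq!(&buf[..4], b"GGUF");
    assert_eq!(buf.len(), 4 + 4 + 8 + 8); // 24 fixed header bytes
    println!("{} header bytes", buf.len());
}
```

After this header a real writer emits the metadata entries and tensor infos, then the aligned tensor data.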
a87c6f7652
Merge pull request #561 from patrickvonplaten/add_installation
...
Improve installation section and "get started"
2023-08-24 16:25:52 +02:00
afd965f77c
More non square testing ( #582 )
...
* Add more non square testing.
* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086
Reference implementations of `q2k` and `q3k` vec-dot functions ( #580 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
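The vec-dot kernels above share a common structure: a dot product over quantized blocks reduces to an integer dot product scaled by the two block scales. A minimal sketch of that pattern, with an invented `Block` type standing in for the real packed q2k/q3k layouts (which use 2/3-bit quants and per-sub-block scales):

```rust
// Illustrative block vec-dot: each block stores one scale and 8-bit
// quants; the dot product is the integer dot scaled by both scales.
struct Block {
    scale: f32,
    quants: [i8; 32],
}

fn vec_dot(a: &[Block], b: &[Block]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let int_dot: i32 = x
                .quants
                .iter()
                .zip(y.quants.iter())
                .map(|(&p, &q)| p as i32 * q as i32)
                .sum();
            x.scale * y.scale * int_dot as f32
        })
        .sum()
}

fn main() {
    let a = Block { scale: 0.5, quants: [2; 32] };
    let b = Block { scale: 0.25, quants: [3; 32] };
    // Integer dot is 32 * (2 * 3) = 192, scaled by 0.5 * 0.25.
    assert_eq!(vec_dot(&[a], &[b]), 24.0);
    println!("ok");
}
```

Keeping the accumulation in integers and applying the scales once per block is what makes these kernels cheap.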
ca318a6ec7
Add to the cuda example a reproduction of the issue. ( #579 )
...
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899
Add a test for conv2d with padding + bugfix the random number generation on cuda. ( #578 )
...
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
79916c2edb
Use the hub weights for efficientnet. ( #573 )
2023-08-23 18:20:21 +01:00
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
eedd85ffa7
Move the imagenet specific bits to a separate file. ( #571 )
2023-08-23 16:42:09 +01:00
7478dda255
Cosmetic tweaks. ( #570 )
2023-08-23 15:45:40 +01:00
329f661d9b
Trace softmax ( #568 )
...
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
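The max/sum decomposition this trace exposes is the standard numerically stable softmax; a minimal plain-Rust sketch of it (not the candle op itself), where the max/sum vector reductions are exactly the building blocks the commit mentions:

```rust
// Numerically stable softmax: subtract the max before exponentiating
// (so exp never overflows), then normalize by the sum.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    let probs = softmax(&[1.0, 2.0, 3.0]);
    let total: f32 = probs.iter().sum();
    assert!((total - 1.0).abs() < 1e-6); // probabilities sum to 1
    assert!(probs[2] > probs[1] && probs[1] > probs[0]);
    println!("{probs:?}");
}
```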
075b505480
Mirror GGML's unit tests ( #569 )
...
* Add ggml unit tests
* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797
Add some group parameter to convolutions. ( #566 )
...
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
2023-08-23 12:58:55 +01:00
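To illustrate what the new `groups` parameter does (candle exposes it on `Tensor::conv2d`), here is a hedged, naive CPU sketch with stride 1 and no padding; the function name and flat layouts are invented for the example:

```rust
// Naive grouped 2D convolution: input channels are split into
// `groups` chunks, and each chunk of output channels only sees its
// own chunk of input channels. groups == 1 is a regular conv;
// groups == c_in (with c_out == c_in) is a depthwise conv.
// Shapes: input [c_in, h, w], kernel [c_out, c_in/groups, kh, kw].
fn conv2d_grouped(
    input: &[f32], (c_in, h, w): (usize, usize, usize),
    kernel: &[f32], (c_out, kh, kw): (usize, usize, usize),
    groups: usize,
) -> Vec<f32> {
    let (ho, wo) = (h - kh + 1, w - kw + 1);
    let (cig, cog) = (c_in / groups, c_out / groups);
    let mut out = vec![0f32; c_out * ho * wo];
    for co in 0..c_out {
        let g = co / cog; // group this output channel belongs to
        for y in 0..ho {
            for x in 0..wo {
                let mut acc = 0f32;
                for ci in 0..cig {
                    for ky in 0..kh {
                        for kx in 0..kw {
                            let iv = input[(g * cig + ci) * h * w + (y + ky) * w + (x + kx)];
                            let kv = kernel[co * cig * kh * kw + ci * kh * kw + ky * kw + kx];
                            acc += iv * kv;
                        }
                    }
                }
                out[co * ho * wo + y * wo + x] = acc;
            }
        }
    }
    out
}

fn main() {
    // 2 in-channels, 2 out-channels, groups = 2, 1x1 kernels:
    // out channel 0 = 2 * in channel 0, out channel 1 = 3 * in channel 1.
    let out = conv2d_grouped(&[1.0f32; 8], (2, 2, 2), &[2.0f32, 3.0], (2, 1, 1), 2);
    assert_eq!(out, vec![2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0]);
    println!("{out:?}");
}
```

Grouped convolutions are what EfficientNet's depthwise blocks (added just below) rely on.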
1f58bdbb1d
Apply suggestions from code review
2023-08-23 13:33:45 +02:00
c98d3cfd8b
Update candle-book/src/guide/installation.md
2023-08-23 13:31:54 +02:00
c5e43ad0ab
Apply suggestions from code review
2023-08-23 13:27:29 +02:00
2c280007e8
Apply suggestions from code review
2023-08-23 13:26:21 +02:00
4ee1cf038a
Get the rms epsilon from GGUF. ( #565 )
2023-08-23 11:40:20 +01:00
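The epsilon read from GGUF metadata feeds the llama-style RMSNorm; a minimal plain-Rust sketch of that normalization (function name invented for illustration), where `eps` guards the division when the vector is near zero:

```rust
// RMSNorm: divide each element by the root-mean-square of the
// vector, with a small eps (read from model metadata in practice).
fn rms_norm(xs: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = xs.iter().map(|&x| x * x).sum::<f32>() / xs.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    xs.iter().map(|&x| x * inv_rms).collect()
}

fn main() {
    let out = rms_norm(&[2.0, 2.0, 2.0, 2.0], 1e-5);
    // rms is ~2, so every element normalizes to ~1.
    assert!(out.iter().all(|&v| (v - 1.0).abs() < 1e-3));
    println!("{out:?}");
}
```

Using the wrong eps (e.g. a default instead of the model's own value) subtly degrades generation quality, which is why it is worth reading from the file.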
0f4ff8a739
Fix the quantized example. ( #564 )
2023-08-23 11:09:55 +01:00
89a00b56cc
add chat models in quantized example ( #551 )
...
* add chat models in quantized example
* cargo fmt
2023-08-23 11:05:33 +01:00
9a5c7db91a
Add support for i64 ( #563 )
...
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
649202024c
fix code snippets
2023-08-23 09:05:07 +00:00
283f6c048d
fix code snippets
2023-08-23 09:04:36 +00:00
c8211fc474
fix code snippets
2023-08-23 09:04:08 +00:00
7732bf6238
correct
2023-08-23 08:54:48 +00:00
7c0ca80d3a
move installation to book
2023-08-23 08:52:53 +00:00
b558d08b85
improve
2023-08-23 08:42:47 +00:00
34cb9f924f
improve
2023-08-23 08:40:23 +00:00
d4968295a0
improve
2023-08-23 08:37:08 +00:00
65e146c72d
Add installation section
2023-08-23 08:32:59 +00:00
3743bed2d7
Fix the ?
operator cannot be applied to type Device
of example ( #560 )
...
According to the API:
```rust
inp = inp.to_device(&Device::Cuda(0)?)?;
```
cannot work, as the `Cuda(...)` variant wraps a `CudaDevice`, not an integer.
I'd recommend using `new_cuda(...)` instead.
2023-08-23 09:29:50 +01:00
508d34daf2
GGUF support in the quantized model. ( #559 )
...
* GGUF support in the quantized model.
* Get the GGUF support to work on llama.
2023-08-23 09:20:57 +01:00
0764741cc4
Handle GGUF files in tensor-tools. ( #558 )
2023-08-23 06:32:07 +01:00
6a30ecefad
Preliminary GGUF support. ( #557 )
...
* Preliminary GGUF support.
* Tensor reading.
2023-08-23 00:14:10 +01:00
7687a0f453
Also fix the aspect ratio in the wasm example. ( #556 )
...
* Also fix the aspect ratio in the wasm example.
* Add the yolo lib.
* Update the build script.
2023-08-22 22:20:08 +01:00
f9ecc84477
GQA support in the quantized model. ( #555 )
...
* GQA support in the quantized model.
* Fix the reshaping.
* Fix the main llama model.
* Infer the proper gqa from the model kind.
2023-08-22 19:41:10 +01:00
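GQA lets `n_heads` query heads share a smaller set of `n_kv_heads` key/value heads, which is what "infer the proper gqa from the model kind" configures. A hedged helper showing the query-head to kv-head mapping (helper name invented; in the model the kv tensors are repeated or indexed accordingly):

```rust
// Grouped-query attention: each kv head serves a contiguous run of
// n_heads / n_kv_heads query heads.
fn kv_head_for(q_head: usize, n_heads: usize, n_kv_heads: usize) -> usize {
    assert!(n_heads % n_kv_heads == 0);
    q_head / (n_heads / n_kv_heads)
}

fn main() {
    // llama-2-70b style: 64 query heads, 8 kv heads -> groups of 8.
    assert_eq!(kv_head_for(0, 64, 8), 0);
    assert_eq!(kv_head_for(7, 64, 8), 0);
    assert_eq!(kv_head_for(8, 64, 8), 1);
    assert_eq!(kv_head_for(63, 64, 8), 7);
    println!("ok");
}
```

With `n_kv_heads == n_heads` this degenerates to standard multi-head attention, which is why the quantized llama model can handle both through one code path.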
07067b01dc
Avoid some mutable variables (take 2). ( #554 )
...
* Avoid some mutable variables (take 2).
* Fix.
2023-08-22 18:51:20 +01:00
cc22d4db20
Put the transcribe token before the language one. ( #553 )
2023-08-22 16:46:34 +01:00
ec665acad7
Revert "Avoid some mut in quantized functions. ( #550 )" ( #552 )
...
This reverts commit cf27b9b636.
2023-08-22 15:57:46 +01:00
cf27b9b636
Avoid some mut in quantized functions. ( #550 )
...
* Avoid a couple more 'let mut'.
* Tweaks.
2023-08-22 15:44:26 +01:00
352383cbc3
Add quantization support for q2k
, q3k
, q4k
and q5k
( #524 )
...
* first q2 implementation
* First Q4K and Q5K implementations
* fix `q2k` and `q5k`
* Some first cleanups
* run `clippy` on tests
* finally implement `q3k`
* deactivate `q3k` test on macos
* also disable the test on linux
* Fix floating bits in `q3k` dequantization
* Refactoring pass + reorder quants in file
* `fmt`
* Re-add `src` asserts and redefine `dst`
2023-08-22 15:04:55 +01:00
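The k-quants above pack 2-5 bit values with hierarchical scales into 256-element super-blocks; the sketch below shows only the general quantize/dequantize roundtrip with a simple symmetric 8-bit scheme, not the real layouts:

```rust
// Symmetric per-block quantization sketch: one f32 scale plus 8-bit
// quants per block; dequantization is quant * scale.
fn quantize(block: &[f32]) -> (f32, Vec<i8>) {
    let amax = block.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = if amax == 0.0 { 1.0 } else { amax / 127.0 };
    let quants = block.iter().map(|&x| (x / scale).round() as i8).collect();
    (scale, quants)
}

fn dequantize(scale: f32, quants: &[i8]) -> Vec<f32> {
    quants.iter().map(|&q| q as f32 * scale).collect()
}

fn main() {
    let block = [0.5f32, -1.0, 0.25, 1.0];
    let (scale, quants) = quantize(&block);
    let back = dequantize(scale, &quants);
    // Rounding error is bounded by half a quantization step.
    for (a, b) in block.iter().zip(&back) {
        assert!((a - b).abs() <= scale / 2.0 + 1e-6);
    }
    println!("scale={scale}, quants={quants:?}");
}
```

The "floating bits" bugfix in the list above is typical of this territory: a wrong mask or shift when unpacking sub-2^8 quants corrupts only some values, so roundtrip tests like the one here are the main safety net.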
9bc811a247
Improve the aspect ratio handling on yolo-v8. ( #549 )
...
* Fix the aspect ratio handling in yolo-v8.
* Typo.
2023-08-22 14:55:33 +01:00
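The aspect-ratio fix boils down to scaling the image so its longer side matches the model input size while keeping width and height proportional, then padding the rest. A hedged sketch (function name invented; real yolo-v8 preprocessing typically also rounds dimensions to a multiple of the model stride):

```rust
// Aspect-ratio-preserving fit into a square model input: scale so
// the longer side equals `target`; the remainder gets padded.
fn fit_dims(w: usize, h: usize, target: usize) -> (usize, usize) {
    if w >= h {
        (target, (h * target) / w)
    } else {
        ((w * target) / h, target)
    }
}

fn main() {
    // A 1280x720 frame scaled into a 640x640 input keeps its ratio.
    assert_eq!(fit_dims(1280, 720, 640), (640, 360));
    assert_eq!(fit_dims(720, 1280, 640), (360, 640));
    assert_eq!(fit_dims(640, 640, 640), (640, 640));
    println!("ok");
}
```

Stretching to the square directly (the bug being fixed) distorts objects and shifts predicted boxes, which is why both the native and the wasm example needed the same correction.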
bb69d89e28
Move the yolo shared bits to a common place. ( #548 )
...
* Move the yolo shared bits to a common place.
* Share more code.
* Configurable thresholds.
2023-08-22 13:03:07 +01:00
20ce3e9f39
Sketch the yolo wasm example. ( #546 )
...
* Sketch the yolo wasm example.
* Web ui.
* Get the web ui to work.
* UI tweaks.
* More UI tweaks.
* Use the natural width/height.
* Add a link to the hf space in the readme.
2023-08-22 11:56:43 +01:00
44420d8ae1
Add some llama-v2 variants. ( #545 )
2023-08-22 08:35:15 +01:00
f16bb97401
Use the yolo-v8 weights from the hub. ( #544 )
...
* Use the weights from the hub.
* Add to the readme.
2023-08-21 22:07:36 +01:00
3507e14c0c
Yolo v8 fixes ( #542 )
...
* Fixes for the yolo-v8 layout.
* Bugfixes.
* Another silly bugfix.
* Remove the hf-hub dependency.
* Remove the transformers dependency.
2023-08-21 21:05:40 +01:00
de50e66af1
Add yolo v8 as an example ( #541 )
...
* Sketching yolo-v8.
* Get the model to load.
* yolo-v8 forward pass.
* Complete(?) the forward pass.
* Fix some shape issues.
* Add the missing padding.
* Process the predictions.
2023-08-21 18:40:09 +01:00
cc2d6cf2e0
Improve the timestamps support in whisper ( #539 )
...
* Timestamp support for whisper.
* Properly display the timestamps.
* Bugfix for the timestamp units.
2023-08-21 12:26:59 +01:00
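Whisper's timestamp tokens encode time as multiples of 20 ms past the timestamp-token base id, so mapping a token offset back to seconds (the "timestamp units" bugfix above concerns exactly this scaling) is a single multiply; the offsets below are illustrative, not real token ids:

```rust
// Whisper timestamp tokens step in 20 ms increments, so
// seconds = offset * 0.02.
fn timestamp_secs(token_offset: u32) -> f64 {
    token_offset as f64 * 0.02
}

fn main() {
    assert_eq!(timestamp_secs(0), 0.0);
    // Whisper processes audio in 30 s windows, the max timestamp.
    assert!((timestamp_secs(1500) - 30.0).abs() < 1e-9);
    println!("ok");
}
```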