candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	783735cf22	Use softmax-last-dim where possible. (#1057 )	2023-10-08 13:16:42 +01:00
Laurent Mazare	9abeddd750	Make the cuda rng seedable. (#1056 )	2023-10-08 09:32:36 +01:00
Laurent Mazare	2e5fb0b251	Do not use the kv-cache on external key-value states. (#1054 )	2023-10-07 22:37:19 +01:00
Laurent Mazare	823fe23f9b	Add flash-attn support for stable-lm. (#1052 )	2023-10-07 21:12:54 +01:00
Laurent Mazare	d833527fda	Use candle_nn::LSTM in encodec. (#1051 ) * Use candle_nn::LSTM in encodec. * More Encodec implementation. * Decoder implementation.	2023-10-07 19:43:06 +01:00
Laurent Mazare	a4967600d0	More general seq forward functions for RNNs. (#1050 )	2023-10-07 15:08:01 +01:00
Laurent Mazare	aa53368aeb	Better control on the optional dequantization in QMatMul (#1049 ) * Cosmetic change to the quantized whisper model. * Fix the dequantization. * Add the dequantize all variable.	2023-10-07 10:16:18 +01:00
Laurent Mazare	955e00b2e8	Add to the readmes for stable-lm. (#1047 )	2023-10-06 21:26:04 +01:00
Laurent Mazare	d5f7267087	Add the stable-lm example. (#1046 ) * Add the stable-lm example. * Get stable-lm to generate some proper text.	2023-10-06 19:20:35 +01:00
Lukas Kreussel	904bbdae65	Make the Python Wrapper more Hackable and simplify Quantization (#1010 ) * Some first `Module` implementations * Add `state_dict` and `load_state_dict` functionality * Move modules around and create `candle.nn.Linear` * Add `nn.Embedding` and `nn.LayerNorm` * Add BERT implementation * Batch q-matmul * Automatically dequantize `QTensors` if a `Tensor` is expected * Add Module `.to()`, `.cuda()`, `cpu()` and `.type()` functionality * Unittests for `Module`, `Tensor` and `candle.utils` * Add `pytorch` like slicing to `Tensor` * Cleanup and BERT fixes * `black` formatting + unit-test for `nn.Linear` * Refactor slicing implementation	2023-10-06 19:01:07 +01:00
Laurent Mazare	b0442eff8a	Sketch the stable-lm model. (#1045 )	2023-10-06 18:19:06 +01:00
Laurent Mazare	4631c48273	Remove some todos. (#1042 )	2023-10-05 22:42:20 +01:00
Laurent Mazare	716883e9b0	Add the clamping for stable-diffusion. (#1041 )	2023-10-05 22:20:39 +01:00
lichin-lin	47c25a567b	feat: [SAM] able to download the result as png (#1035 ) * feat: able to download the result as png * feat: update function and wording	2023-10-05 22:14:47 +01:00
Laurent Mazare	7f7d95e2c3	Add the round-to function. (#1039 )	2023-10-05 20:28:09 +01:00
Juarez Bochi	f47bd9bab5	Delete invalid comment (#1038 )	2023-10-05 19:28:08 +01:00
Gonzalo	8f7973958c	fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0 (#1037 ) * fix: fix index_select cuda kernel for src target dim different than ids dim when selecting dim > 0 * cargo fmt	2023-10-05 18:46:13 +01:00
Laurent Mazare	f0c619a4af	Use AsRef<str> for set_one. (#1033 )	2023-10-05 06:05:44 +01:00
Juarez Bochi	b86ac0c507	Quant t5: Add coedit model to wasm demo and readme (#1031 )	2023-10-04 20:57:33 +01:00
Radamés Ajna	27e70a5093	Whisper quantized wasm (#1028 ) * [Whisper] Update to use quantized model * [whisper] add language detection * [whisper] change assets location * [whisper] adapt js example with quantized models * [whisper] better task parsing * [whisper] minor fixes	2023-10-04 20:22:57 +01:00
Laurent Mazare	c18a856e76	Add the rounding operators. (#1030 ) * Add the rounding operators. * Avoid tracking gradients for the rounding operations. * Add some rounding tests.	2023-10-04 17:58:44 +01:00
Juarez Bochi	3349c89252	Add quantized t5 args for weight and config (#1029 )	2023-10-04 17:02:49 +01:00
Laurent Mazare	11d3687cc6	Simd128 optimized q8k vecdot. (#1026 )	2023-10-03 15:29:48 +01:00
Laurent Mazare	dac73edb34	AVX optimized q8k vecdot. (#1024 )	2023-10-03 12:10:58 +01:00
Nicolas Patry	b4da19d1be	Merge pull request #1023 from evgenyigumnov/simlified-book-polish small misspeling and polish fix	2023-10-03 12:29:41 +02:00
Evgeny Igumnov	ff513314fc	small misspeling and polish fix	2023-10-03 15:47:04 +06:00
Laurent Mazare	043cc25766	Fix for the index-select cuda setup. (#1022 ) * Fix for index-select. * Better fix + add some testing.	2023-10-03 10:21:46 +01:00
Nicolas Patry	7b06872f90	Merge pull request #926 from evgenyigumnov/book-trainin-simplified Book train simlified example	2023-10-03 10:41:30 +02:00
Radamés Ajna	65825e7240	[SAM] Add undo button and background point mode (#1020 ) * [SAM] Add undo button and background point mode * [SAM] remove pts on near clicks * [SAM] check shiftKey toggle point mode * [SAM] clear points when clearing image	2023-10-02 23:33:46 +01:00
Laurent Mazare	7670fe7d1f	neon optimized q8k multiplication. (#1021 ) * neon optimized q8k multiplication. * Bugfixes. * simdification.	2023-10-02 23:26:34 +01:00
Laurent Mazare	cddfc3944c	Add the q8k vec-dot multiplication. (#1019 )	2023-10-02 21:53:34 +01:00
Laurent Mazare	089fc3b584	Improve the quantized whisper setup. (#1018 ) * Improve the quantized whisper setup. * Fix the config file paths. * Use the standard matmul where possible.	2023-10-02 17:17:46 +01:00
Laurent Mazare	e04c789230	Add a quantized variant of whisper (#1017 ) * Add the quantized-whisper model. * Quantized the whisper model. * Adapt the whisper example to handle quantization. * Add the quantized flag. * Load the proper weights.	2023-10-02 14:59:53 +01:00
Laurent Mazare	263a172202	Improve the testing of the optimized quantized vec-dot ops (#1016 ) * Expose the unopt functions for testing. * Better testing of the optimized quantized computations.	2023-10-02 09:50:43 +01:00
Nicolas Patry	638ccf9f46	Fix include code.	2023-10-02 10:22:44 +02:00
Nicolas Patry	0baf5a1e19	Fixed PR warnings.	2023-10-02 10:15:10 +02:00
Laurent Mazare	5130a7da32	Simd128 version of q6k vec-dot. (#1015 ) * Add a specific function for the simd128 q6k vec-dot. * Simdification. * More simdification.	2023-10-01 19:44:12 +01:00
lichin-lin	41143db1af	[segment-anything] add multi point logic for demo site (#1002 ) * [segment-anything] add multi point logic for demo site * [segment-anything] remove libs and update functions	2023-10-01 18:25:22 +01:00
Laurent Mazare	096dee7073	Bump the version to 0.3.0. (#1014 ) * Bump the version to 0.3.0. * Changelog update.	2023-10-01 13:51:57 +01:00
Laurent Mazare	f6054e9d60	Fix the prompt for mistral when using instruct/interactive mode. (#1013 )	2023-10-01 06:44:30 +01:00
Laurent Mazare	328167ec04	Integrate TheBloke quantized mistral weights. (#1012 )	2023-09-30 22:39:42 +01:00
Laurent Mazare	4e55aaa51f	Simd128 version of the q2k-q8k vecdot product. (#1011 ) * Sketch the simd128 version of q2k vecdot. * Use a single accumulator. * Simdify the q2k-q8k vecdot product. * Cosmetic change.	2023-09-30 20:12:41 +01:00
Laurent Mazare	deee7612da	Quantized version of mistral. (#1009 ) * Quantized version of mistral. * Integrate the quantized mistral variant. * Use the quantized weight files. * Tweak the quantization command. * Fix the dtype when computing the rotary embeddings. * Update the readme with the quantized version. * Fix the decoding of the remaining tokens.	2023-09-30 18:25:47 +01:00
Laurent Mazare	06207332bc	Streaming mode for reporting the generated tokens (#1007 ) * Token streaming. * Use the token output stream. * Flush the output. * Ensure that the last characters get reported.	2023-09-30 15:04:11 +01:00
Laurent Mazare	4021272875	Use flash-attn for mistral. (#1004 )	2023-09-30 12:15:10 +01:00
Laurent Mazare	87e3a4e175	Mistral: exit on eos token. (#1001 ) * Mistral: exit on eos token. * Print the proper stats. * Also add a short flag.	2023-09-30 07:07:06 +01:00
Laurent Mazare	6203ced495	Add negative prompts to segment-anything. (#1000 )	2023-09-30 06:17:42 +01:00
GeauxEric	34842fb234	[segment-anything] Print IOU values to help with debugging (#999 )	2023-09-30 05:44:42 +01:00
Laurent Mazare	d188d6a764	Fix the multiple points case for sam. (#998 )	2023-09-29 22:39:43 +02:00
Laurent Mazare	0ac2db577b	Add an entry about WSL slowness to the faq. (#997 )	2023-09-29 17:04:52 +01:00

1442 Commits