candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 02:38:10 +00:00

Author	SHA1	Message	Date
Harry Stern	c119600d6e	Move image tensor to device in trocr example (#2063 ) Signed-off-by: Harry Stern <harry@harrystern.net>	2024-04-15 06:50:32 +02:00
Laurent Mazare	50e49ecc5f	Add a quantized version of recurrent-gemma. (#2054 ) * Add a quantized version of recurrent-gemma. * Share the rglru part. * Get the quantized gemma model to work.	2024-04-13 20:07:01 +02:00
Laurent Mazare	26cbbf8d84	Mandatory topk sampling for recurrent-gemma. (#2051 )	2024-04-13 10:31:39 +02:00
Laurent Mazare	2bf413caa3	Add the recurrent-gemma model. (#2039 ) * Start adding the recurrent-gemma model. * More griffin. * Add the example + get the weights to load from the HF version. * More inference code. * Rope + kv-cache on the attention side. * Add to the inference code. * Add more to the recurrent gemma inference. * Get some first inference to run. * Add the softcap on logits. * Fixes. * Use partial rotary embeddings. * Get inference to work. * Add a comment. * And add a readme.	2024-04-13 00:05:21 +02:00
Laurent Mazare	a0460cd2b1	Add the code-gemma models. (#2038 ) * Add the code-gemma models. * Tweak to the gemma config.	2024-04-10 21:19:21 +02:00
Laurent Mazare	b81ecf712d	Support alternative dtypes for mamba (#2036 ) * Allow different dtypes in mamba. * Add a dtype flag.	2024-04-10 18:10:01 +02:00
Laurent Mazare	7f354473cf	Optimize copy-2d for metal. (#2024 ) * Optimize copy-2d for metal. * Add a hacky stopping rule for moondream.	2024-04-07 12:34:16 +02:00
Laurent Mazare	33c9b66554	Add the new gemma models. (#2023 ) * Add the new gemma models. * Revert the lightning changes. * Support for the 1.1 models.	2024-04-06 21:25:38 +02:00
Santiago Medina	ace282e5c2	Add flag to run Moondream in f16 precision (#2015 ) * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward * Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2 * Add flag to use f16 * Avoid breaking the quantized version on cuda. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-04-05 07:03:33 +02:00
Laurent Mazare	c87381fc96	Use F16 for moondream on cuda. (#2013 )	2024-04-04 23:30:10 +02:00
Laurent Mazare	f48c07e242	Include topk sampling in the quantized example. (#2005 ) * Include topk sampling in the quantized example. * Also sample with top-k on the mistral side.	2024-04-04 09:27:54 +02:00
Santiago Medina	d17b2cdad9	Match Moondream's latest release (#1997 ) * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward * Use latest release special token; Fix token/s accuracy; Use GeluPytorchTanh in VisionConfig v2	2024-04-02 21:37:09 +02:00
Laurent Mazare	be9c200cbb	Expose the t5 config fields + allow t5-large. (#1987 )	2024-04-01 20:58:34 +02:00
Santiago Medina	ea0d8d3753	Quantized moondream implementation and BOS token (#1980 ) * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward * Pass bos token at the beginning of tensor. * Quantize moondream. * Forward with image bos token. * Clippy. * Use q4_0 quantization. * Add pointers for sequence and tokens; Remove seq_len conditional	2024-04-01 19:37:54 +02:00
Laurent Mazare	b20acd622c	Update for pyo3 0.21. (#1985 ) * Update for pyo3 0.21. * Also adapt the RL example. * Fix for the pyo3-onnx bindings... * Print details on failures. * Revert pyi.	2024-04-01 17:07:02 +02:00
Laurent Mazare	c7557b65dc	Switch the default to using the faster kernels. (#1978 ) * Switch the default to using the faster kernels. * Add the force-dmmv flag.	2024-04-01 10:00:11 +02:00
Laurent Mazare	cd29c7ccd4	More ggml cuda kernels (#1977 ) * Add more cuda kernels for quantized matmul. * Add the vec-dot bits. * Expose the quantized matmul-vec kernels. * Also include the quantize-q8-1 kernel. * Glue code for the q8-1 quantization. * mm-vec product via q8-1 quantization. * Add a test. * Add a mm test. * Get the test to return some sensible results. * Also test dmmv. * Fix the launch params. * Allow for tweaking the force_dmmv parameter while it's experimental.	2024-04-01 00:15:48 +02:00
Laurent Mazare	f9954b73ba	Add options to use local files + specify a custom repo or branch. (#1973 )	2024-03-31 09:32:50 +02:00
Laurent Mazare	eead1dcead	Clippy fix. (#1972 )	2024-03-31 08:57:40 +02:00
Santiago Medina	92f81d2fcb	Add Moondream transformer implementation and example (#1970 ) * moondream implementation * add moondream example * change config default activation * Add assets and integrate phi mixformer with example * Make use of kv cache and fix seq_len bug; Clean up example code * Add README link to example * Remove pos_embed scaling; Remove assets; Add to README; Expand VisionConfig * Delete image * Use apply instead of forward	2024-03-31 08:54:56 +02:00
Laurent Mazare	3144150b8d	Move the tensor-tools binary in a separate crate. (#1969 )	2024-03-30 15:49:37 +01:00
Laurent Mazare	8ad12a0e81	Add some examples using the MT5 variants. (#1963 )	2024-03-29 18:09:29 +01:00
Laurent Mazare	eb1b27abcd	Readme fix. (#1961 )	2024-03-28 23:24:46 +01:00
Laurent Mazare	708e422456	Qwen MoE model. (#1960 ) * Qwen MoE model. * Add the MoE model to the example. * Fix the scaling. * Readme updates. * Readme tweaks.	2024-03-28 23:10:57 +01:00
Laurent Mazare	c5092f2c29	Add a couple t5 models. (#1958 )	2024-03-28 17:58:06 +01:00
Tigran Zhampeissov	b0340d72ec	CLIP model implementation with example (#1950 ) * CLIP model implementation with example * CLIP Implementation fixes, batch images * CLIP model remove images from git * CLIP model remove unnecessary use of batch_indices	2024-03-28 13:44:12 +01:00
Laurent Mazare	e2b4829531	Support more mistral models. (#1927 ) * Support more mistral models. * Use the appropriate rope parameter.	2024-03-24 08:04:04 +01:00
Laurent Mazare	a00e24d752	Improve the error message on overlong prompts. (#1908 )	2024-03-21 21:08:07 +01:00
Sanchit Gandhi	bb3ee48039	whisper readme (#1899 )	2024-03-21 12:54:09 +01:00
Sanchit Gandhi	0c11e055be	support distil-large-v3 (#1898 )	2024-03-21 11:46:49 +01:00
Laurent Mazare	18036c6ccb	Update the image crate + use the re-exported version. (#1893 ) * Update the image crate + use the re-exported version. * Update to using ab_glyph.	2024-03-21 10:56:41 +01:00
Laurent Mazare	455c42aa72	Avoid copying the data on squeeze and unsqueeze. (#1884 ) * Avoid copying the data on squeeze and unsqueeze. * Fix the quantized llama example. * Unrelated fix for the quantized stable-lm example on cuda. * Fix for mamba on cuda (unrelated to the PR).	2024-03-20 13:04:36 +01:00
Laurent Mazare	f115895b9e	Apply rustfmt. (#1873 )	2024-03-18 21:43:31 +01:00
Gabriel	6a966cf9e0	Add a DQN example to the reinforcement-learning section (#1872 )	2024-03-18 21:22:53 +01:00
Laurent Mazare	58605252e8	Microphone support for the encodec example. (#1866 )	2024-03-18 11:19:46 +01:00
Laurent Mazare	d365ef32d9	Improve the encodec example: handle resampling. (#1865 ) * Improve the encodec example: handle resampling. * Play the audio directly.	2024-03-18 10:09:40 +01:00
Laurent Mazare	a15f859ab4	Fix for the encodec example. (#1861 )	2024-03-17 21:15:12 +01:00
Laurent Mazare	74bf6994b1	Move the image tensor to the appropriate device. (#1856 )	2024-03-16 22:25:46 +01:00
Jani Monoses	e1f9c3776d	StableLM-2 models were updated to use GPT-2 tokenization. (#1847 )	2024-03-14 21:01:36 +01:00
Tyler Rockwood	3318fe30fb	Update gemma README (#1843 ) * Update gemma README * Fixit	2024-03-13 21:41:36 +01:00
Laurent Mazare	56c9d3ee7b	Fix the model path for rwkv. (#1825 )	2024-03-09 11:21:48 +01:00
Laurent Mazare	dd00482ea3	Quantized version of the metavoice model. (#1824 ) * Quantized version of the metavoice model. * Integrate the quantized version of metavoice.	2024-03-09 11:06:04 +01:00
Laurent Mazare	3440cec3a0	Fast CPU kernel for transposed 1d convolutions. (#1822 ) * Fast CPU kernel for transposed 1d convolutions. * Bugfix.	2024-03-08 22:43:07 +01:00
Niklas Hallqvist	0a3487a776	Add a --seed argument to the stable-diffusion example. (#1812 ) * Add a --seed argument to the stable-diffusion example. * Make the case when no seed is specified, that it will not be set, but use the engine's default. This will make the CPU engine work again when no --seed is given, and will cause a bailout when a seed is there, as the engine does not currently support it. --------- Co-authored-by: niklas <niklas@appli.se>	2024-03-08 08:17:36 +01:00
Laurent Mazare	8a99cf7dd2	Add a flag to select the dtype used in metavoice. (#1805 )	2024-03-05 12:16:00 +01:00
Jiayu Liu	924ccae30c	Add an initial Segformer implementation (#1617 ) * add segformer * Make the id2label field optional. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-03-03 16:01:46 +01:00
Laurent Mazare	60dc72b96b	More metavoice tweaks. (#1796 )	2024-03-03 15:05:25 +01:00
Laurent Mazare	20abb72fec	Normalize loudness of the generated audio (#1795 ) * Normalize loudness of the generated audio. * Lints. * One more lint. * Avoid running the bs1770 tests. * Another attempt at discarding doc comments. * Also normalize the loudness in the encodec example.	2024-03-03 14:00:42 +01:00
Laurent Mazare	ca5d727ba2	Use the same padding in metavoice as in the python version. (#1794 )	2024-03-03 12:04:48 +01:00
Laurent Mazare	09e0148cce	Tweaks to run metavoice on metal (#1792 ) * Enable tanh + tweak conv-transpose. * Run the encodec decoding on cpu. * Clippy fixes.	2024-03-03 07:46:44 +01:00

789 Commits