candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 02:38:10 +00:00

Author	SHA1	Message	Date
Jani Monoses	e1f9c3776d	StableLM-2 models were updated to use GPT-2 tokenization. (#1847 )	2024-03-14 21:01:36 +01:00
Tyler Rockwood	3318fe30fb	Update gemma README (#1843 ) * Update gemma README * Fixit	2024-03-13 21:41:36 +01:00
Thomas Santerre	2bb9c683b9	Update README.md (#1840 ) Adds the candle-einops to the readme as an external resource	2024-03-13 14:36:25 +01:00
Laurent Mazare	ff03fd3fb3	Expose some helper functions to create quantized models. (#1837 )	2024-03-12 11:30:24 +01:00
Laurent Mazare	df5f69444e	Properly handle the batch dimension in cuda quantized matmul. (#1832 )	2024-03-10 20:23:43 +01:00
Laurent Mazare	0c5eecbc0f	Add some tracing to metavoice. (#1826 )	2024-03-09 12:24:11 +01:00
Laurent Mazare	56c9d3ee7b	Fix the model path for rwkv. (#1825 )	2024-03-09 11:21:48 +01:00
Laurent Mazare	dd00482ea3	Quantized version of the metavoice model. (#1824 ) * Quantized version of the metavoice model. * Integrate the quantized version of metavoice.	2024-03-09 11:06:04 +01:00
Laurent Mazare	936f6a4840	Fix dequantization. (#1823 )	2024-03-08 23:12:13 +01:00
Laurent Mazare	3440cec3a0	Fast CPU kernel for transposed 1d convolutions. (#1822 ) * Fast CPU kernel for transposed 1d convolutions. * Bugfix.	2024-03-08 22:43:07 +01:00
Laurent Mazare	e7fc1daa21	Bump the crate versions to 0.4.2. (#1821 )	2024-03-08 22:01:51 +01:00
Niklas Hallqvist	be5b68cd0b	Metal random-generation bug fixes (#1811 ) * use_resource API misunderstood. It is not additive. Several usages must be bit-ORed together. * The seeding was incorrect and used the address instead of the value of the passed in seed. * Add a check that likely exhibits failure to update the seed between generation of random tensors. * Buffer overrun, the length given to the std::ptr::copy call was in bytes, and not 32-bit units. * By default seed the RNG with a time-based value, so that different runs may produce different output, just like the CPU engine. Use device.set_seed if determinism is warranted. * Revert "By default seed the RNG with a time-based value, so that different runs may produce different output, just like the CPU engine. Use device.set_seed if determinism is warranted." This reverts commit `d7302de9` Discussion in https://github.com/huggingface/candle/pull/1811#issuecomment-1983079119 * The Metal random kernel failed to set element N/2 of tensors with N elements, N being even. The reason was that all threads but thread 0 all created 2 random samples, but thread 0 only one, i.e. an odd number. In order to produce an even number of samples, the early termination of thread 0 should only ever occur for odd sized tensors. * Add a test catching any deterministic tensor element in rand and randn output. --------- Co-authored-by: niklas <niklas@appli.se> Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>	2024-03-08 16:11:50 +01:00
Laurent Mazare	ea984d0421	Expose more printer options. (#1817 )	2024-03-08 15:04:18 +01:00
Laurent Mazare	9634583781	Expose a couple layout methods. (#1816 )	2024-03-08 10:52:22 +01:00
Kirpal Grewal	758366160e	add clone to candle dropout (#1814 )	2024-03-08 08:18:01 +01:00
Niklas Hallqvist	0a3487a776	Add a --seed argument to the stable-diffusion example. (#1812 ) * Add a --seed argument to the stable-diffusion example. * Make the case when no seed is specified, that it will not be set, but use the engine's default. This will make the CPU engine work again when no --seed is given, and will cause a bailout when a seed is there, as the engine does not currently support it. --------- Co-authored-by: niklas <niklas@appli.se>	2024-03-08 08:17:36 +01:00
ivarflakstad	0c09d10f32	Improve metal buffer usage (#1807 ) * Improve metal buffer usage * Clone cpu storage when loading to reduce wait_until_complete calls * Use powers of two for buffer sizes so reuse is more likely. * Select best available buffer by size. * Add count to MetalStorage -> can use buffer with different size Co-authored-by: Chris Fleetwood <christopher.fleetwood@huggingface.co> * Simplify new buffer creation without blit copy. Revert &[] -> Vec * Add documentation on newBufferWithBytes safety / synchronization * Drop unused buffers after command buffer is done syncing. --------- Co-authored-by: Chris Fleetwood <christopher.fleetwood@huggingface.co>	2024-03-07 09:42:34 +01:00
Laurent Mazare	8a99cf7dd2	Add a flag to select the dtype used in metavoice. (#1805 )	2024-03-05 12:16:00 +01:00
Laurent Mazare	bd9ab9bc04	Add a cuda kernel for dequantizing q8_0. (#1804 )	2024-03-05 09:50:37 +01:00
Laurent Mazare	8cc0a183ba	Speaker embeddings computation for metavoice. (#1800 ) * Speaker embeddings computation for metavoice. * Compute the speaker embeddings.	2024-03-04 14:13:01 +01:00
Laurent Mazare	6530932285	Add the new models to the main readme. (#1797 )	2024-03-03 16:25:14 +01:00
Jiayu Liu	924ccae30c	Add an initial Segformer implementation (#1617 ) * add segformer * Make the id2label field optional. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-03-03 16:01:46 +01:00
Laurent Mazare	60dc72b96b	More metavoice tweaks. (#1796 )	2024-03-03 15:05:25 +01:00
Laurent Mazare	20abb72fec	Normalize loudness of the generated audio (#1795 ) * Normalize loudness of the generated audio. * Lints. * One more lint. * Avoid running the bs1770 tests. * Another attempt at discarding doc comments. * Also normalize the loudness in the encodec example.	2024-03-03 14:00:42 +01:00
Laurent Mazare	ca5d727ba2	Use the same padding in metavoice as in the python version. (#1794 )	2024-03-03 12:04:48 +01:00
Laurent Mazare	09e0148cce	Tweaks to run metavoice on metal (#1792 ) * Enable tanh + tweak conv-transpose. * Run the encodec decoding on cpu. * Clippy fixes.	2024-03-03 07:46:44 +01:00
Laurent Mazare	de11623752	Metavoice position fix (#1791 ) * Add the metavoice transformer. * Sketch the speaker-encoder module. * Adding to the metavoice model. * Start adding the metavoice example. * Get some logits out. * Load the second stage model. * Get the second step to run. * Tweak the example. * Add encodec tilting. * Glue the different bits together. * Fix a shape issue. * Use a constant. * BPE tokenization. * Fix the position index in metavoice.	2024-03-02 21:00:35 +01:00
Laurent Mazare	21f1d04976	Add the instruction finetuned gemma variants. (#1790 )	2024-03-02 18:56:59 +01:00
Laurent Mazare	4fff5b51f5	Metavoice - first cut (#1717 ) * Add the metavoice transformer. * Sketch the speaker-encoder module. * Adding to the metavoice model. * Start adding the metavoice example. * Get some logits out. * Load the second stage model. * Get the second step to run. * Tweak the example. * Add encodec tilting. * Glue the different bits together. * Fix a shape issue. * Use a constant. * BPE tokenization. * Add a warning.	2024-03-02 18:50:01 +01:00
Laurent Mazare	314630638d	Rustfmt fix. (#1788 )	2024-03-02 10:35:07 +01:00
Frkri	3e3def4134	Update StableLM config (#1787 )	2024-03-02 09:56:57 +01:00
Jack Shih	6980774a91	fix rwkv example eos token (#1785 )	2024-03-01 10:22:28 +01:00
Laurent Mazare	64d4038e4f	Mention rwkv v6 in the readmes. (#1784 )	2024-03-01 08:58:30 +01:00
Jani Monoses	979deaca07	EfficientVit (MSRA) model (#1783 ) * Add EfficientVit (Microsoft Research Asia) model. * Mention models in README	2024-03-01 08:53:52 +01:00
Jack Shih	b485e4b6ee	add models of rwkv v6 and quantized rwkv v6 (#1781 ) * add models of rwkv v6 and quantized rwkv v6 * fix ci clippy fail	2024-03-01 08:37:56 +01:00
laurent	2c95b7394a	Handle Q5_0 and Q5_1 quants in cuda.	2024-02-29 10:54:01 +01:00
Laurent Mazare	4fd00b8900	Add the StarCoder2 model. (#1779 ) * Add the StarCoder2 model. * Add the example code and get things to work. * And also tweak the readme.	2024-02-28 21:02:41 +01:00
Laurent Mazare	57267cd536	Add a flag to force running the quantized model on CPUs. (#1778 ) * Add a flag to force running the quantized model on CPUs. * Add encodec to the readme.	2024-02-28 14:58:42 +01:00
Laurent Mazare	60ee5cfd4d	Support more modes in the encodec example. (#1777 ) * Support more modes in the encodec example. * Remove the old encodec model from the musicgen bits.	2024-02-28 09:22:33 +01:00
Laurent Mazare	56e44aabe3	Make some dependencies optional in the examples. (#1776 )	2024-02-28 07:17:03 +01:00
Laurent Mazare	d0aca6c3c6	Encodec encoding demo. (#1775 )	2024-02-28 06:49:03 +01:00
Laurent Mazare	15e8644149	Apply dilations in the encodec model. (#1772 ) * Apply dilations in the encodec model. * Add some encoding bits.	2024-02-27 23:26:35 +01:00
Laurent Mazare	0c49e95dfb	Encodec model. (#1771 ) * Encodec model. * Fixes. * Add the padding functions. * Get the LSTM bit to work. * Get the encodec model to generate some tokens (decoder only for now). * Minor tweak. * Minor tweak.	2024-02-27 22:59:40 +01:00
Laurent Mazare	205767f9de	Avoid tensor copying in the quantized example. (#1770 )	2024-02-27 20:32:30 +01:00
Laurent Mazare	5e526abc8c	Bump the version number to 0.4.1. (#1768 ) * Fix the block size for some cuda kernels. * Bump the version number to 0.4.1.	2024-02-27 14:19:59 +01:00
Laurent Mazare	6400e1b0a0	Fix the block size for some cuda kernels. (#1767 )	2024-02-27 14:08:33 +01:00
Laurent Mazare	32544a2ad6	Add an option to split the prompt. (#1766 )	2024-02-27 11:24:11 +01:00
Laurent Mazare	badf886583	Cuda kernel for dequantizing q8k. (#1760 ) * Cuda kernel for dequantizing q8k. * Clippy lints.	2024-02-26 08:42:44 +01:00
Jack Shih	918136ba46	add quantized rwkv v5 model (#1743 ) * add quantized rwkv v5 model * Integrate the quantized rwkv model in the initial example. --------- Co-authored-by: laurent <laurent.mazare@gmail.com>	2024-02-25 21:43:40 +01:00
Laurent Mazare	1a6043af51	Tweak the VarMap set type. (#1758 )	2024-02-25 20:50:08 +01:00

1846 Commits