candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Nicolas Patry	38ac50eeda	Adding some doc + Extended `stack` to work with extra final dimensions.	2023-07-10 14:51:10 +02:00
Nicolas Patry	204618b7d3	Merge pull request #118 from LaurentMazare/readme_update Expanding a bit the README	2023-07-10 13:12:23 +02:00
Nicolas Patry	868743b8b9	Expanding a bit the README	2023-07-10 12:51:37 +02:00
Laurent Mazare	89a5b602a6	Move the conv1d layer to candle_nn. (#117)	2023-07-10 11:02:06 +01:00
Laurent Mazare	b06e1a7e54	[nn] Move the Embedding and Activation parts. (#116) * Share the Embedding and Activation parts. * Tweak some activations.	2023-07-10 10:24:52 +01:00
Laurent Mazare	9ce0f1c010	Sketch the candle-nn crate. (#115) * Sketch the candle-nn crate. * Tweak the cuda dependencies. * More cuda tweaks.	2023-07-10 08:50:09 +01:00
Laurent Mazare	bc3be6f9b0	Add the elu cuda kernel. (#114)	2023-07-10 07:57:01 +01:00
Laurent Mazare	270997a055	Add the elu op. (#113)	2023-07-09 21:56:31 +01:00
Laurent Mazare	ea5dfa69bc	Sketching the musicgen model. (#66) * Skeleton files for musicgen. * Add a musicgen model module. * Sketch the model loading. * Start adding the forward pass. * More forward pass. * Positional embeddings. * Forward for the decoder layers. * Add an empty function. * Fix the musicgen weight names. * More musicgen modeling. * Add the T5 loading bits. * Add the encodec config. * Add the encodec module hierarchy. * More Encodec modeling. * Encodec modeling. * Encodec modeling. * Add more to the encodec modeling. * Load the weights. * Populate the resnet blocks. * Also load the conv transpose weights. * Split musicgen in multiple files.	2023-07-09 19:53:35 +01:00
Laurent Mazare	c187f347bf	Make it easier to use whisper samples from the repo. (#112) * Make it easier to use samples from the repo. * Use f32 for accumulation in the f16/bf16 kernels.	2023-07-08 18:48:27 +01:00
Laurent Mazare	eb64ad0d4d	Cuda kernel for the conv1d op (#111) * Boilerplate code for conv1d. * Boilerplate code for conv1d. * More boilerplate for conv1d. * Conv1d work. * Get the conv1d cuda kernel to work. * Conv1d support when no batch dim.	2023-07-08 18:13:25 +01:00
Laurent Mazare	5c3864f9f7	Add more sum tests. (#110) * Add some tests for the sum. * More sum testing.	2023-07-08 13:15:36 +01:00
Laurent Mazare	e676f85f00	Sketch a fast cuda kernel for reduce-sum. (#109) * Sketch a fast cuda kernel for reduce-sum. * Sketch the rust support code for the fast sum kernel. * More work on the fast kernel. * Add some testing ground. * A couple fixes for the fast sum kernel.	2023-07-08 12:43:56 +01:00
Laurent Mazare	33479c5f1b	Add some very simple sum benchmark. (#108) * Add some very simple sum benchmark. * Rename the file.	2023-07-08 08:39:27 +01:00
Laurent Mazare	f35cfc5e97	Sample with temperature. (#106)	2023-07-07 18:12:25 +01:00
Laurent Mazare	03dffe9ecc	Use F32 for the reduce ops. (#105)	2023-07-07 17:55:21 +01:00
Laurent Mazare	e923b3adc2	Add a KV cache to falcon. (#104)	2023-07-07 17:24:38 +01:00
Laurent Mazare	05ff1cff66	Add some caching to the causal mask. (#103)	2023-07-07 12:56:44 +01:00
Nicolas Patry	65937612d0	Merge pull request #91 from LaurentMazare/tweak_parallel_download Getting tokio tasks stuck on smaller machines.	2023-07-07 09:43:55 +02:00
Nicolas Patry	2df044f9a1	Clippy after rebase.	2023-07-07 09:22:09 +02:00
Nicolas Patry	1ec221a749	Fixing falcon example.	2023-07-07 09:13:55 +02:00
Nicolas Patry	514b171f75	Getting tokio tasks stuck on smaller machines.	2023-07-07 09:13:28 +02:00
Laurent Mazare	d38a926c14	Convert the logits to f32 before extracting them. (#102)	2023-07-07 08:07:57 +01:00
Laurent Mazare	02b5c38049	Use cublas bf16. (#101)	2023-07-07 08:00:12 +01:00
Laurent Mazare	c71a38deb7	Tweak the include order to include math.h first. (#100)	2023-07-07 06:47:25 +01:00
Laurent Mazare	f114394456	Include the math.h file to get access to constants. (#99)	2023-07-07 06:42:57 +01:00
Laurent Mazare	bac4ef40f3	Add some text generation pipeline for falcon. (#98)	2023-07-07 06:34:22 +01:00
Laurent Mazare	2b8e8c9f14	Bugfixes. (#97)	2023-07-06 23:26:11 +01:00
Laurent Mazare	a3f3b93d16	Add the call to dense in the attention layer. (#96)	2023-07-06 23:22:08 +01:00
Nicolas Patry	0a2c82e301	Merge pull request #92 from LaurentMazare/sync_hub Creating new sync Api for `candle-hub`.	2023-07-07 00:10:47 +02:00
Nicolas Patry	666c6f07a1	Merge pull request #88 from LaurentMazare/fix_unsafe_loads Fixing unsafe slow load (memcpy).	2023-07-07 00:09:56 +02:00
Nicolas Patry	ce27073feb	Update candle-core/src/safetensors.rs	2023-07-06 23:59:54 +02:00
Laurent Mazare	0f679fe42e	Fix some shape issues in falcon. (#95) * Fix some shape issues. * Use different dtypes.	2023-07-06 19:23:54 +01:00
Laurent Mazare	4afa461b34	Sketch the Falcon model. (#93) * Sketch the Falcon model. * Add more substance to the falcon example. * Falcon (wip). * Falcon (wip again). * Falcon inference. * Get the weights from the api and properly generate the model. * Use the proper model. * Fix the file/revision names. * Fix bias handling. * Recompute the rot embeddings. * Fix the input shape. * Add the release-with-debug profile. * Silly bugfix. * More bugfixes. * Stricter shape checking in matmul.	2023-07-06 19:01:21 +01:00
Nicolas Patry	cae9212b70	Merge pull request #89 from LaurentMazare/extending_bert Enabling `roberta` for the example (it's the same model as Bert, with just different naming.)	2023-07-06 16:29:26 +02:00
Nicolas Patry	115629fe08	Creating new sync Api for `candle-hub`. - `api::Api` -> `api::tokio::api` (And created new `api::sync::Api`). - Remove `tokio` from all our examples. - Using similar codebase for now instead of ureq (for simplicity).	2023-07-06 15:15:25 +02:00
Laurent Mazare	f1e29cd405	Allow using mkl in tests. (#90)	2023-07-06 13:25:05 +01:00
Nicolas Patry	3f291bdf9d	Enabling `roberta` for the example (it's the same model as Bert, with just different naming.)	2023-07-06 13:25:21 +02:00
Nicolas Patry	054717e236	Fixing unsafe slow load (memcpy). - Without the annotation, I think the rust compiler assumes it's all u8. It did segfault trying to load `Roberta`.	2023-07-06 13:14:33 +02:00
Laurent Mazare	dd60bd84bb	MKL adjustments. (#87)	2023-07-06 11:37:27 +01:00
Laurent Mazare	c297a50960	Add mkl support for matrix multiply. (#86) * Fix some rebase issues. * Use mkl instead. * Use mkl in bert. * Add the optional mkl feature. * Conditional compilation based on the mkl feature. * Add more mkl support.	2023-07-06 11:05:05 +01:00
Laurent Mazare	cd230d26fe	Whisper tweaks (#85) * Isolate the decoding bits of the whisper example. * Decode -> Decoder. * Add the suppress tokens filter. * More suppress tokens.	2023-07-06 09:13:20 +01:00
Laurent Mazare	be9b493a75	Merge pull request #84 from LaurentMazare/whisper-cosmetic Add the original whisper names as comment.	2023-07-06 07:57:46 +01:00
laurent	d3418f1cff	Add the original whisper names as comment.	2023-07-06 07:57:03 +01:00
Laurent Mazare	19ab5ea411	Merge pull request #78 from LaurentMazare/whisper_update Adapting whisper for Hub use.	2023-07-06 07:21:58 +01:00
Laurent Mazare	80d51ca088	Merge pull request #83 from LaurentMazare/dim-index-cat Support dim indexes in cat.	2023-07-05 20:43:05 +01:00
laurent	e2bfbcb79c	Support dim indexes in cat.	2023-07-05 20:39:08 +01:00
Laurent Mazare	fc2ffcc72b	Merge pull request #82 from LaurentMazare/dim-index Add a simpler way to specify the dim index for some ops.	2023-07-05 20:24:43 +01:00
laurent	2c3d871b2e	Add a simpler way to specify the dim index for some ops.	2023-07-05 20:22:43 +01:00
Nicolas Patry	b7388bbf71	Merge pull request #81 from LaurentMazare/fix_kernel_build Fixing the cached build.	2023-07-05 18:21:40 +02:00

536 Commits