Commit Graph

44 Commits

SHA1 Message Date
d7a273be51 Training:
- Removed a lot of surface (SerializedFileReader ownership is really
  painful).
- Moved example + vision to hf.co version.
- Removed feature gate.
2023-08-28 15:15:01 +02:00
a3f97c143d Bump the crate version + update CHANGELOG. (#628) 2023-08-27 18:17:11 +01:00
0afbc435df Add some configurable legend for yolo detection. (#603)
* Add some configurable legend for yolo detection.

* Clippyness.
2023-08-25 13:50:31 +01:00
97909e5068 Move the yolo model bits in a separate file. (#602)
* Move the yolo model bits in a separate file.

* Improve the drawing.

* Bugfix.
2023-08-25 12:47:55 +01:00
aba1e90797 Add some group parameter to convolutions. (#566)
* Add some group parameter to convolutions.

* Avoid some unnecessary groups checks.

* Move the tensor convolution bits.

* Proper handling of groups.

* Bump the crate version.

* And add a changelog.
2023-08-23 12:58:55 +01:00
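
The `groups` parameter from the commit above is exposed at the tensor level; below is a minimal sketch, assuming the `conv2d(kernel, padding, stride, dilation, groups)` signature:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    // Input: batch 1, 4 channels, 8x8 spatial.
    let x = Tensor::randn(0f32, 1.0, (1, 4, 8, 8), &dev)?;
    // With groups = 2 each filter only sees in_channels / groups = 2
    // input channels, so the kernel is (out_c, in_c / groups, kh, kw).
    let kernel = Tensor::randn(0f32, 1.0, (4, 2, 3, 3), &dev)?;
    let y = x.conv2d(&kernel, /* padding */ 1, /* stride */ 1, /* dilation */ 1, /* groups */ 2)?;
    println!("{:?}", y.dims()); // [1, 4, 8, 8]
    Ok(())
}
```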
a8f61e66cc Bump the crates version to 0.1.2. (#522) 2023-08-20 08:07:07 +01:00
b9661a1c25 Enable the image crate by default in examples (#501)
* Enable the image crate by default so that it's easier to compile the stable diffusion example.

* Also update the readme.
2023-08-18 10:00:05 +01:00
531f23b4d0 Rename vec-dot to vec-ops. (#449)
* Rename vec-dot to vec-ops.

* Also bump the crate version.

* Add a currently empty readme.
2023-08-15 10:48:57 +01:00
90374097dc Cudnn support (#445)
* Add a cudnn feature to be used for conv2d.

* Allocate the proper workspace.

* Only create a single cudnn handle per cuda device.

* Proper cudnn usage.

* Bugfix.
2023-08-14 21:30:41 +01:00
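
What the `cudnn` feature changes is the backend, not the call site; a hedged sketch (assumes a CUDA build with `--features cudnn`):

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    // Requires a CUDA-enabled build; with `--features cudnn` this conv2d
    // dispatches to cudnn, otherwise to the plain CUDA kernels.
    let dev = Device::new_cuda(0)?;
    let x = Tensor::randn(0f32, 1.0, (1, 3, 224, 224), &dev)?;
    let k = Tensor::randn(0f32, 1.0, (16, 3, 3, 3), &dev)?;
    let y = x.conv2d(&k, 1, 1, 1, 1)?;
    println!("{:?}", y.dims()); // [1, 16, 224, 224]
    Ok(())
}
```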
dece0b8a76 Merge pull request #263 from huggingface/book_3
Book 3 (advanced loading + hub)
2023-08-09 16:50:11 +02:00
3a62aee91f Write the generated images using the image crate. (#363)
* Use the image crate to write the generated images.

* Make the dependency optional.
2023-08-09 15:26:44 +01:00
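
A hedged sketch of writing a generated image with the `image` crate; `save_rgb` is a hypothetical helper and the `(3, h, w)` u8 layout is an assumption:

```rust
use candle_core::{Result, Tensor};

// Hypothetical helper: write a (3, height, width) u8 tensor as a PNG.
fn save_rgb(img: &Tensor, path: &str) -> Result<()> {
    let (c, h, w) = img.dims3()?;
    assert_eq!(c, 3, "expected an RGB tensor");
    // (3, h, w) -> (h, w, 3), then flatten to raw row-major bytes.
    let pixels = img.permute((1, 2, 0))?.contiguous()?.flatten_all()?.to_vec1::<u8>()?;
    let buf: image::ImageBuffer<image::Rgb<u8>, Vec<u8>> =
        image::ImageBuffer::from_raw(w as u32, h as u32, pixels)
            .expect("buffer size mismatch");
    buf.save(path).expect("failed to write the image");
    Ok(())
}
```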
b278834267 Support the Accelerate BLAS on macOS. (#325)
* Add the accelerate feature.

* Ffi tweaks.
2023-08-05 17:25:24 +01:00
620f83cf66 Add the candle-datasets crate (#322)
* Move the vision datasets to a separate crate.

* Move the batcher bits.

* Update the readme.

* Move the tiny-stories bits.

---------

Co-authored-by: Jane Doe <jane.doe@example.org>
2023-08-05 08:56:50 +01:00
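
A hedged sketch of the resulting crate split; the `vision::mnist` module path and the `Dataset` field names are assumptions:

```rust
use candle_datasets::vision::mnist;

fn main() -> anyhow::Result<()> {
    // Downloads/loads MNIST and exposes the splits as tensors.
    let m = mnist::load()?;
    println!("train-images: {:?}", m.train_images.dims());
    println!("test-labels: {:?}", m.test_labels.dims());
    Ok(())
}
```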
c11e78b334 Odd rebase artifact. 2023-08-02 18:40:24 +02:00
1b705a426f Remove duplicate. 2023-08-02 18:40:24 +02:00
a44471a305 Adding more details on how to load things.
- Loading with memmap (a hedged sketch follows this entry)
- Loading a sharded tensor
- Moved some snippets to `candle-examples/src/lib.rs`. This is because
  managing book-specific dependencies is a pain: https://github.com/rust-lang/mdBook/issues/706
- This causes a non-aligned inclusion (https://github.com/rust-lang/mdBook/pull/1856) which
  we have to tell fmt to ignore in order to remove.

mdbook might need some more love :)
2023-08-02 18:40:24 +02:00
4fe8a02f88 Update the repo location. (#305) 2023-08-02 11:12:18 +01:00
d38943aadc Add version numbers for all the candle crates (#303)
* Switch to candle-gemm for the time being.

* Add the missing versions.
2023-08-02 10:52:13 +01:00
51e51da896 Rename the candle crate to candle-core (#301)
* Rename to candle-core.

* More candle-core renaming.
2023-08-02 08:20:22 +01:00
a27239f3d9 Add training for the llama2.c example (#296)
* Rework the commands and run inference by default.

* Add the training module and load the training dataset.

* Random dataset iterator.

* Proper valid-loss computation.

* Compute the evaluation loss.

* Add more substance to the training loop.
2023-08-01 17:23:07 +01:00
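
A hedged sketch of the shape of such a training loop with candle-nn; the single linear layer stands in for the real llama2.c model:

```rust
use candle_core::{DType, Device, Result, Tensor};
use candle_nn::{linear, loss, AdamW, Module, Optimizer, ParamsAdamW, VarBuilder, VarMap};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let varmap = VarMap::new();
    let vb = VarBuilder::from_varmap(&varmap, DType::F32, &dev);
    // Stand-in model: 16 features -> 10 classes.
    let model = linear(16, 10, vb)?;
    let mut opt = AdamW::new(varmap.all_vars(), ParamsAdamW::default())?;
    for step in 0..100 {
        // Fake batch; the real example pulls these from a dataset iterator.
        let xs = Tensor::randn(0f32, 1.0, (8, 16), &dev)?;
        let ys = Tensor::zeros(8, DType::U32, &dev)?;
        let logits = model.forward(&xs)?;
        let loss = loss::cross_entropy(&logits, &ys)?;
        opt.backward_step(&loss)?; // backprop + parameter update
        if step % 10 == 0 {
            println!("step {step}: loss {}", loss.to_scalar::<f32>()?);
        }
    }
    // Persist the trained weights (cf. the save-weights flag commit below).
    varmap.save("trained.safetensors")?;
    Ok(())
}
```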
38ff693af0 Add a flag to save the trained weights. (#279) 2023-07-30 15:41:42 +01:00
97181a77c0 Making multiprocess require flash-attn. 2023-07-28 23:31:24 +02:00
4002968cf5 Put back `"dep:half"` 2023-07-28 10:34:21 +00:00
be256a6ba6 Fixing. 2023-07-28 10:23:05 +00:00
d2dea11ef6 Fixing nccl feature. 2023-07-28 12:19:20 +02:00
1735e4831e TP sharding v2 2023-07-27 09:58:14 +02:00
d9f9c859af Add flash attention (#241)
* Add some flash-attn kernel, import the code for flash-attn v2 from Dao-AILab.

* More flash attn.

* Set up the flash attn parameters.

* Get things to compile locally.

* Move the flash attention files in a different directory.

* Build the static C library with nvcc.

* Add more flash attention.

* Update the build part.

* Better caching.

* Exclude flash attention from the default workspace.

* Put flash-attn behind a feature gate.

* Get the flash attn kernel to run.

* Move the flags to a more appropriate place.

* Enable flash attention in llama.

* Use flash attention in llama.
2023-07-26 07:48:10 +01:00
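
A hedged sketch of calling the gated kernel (assumes a CUDA build with `--features flash-attn`, f16 inputs, and a `(batch, seq, heads, head_dim)` layout):

```rust
use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::new_cuda(0)?;
    // (batch, seq_len, num_heads, head_dim), f16 as the kernel expects.
    let q = Tensor::randn(0f32, 1.0, (1, 128, 8, 64), &dev)?.to_dtype(DType::F16)?;
    let k = q.clone();
    let v = q.clone();
    let softmax_scale = 1f32 / (64f32).sqrt();
    // `true` enables the causal mask, as used in llama.
    let out = candle_flash_attn::flash_attn(&q, &k, &v, softmax_scale, true)?;
    println!("{:?}", out.dims());
    Ok(())
}
```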
35b65fed88 Add llama2.c as an example. (#229)
* Start adding llama2.c.

* Model loading.

* Add the llama-v2 model.

* Start converting the weights.

* Rotary embedding tweaks.

* Get the model to generate some tokens.
2023-07-24 09:13:50 +01:00
b8a10425ad Kernel build example (#224)
* Build example kernels.

* Add some sample custom kernel.

* Get the example kernel to compile.

* Add some cuda code.

* More cuda custom op.

* More cuda custom ops.
2023-07-23 07:15:37 +01:00
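
A hedged sketch of the CPU side of a custom op; the CUDA path from the commit above is omitted, and the exact trait surface may differ across versions:

```rust
use candle_core::{CpuStorage, CustomOp1, Layout, Result, Shape, Tensor};

struct Square;

impl CustomOp1 for Square {
    fn name(&self) -> &'static str {
        "square"
    }

    // CPU fallback; a cuda_fwd impl would provide the custom kernel path.
    fn cpu_fwd(&self, storage: &CpuStorage, layout: &Layout) -> Result<(CpuStorage, Shape)> {
        // Assume contiguous f32 data to keep the sketch short.
        let data = storage.as_slice::<f32>()?;
        let out: Vec<f32> = data.iter().map(|v| v * v).collect();
        Ok((CpuStorage::F32(out), layout.shape().clone()))
    }
}

fn main() -> Result<()> {
    let t = Tensor::new(&[1f32, 2., 3.], &candle_core::Device::Cpu)?;
    let sq = t.apply_op1(Square)?;
    println!("{:?}", sq.to_vec1::<f32>()?); // [1.0, 4.0, 9.0]
    Ok(())
}
```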
439321745a Removing candle-hub internal to extract into hf-hub standalone. 2023-07-19 15:04:38 +02:00
b8abe2bb4b Factorize the tokenizers version in the workspace cargo def. (#186) 2023-07-18 06:48:13 +01:00
f0cccd08f0 Bert tracing (#184)
* Add some tracing to bert.

* More tracing.

* Add a flag for tracing.
2023-07-17 19:40:42 +01:00
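
A hedged sketch of what such a tracing flag typically enables in the examples, using `tracing-chrome` and `tracing-subscriber`:

```rust
use tracing_chrome::ChromeLayerBuilder;
use tracing_subscriber::prelude::*;

fn main() {
    // Keep the guard alive: the chrome trace file is flushed on drop.
    let (chrome_layer, _guard) = ChromeLayerBuilder::new().build();
    tracing_subscriber::registry().with(chrome_layer).init();

    // Any `tracing` spans recorded in the model code now end up in a
    // trace-*.json file that chrome://tracing or Perfetto can open.
    let span = tracing::span!(tracing::Level::TRACE, "forward-pass");
    let _enter = span.enter();
}
```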
104f89df31 Centralize the dependency versions and inherit them. (#177) 2023-07-16 07:47:17 +01:00
4ed56d7861 Removing the cuda default.
This seems very important for the many exploring users who are
typically on laptops without GPUs.

More README instructions will come in a follow-up.
2023-07-14 16:52:15 +02:00
ba35d895e7 Sketch the candle-transformers crate. (#147)
* Sketch the candle-transformers crate.

* Format the empty files.
2023-07-12 13:49:31 +01:00
9ce0f1c010 Sketch the candle-nn crate. (#115)
* Sketch the candle-nn crate.

* Tweak the cuda dependencies.

* More cuda tweaks.
2023-07-10 08:50:09 +01:00
115629fe08 Creating a new sync API for candle-hub.
- `api::Api` -> `api::tokio::Api` (and created a new `api::sync::Api`).
- Remove `tokio` from all our examples.
- Using a similar codebase for now instead of ureq (for simplicity).
2023-07-06 15:15:25 +02:00
c297a50960 Add mkl support for matrix multiply. (#86)
* Fix some rebase issues.

* Use mkl instead.

* Use mkl in bert.

* Add the optional mkl feature.

* Conditional compilation based on the mkl feature.

* Add more mkl support.
2023-07-06 11:05:05 +01:00
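
A hedged sketch of the conditional-compilation point above: the call site is identical with or without `--features mkl`, only the matmul backend changes:

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    let a = Tensor::randn(0f32, 1.0, (512, 512), &dev)?;
    let b = Tensor::randn(0f32, 1.0, (512, 512), &dev)?;
    // With `--features mkl` this dispatches to MKL's gemm; without it,
    // the default gemm path is used. User code does not change.
    let c = a.matmul(&b)?;
    println!("{:?}", c.dims()); // [512, 512]
    Ok(())
}
```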
93896f6596 Merge branch 'main' into upgrade_bert 2023-07-05 13:06:33 +01:00
95f378ebb4 Read wav files. 2023-07-05 11:53:58 +01:00
7a6bc6d2dc Mel spectrogram computation (fft bits). 2023-07-05 09:54:12 +01:00
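
A hedged sketch of the "fft bits": a naive O(n^2) DFT in plain Rust, the kind of dependency-free building block a mel spectrogram can be computed on top of (the example's real implementation may differ):

```rust
use std::f32::consts::PI;

/// Naive O(n^2) DFT returning (re, im) pairs; fine for short frames,
/// a real FFT would be used for anything performance sensitive.
fn dft(inp: &[f32]) -> Vec<(f32, f32)> {
    let n = inp.len();
    (0..n)
        .map(|k| {
            inp.iter().enumerate().fold((0f32, 0f32), |(re, im), (j, &x)| {
                let angle = -2.0 * PI * (k * j) as f32 / n as f32;
                (re + x * angle.cos(), im + x * angle.sin())
            })
        })
        .collect()
}

fn main() {
    // Magnitude spectrum of a pure 4-cycle cosine over 16 samples:
    // the energy shows up in bins 4 and 12 (the mirrored frequency).
    let frame: Vec<f32> = (0..16).map(|i| (2.0 * PI * 4.0 * i as f32 / 16.0).cos()).collect();
    for (k, (re, im)) in dft(&frame).iter().enumerate() {
        println!("bin {k}: {:.3}", (re * re + im * im).sqrt());
    }
}
```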
43a007cba4 Upgrading the bert example to work with bert-base-uncased.
- Always take the weights from the hub.
- Optional `model_id` + `revision` to potentially use the safetensors
  version.
- Optional loading for `bert-base-uncased` (`weight` vs `gamma`).
- Take the config from the hub.
2023-07-04 14:12:14 +00:00
cb03364718 Fix the CI. 2023-07-03 11:34:02 +01:00
fdb1acd2ff Move llama into a cargo-examples directory. 2023-07-03 11:30:58 +01:00