Commit Graph

131 Commits

Author SHA1 Message Date
a193bf5f60 Another gemm update. (#1088) 2023-10-14 09:36:52 +01:00
eefad2b95f Update to gemm 0.16.1 (#1083) 2023-10-13 06:40:20 +01:00
5e6df4a3f7 Update to gemm-0.16. (#1082)
* Update to gemm-0.16.

* Enable wasm-simd128.
2023-10-12 21:56:59 +01:00
096dee7073 Bump the version to 0.3.0. (#1014)
* Bump the version to 0.3.0.

* Changelog update.
2023-10-01 13:51:57 +01:00
667f01c173 Simd128 vec-dot for q4_0. (#974)
* Simd128 vec-dot for q4_0.

* Bugfix.

* Add wasm tests.

* Bugfix for the q40 vecdot.

* More quantization tests.
2023-09-27 14:15:30 +01:00
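For flavor, a minimal sketch of a simd128 dot product in the spirit of this vec-dot, assuming a wasm32 target built with `-C target-feature=+simd128`; the real q4_0 kernel additionally decodes 4-bit quantized blocks, which is omitted here.

```rust
#[cfg(target_arch = "wasm32")]
fn dot_f32_simd128(a: &[f32], b: &[f32]) -> f32 {
    use core::arch::wasm32::*;
    assert_eq!(a.len(), b.len());
    let chunks = a.len() / 4;
    // Four f32 lanes per v128 accumulator.
    let mut acc = f32x4_splat(0.0);
    for i in 0..chunks {
        // Safety: indices 4*i..4*i+4 are in bounds for both slices.
        let (va, vb) = unsafe {
            (
                v128_load(a.as_ptr().add(4 * i) as *const v128),
                v128_load(b.as_ptr().add(4 * i) as *const v128),
            )
        };
        acc = f32x4_add(acc, f32x4_mul(va, vb));
    }
    // Horizontal reduction of the four lanes, then a scalar tail.
    let mut sum = f32x4_extract_lane::<0>(acc)
        + f32x4_extract_lane::<1>(acc)
        + f32x4_extract_lane::<2>(acc)
        + f32x4_extract_lane::<3>(acc);
    for i in 4 * chunks..a.len() {
        sum += a[i] * b[i];
    }
    sum
}
```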
29bd6b2979 Phi 1.5 wasm module (#966)
* add phi wasm module

* replace input with textarea

* trim input prompt

* stop on <|endoftext|>

* formatting

* clean up

* add blurb, and syntax highlighting

* add phi-v1.5 wasm

* add note

* hide Options on details

* add first token to generated text

* whitespaces for new line

* fix: abort -> aborted
2023-09-27 06:07:11 +01:00
ccf352f3d1 Use yoke to provide a self-referential container for mmaped safetensor files. (#939)
* Use yoke to provide a self-referential container for mmaped safetensor files.

* Add the new self-owned type for safetensor files without removing the previous version.

* Add routing.

* Add an initializer for the case of multiple files.
2023-09-23 15:43:11 +01:00
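Some context on the yoke commit: `memmap2::Mmap` owns the bytes while `safetensors::SafeTensors<'a>` borrows them, so keeping both in one movable value needs a self-referential container. A hedged sketch of the pattern, assuming yoke's `derive` feature plus the memmap2 and safetensors crates (type names here are illustrative, not candle's actual API):

```rust
use std::sync::Arc;

use memmap2::Mmap;
use safetensors::SafeTensors;
use yoke::{Yoke, Yokeable};

// Wrapper so Yokeable can be derived for the borrowed view.
#[derive(Yokeable)]
struct SafeTensors_<'a>(SafeTensors<'a>);

// Ties the parsed view to the mmap that owns its bytes.
struct MmapedFile(Yoke<SafeTensors_<'static>, Arc<Mmap>>);

fn load(path: &str) -> anyhow::Result<MmapedFile> {
    let file = std::fs::File::open(path)?;
    // Safety: the file must not be truncated or modified while mapped.
    let mmap = Arc::new(unsafe { Mmap::map(&file)? });
    let yoke = Yoke::try_attach_to_cart(mmap, |bytes: &Mmap| {
        SafeTensors::deserialize(bytes).map(SafeTensors_)
    })?;
    Ok(MmapedFile(yoke))
}
```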
19e52e5007 T5 Wasm (#918)
* init t5 wasm model

* split workers for each model

* clean up

* add some ui

* readme

* index

* typo

* remove cache param, clear_kv_cache

* add max_length as param

* add model tasks option to ui

* add method to load quantized gguf from buffer

* Add quantized wasm module

* add quantized models to UI, dynamic import wasms

* link to quantized

* fix copy

* fix ModelEncoder

* fix README.md
2023-09-22 15:31:10 +01:00
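On "add method to load quantized gguf from buffer": assuming candle's quantized gguf reader (module paths may differ across versions), the parser takes any `Read + Seek`, so bytes fetched into memory by the browser can be parsed from a `Cursor` with no filesystem involved. A sketch:

```rust
use std::io::Cursor;

use candle_core::quantized::gguf_file;

/// Parse a gguf model from an in-memory buffer, e.g. one fetched over
/// HTTP in a wasm worker.
fn load_gguf(bytes: Vec<u8>) -> candle_core::Result<gguf_file::Content> {
    let mut cursor = Cursor::new(bytes);
    gguf_file::Content::read(&mut cursor)
}
```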
7ad82b87e4 BERT Wasm (#902)
* implement wasm module

* add example to workspace

* add UI to explore semantic similarity

* change status messages

* formatting

* minor changes
2023-09-19 21:31:37 +01:00
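The similarity explored by that UI is, in all likelihood, plain cosine similarity between sentence embeddings; a minimal sketch of the scoring step (the wasm module itself computes the embeddings with BERT first):

```rust
/// Cosine similarity between two embedding vectors: the dot product
/// normalized by both magnitudes, in [-1, 1] for non-zero inputs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}
```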
7dd8e12472 Bump the crate versions to v0.2.3. (#886)
* Bump the crate version.

* Also update the python bindings.
2023-09-18 12:14:03 +01:00
ef8cd8fea0 Update the candle-gemm version. (#885) 2023-09-18 09:36:20 +01:00
2257f4d475 Bump the crate version + update the changelog. (#822) 2023-09-12 06:39:24 +01:00
584171cae1 Add a wasm module for the segment anything example. (#797) 2023-09-10 12:29:37 +01:00
618f4e4c78 Add some documentation. (#673)
* Add some documentation.

* Bump the crate version.
2023-08-30 11:54:00 +01:00
4ed202447e Upgrading hf-hub. 2023-08-29 14:14:26 +02:00
dd06d93d0b Cleanup:
- Moved the book from `examples` to `candle-book` proper (overlapping the book and the lib structures)
2023-08-28 15:15:26 +02:00
a3f97c143d Bump the crate version + update CHANGELOG. (#628) 2023-08-27 18:17:11 +01:00
0afbc435df Add some configurable legend for yolo detection. (#603)
* Add some configurable legend for yolo detection.

* Clippyness.
2023-08-25 13:50:31 +01:00
97909e5068 Move the yolo model bits in a separate file. (#602)
* Move the yolo model bits in a separate file.

* Improve the drawing.

* Bugfix.
2023-08-25 12:47:55 +01:00
d728e646c2 Use resolver 2 explicitly. (#597) 2023-08-25 09:35:40 +01:00
aba1e90797 Add some group parameter to convolutions. (#566)
* Add some group parameter to convolutions.

* Avoid some unnecessary groups checks.

* Move the tensor convolution bits.

* Proper handling of groups.

* Bump the crate version.

* And add a changelog.
2023-08-23 12:58:55 +01:00
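A hedged usage sketch for the new group parameter, written against candle's current `conv2d` signature (the argument order at the time of this commit may have differed); with `groups` equal to the channel count this is a depthwise convolution:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let xs = Tensor::randn(0f32, 1f32, (1, 8, 16, 16), &dev)?; // NCHW input
    let ws = Tensor::randn(0f32, 1f32, (8, 1, 3, 3), &dev)?; // (c_out, c_in / groups, kh, kw)
    // padding 1, stride 1, dilation 1, groups 8: each channel gets its own 3x3 filter.
    let ys = xs.conv2d(&ws, 1, 1, 1, 8)?;
    assert_eq!(ys.dims(), &[1, 8, 16, 16]);
    Ok(())
}
```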
20ce3e9f39 Sketch the yolo wasm example. (#546)
* Sketch the yolo wasm example.

* Web ui.

* Get the web ui to work.

* UI tweaks.

* More UI tweaks.

* Use the natural width/height.

* Add a link to the hf space in the readme.
2023-08-22 11:56:43 +01:00
a8f61e66cc Bump the crates version to 0.1.2. (#522) 2023-08-20 08:07:07 +01:00
531f23b4d0 Rename vec-dot to vec-ops. (#449)
* Rename vec-dot to vec-ops.

* Also bump the crate version.

* Add a currently empty readme.
2023-08-15 10:48:57 +01:00
495e0b7580 Simd support (#448)
* Import the simd intrinsics in candle-core.

* simd version of reduce-sum.

* Bugfix.

* Fix some clippy lints.
2023-08-15 09:50:38 +01:00
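The core idea of a simd reduce-sum, sketched here without intrinsics: accumulate into several independent lanes so the additions have no serial dependency chain, then combine at the end. The actual commit does this with the platform's simd registers.

```rust
/// Multi-accumulator sum; eight independent partial sums let the adds
/// proceed in parallel, which is what the simd version exploits.
fn reduce_sum(xs: &[f32]) -> f32 {
    let mut acc = [0f32; 8];
    let chunks = xs.chunks_exact(8);
    let tail = chunks.remainder();
    for c in chunks {
        for i in 0..8 {
            acc[i] += c[i];
        }
    }
    acc.iter().sum::<f32>() + tail.iter().sum::<f32>()
}
```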
c84883ecf2 Add a cuda kernel for upsampling. (#441)
* Add a cuda kernel for upsampling.

* Update for the latest tokenizers version.
2023-08-14 13:12:17 +01:00
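For reference, the nearest-neighbor indexing such an upsampling kernel computes, sketched on the cpu for a single channel (the exact rounding convention in the cuda kernel may differ; on the GPU one thread handles one output element):

```rust
fn upsample_nearest2d(src: &[f32], h: usize, w: usize, oh: usize, ow: usize) -> Vec<f32> {
    let mut dst = vec![0f32; oh * ow];
    for y in 0..oh {
        let sy = (y * h) / oh; // nearest source row
        for x in 0..ow {
            let sx = (x * w) / ow; // nearest source column
            dst[y * ow + x] = src[sy * w + sx];
        }
    }
    dst
}
```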
e29c7809ec Parallelise the CPU kernels for the conv ops. (#401)
* Parallelise the conv2d op.

* Tighter control on threading.

* Also parallelise conv1d.

* Add some safety comment.
2023-08-11 05:51:58 +01:00
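A sketch of what parallelising a conv kernel over output channels looks like with rayon (a dependency of candle's cpu backend); batch, padding and stride are omitted, and each thread owns one output channel's slice so no synchronisation is needed:

```rust
use rayon::prelude::*;

fn conv1d_par(
    input: &[f32],   // (c_in, l_in), row-major
    kernel: &[f32],  // (c_out, c_in, k)
    c_in: usize,
    l_in: usize,
    k: usize,
    out: &mut [f32], // (c_out, l_out)
) {
    let l_out = l_in - k + 1;
    out.par_chunks_mut(l_out).enumerate().for_each(|(oc, dst)| {
        for pos in 0..l_out {
            let mut sum = 0f32;
            for ic in 0..c_in {
                for off in 0..k {
                    sum += input[ic * l_in + pos + off]
                        * kernel[(oc * c_in + ic) * k + off];
                }
            }
            dst[pos] = sum;
        }
    });
}
```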
379eadc68e Working now. 2023-08-10 19:43:25 +02:00
7e4fbc1e17 [DO NOT MERGE] temporary PR so users can try it out on older GPUs. 2023-08-10 19:36:31 +02:00
c8039579a5 Conv1d optimize (#392)
* Reorder the conv1d loops in the cpu backend.

* Optimize the 1d convolution.

* Conv1D optimize.

* Fix some clippy lints.
2023-08-10 15:23:52 +01:00
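On "Reorder the conv1d loops": moving the kernel offset to the outer loop turns the inner loop into a contiguous streaming pass over the input, which auto-vectorizes far better than gathering all k taps per output position. A single-channel sketch of the reordered form:

```rust
fn conv1d_single_channel(input: &[f32], kernel: &[f32], out: &mut [f32]) {
    let k = kernel.len();
    let l_out = input.len() - k + 1;
    out[..l_out].fill(0.0);
    for off in 0..k {
        let w = kernel[off];
        // Contiguous, dependency-free inner loop: ideal for vectorization.
        for pos in 0..l_out {
            out[pos] += w * input[pos + off];
        }
    }
}
```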
Lei 3bbc08a8df Fix randn cpu (#382)
* Change distributions

Standard generates in [0, 1); Normal is the correct distribution here.

* Add test

Not sure if this is the best place to put the test

* Remove unnecessary use
2023-08-10 05:33:44 +01:00
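The mix-up this commit fixes is easy to reproduce with the rand crates (candle has its own sampling plumbing; this only illustrates the two distributions): `Standard` is uniform over [0, 1), while `randn` must sample a Gaussian, i.e. `Normal`.

```rust
use rand::distributions::Standard;
use rand::{thread_rng, Rng};
use rand_distr::Normal;

fn main() {
    let mut rng = thread_rng();
    // Uniform in [0, 1): what the buggy randn was effectively returning.
    let uniform: f32 = rng.sample(Standard);
    // Gaussian, mean 0 and std 1: what randn should return.
    let normal = Normal::new(0f32, 1f32).unwrap();
    let gaussian: f32 = rng.sample(normal);
    println!("uniform: {uniform}, gaussian: {gaussian}");
}
```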
da26e2832c Update gemm to 0.15.6. (#378) 2023-08-09 21:04:28 +01:00
3a62aee91f Write the generated images using the image crate. (#363)
* Use the image crate to write the generated images.

* Make the dependency optional.
2023-08-09 15:26:44 +01:00
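A sketch of the writing side, assuming the `image` crate's `save_buffer` helper (the buffer layout and output path are illustrative): a flat RGB8 buffer goes straight to a png.

```rust
fn save_image(pixels: &[u8], width: u32, height: u32) -> anyhow::Result<()> {
    // Expects pixels.len() == (width * height * 3) as usize, row-major RGB8.
    image::save_buffer("generated.png", pixels, width, height, image::ColorType::Rgb8)?;
    Ok(())
}
```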
e72ba0b9e7 Add the license files. (#335) 2023-08-07 14:11:27 +01:00
b278834267 Support the Accelerate BLAS on macOS. (#325)
* Add the accelerate feature.

* Ffi tweaks.
2023-08-05 17:25:24 +01:00
620f83cf66 Add the candle-datasets crate (#322)
* Move the vision datasets to a separate crate.

* Move the batcher bits.

* Update the readme.

* Move the tiny-stories bits.

---------

Co-authored-by: Jane Doe <jane.doe@example.org>
2023-08-05 08:56:50 +01:00
4fe8a02f88 Update the repo location. (#305) 2023-08-02 11:12:18 +01:00
d38943aadc Add version numbers for all the candle crates (#303)
* Switch to candle-gemm for the time being.

* Add the missing versions.
2023-08-02 10:52:13 +01:00
6e33ff62d6 Update cudarc now that it includes the cublas-f16 and nccl changes. (#300) 2023-08-02 05:54:28 +01:00
d2dea11ef6 Fixing nccl feature. 2023-07-28 12:19:20 +02:00
4f260ef025 Merge pull request #216 from LaurentMazare/llama_multiprocess2
TP sharding v2
2023-07-28 08:06:13 +01:00
ca479a873e Upgrading hf-hub to 0.2.0 (Modified API to not pass the Repo around all the time)
2023-07-27 20:05:02 +02:00
b7814f66b4 PyO3 is back. 2023-07-27 09:58:47 +02:00
ed58de7551 Fixed TP sharded version. 2023-07-27 09:58:46 +02:00
1735e4831e TP sharding v2 2023-07-27 09:58:14 +02:00
6475bfadfe Simplify Tensor::randn. (#255)
* Simplify Tensor::randn.

* Also switch Tensor::rand to use a generic dtype.

* Support sampling for f16.

* Cleanup.
2023-07-27 07:40:36 +01:00
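With the simplified API, the float arguments drive the result dtype, which is what lets f16 sampling fall out naturally; a usage sketch matching candle's current signatures:

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    let a = Tensor::randn(0f32, 1f32, (2, 3), &dev)?; // gaussian, mean 0, std 1
    let b = Tensor::rand(0f64, 1f64, (2, 3), &dev)?;  // uniform in [0, 1)
    assert_eq!(a.dtype(), DType::F32); // dtype follows the mean/std arguments
    assert_eq!(b.dtype(), DType::F64);
    Ok(())
}
```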
89fd988836 Update to the latest gemm. (#250) 2023-07-26 17:00:02 +01:00
d9f9c859af Add flash attention (#241)
* Add some flash-attn kernel, import the code for flash-attn v2 from Dao-AILab.

* More flash attn.

* Set up the flash attn parameters.

* Get things to compile locally.

* Move the flash attention files in a different directory.

* Build the static C library with nvcc.

* Add more flash attention.

* Update the build part.

* Better caching.

* Exclude flash attention from the default workspace.

* Put flash-attn behind a feature gate.

* Get the flash attn kernel to run.

* Move the flags to a more appropriate place.

* Enable flash attention in llama.

* Use flash attention in llama.
2023-07-26 07:48:10 +01:00
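A hedged sketch of calling the feature-gated kernel; `candle_flash_attn::flash_attn` and its argument order are assumptions based on the crate this commit introduces (CUDA only, f16/bf16, with q/k/v shaped (batch, seq_len, num_heads, head_dim)):

```rust
use candle_core::Tensor;

#[cfg(feature = "flash-attn")]
fn attn(q: &Tensor, k: &Tensor, v: &Tensor, head_dim: usize) -> candle_core::Result<Tensor> {
    // Standard 1/sqrt(d) attention scaling, fused into the kernel.
    let softmax_scale = 1f32 / (head_dim as f32).sqrt();
    candle_flash_attn::flash_attn(q, k, v, softmax_scale, /* causal */ true)
}
```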
5a26cba733 Re-organize the wasm examples (#231)
* Move the whisper example.

* More renaming.

* Add llama2 as a new wasm example.

* Live generation.

* More of the llama wasm example.

* Formatting.
2023-07-24 12:36:02 +01:00
dc416243a3 Bump the hf-hub dependency to 0.1.3. (#206) 2023-07-20 07:27:52 +01:00