candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Author	SHA1	Message	Date
Laurent Mazare	3cd7e7b51d	Fuse the rel-pos additions via a custom-op. (#786) * Fuse the rel-pos additions via a custom-op. * Run with rayon. * Add more tracing.	2023-09-09 10:46:09 +01:00
Laurent Mazare	acf8f10ae1	Get the comparison operation to work on scalar values. (#780) * Get the comparison operation to work on scalar values. * Add some time measurement.	2023-09-08 20:13:29 +01:00
Laurent Mazare	0906acab91	Automatic mask generation (#779) * A few more contiguous fixes for cuda. * Mask generation. * Generic bbox. * Generate all the masks.	2023-09-08 19:11:34 +01:00
Laurent Mazare	158ff3c609	Add tracing to segment-anything (#777) * Tracing support for segment-anything. * More tracing. * Handle the empty slice case.	2023-09-08 15:31:29 +01:00
Laurent Mazare	e5703d2f56	Draw the mask on a merged image. (#775) * Draw the mask on a merged image. * Clippy fix. * Enable the target point by default. * Add to the readme.	2023-09-08 14:04:34 +01:00
Laurent Mazare	28c87f6a34	Automatic mask generator + point base mask (#773) * Add more to the automatic mask generator. * Add the target point. * Fix. * Remove the allow-unused. * Mask post-processing.	2023-09-08 12:26:56 +01:00
Laurent Mazare	c1453f00b1	Improve the safetensor loading in the segment-anything example. (#772) * Improve the safetensor loading in the segment-anything example. * Properly handle the labels when embedding the point prompts.	2023-09-08 09:39:10 +01:00
Laurent Mazare	989a4807b1	Use shape with holes. (#771)	2023-09-08 08:50:27 +01:00
Laurent Mazare	3898e500de	Generate a mask image + the scaled input image. (#769) * Also round-trip the original image. * Make it possible to use a safetensors input.	2023-09-08 05:53:08 +01:00
Laurent Mazare	79c27fc489	Segment-anything fixes: avoid normalizing twice. (#767) * Segment-anything fixes: avoid normalizing twice. * More fixes for the image aspect ratio.	2023-09-07 21:45:16 +01:00
Laurent Mazare	7396b8ed1a	Segment Anything - process images (#766) * Start processing images. * Add LayerNorm2d. * Properly use LayerNorm2d. * Tweak eps. * Use LayerNorm on inputs with a rank different from 3. * Window partitioning. * Fix a couple todos. * More todos. * Hard-code the einsums. * More padding support. * Some sizes tweaks. * Use the hub to get the weights. * Use a batch matmul. * Tweaks. * More fixes. * Get some predictions to be generated.	2023-09-07 19:22:45 +01:00
Laurent Mazare	7b50f3e106	More segment-anything again. (#764) * More segment-anything again. * Transformer block forward. * Two-ways transformer. * Position embeddings. * Sketch the prompt encoder. * More prompt-encoder. * More prompt-encoder. * Add the main sam module. * Embed the transformer. * And hook the transformer forward step. * Build the model. * Handle the global attn indexes. * Get the model to load.	2023-09-07 12:06:55 +01:00
Laurent Mazare	8c991df394	More segment-anything. (#763) * More segment-anything. * Split the model in multiple files. * Start adding the transformer. * Add the attention block. * Move the MLP Block.	2023-09-07 07:28:30 +01:00
Laurent Mazare	6527ab81a3	Sketch the segment anything model. (#759) * Sketch the segment anything model. * Fix some clippy lint. * Add the mask decoder.	2023-09-07 05:34:05 +01:00
Laurent Mazare	dcf708559d	Fix for cudnn to work with img2img. (#753)	2023-09-06 07:49:28 +01:00
Laurent Mazare	7299a68353	img2img pipeline for stable diffusion. (#752) * img2img pipeline for stable diffusion. * Rename the arguments + fix. * Fix for zero strength. * Another fix. * Another fix. * Revert. * Include the backtrace. * Noise scaling. * Fix the height/width.	2023-09-06 07:06:49 +01:00
Laurent Mazare	1c9e5394a5	Add a custom softmax implementation. (#744) * Add a custom softmax implementation. * Add softmaxlastdim to the benchmarks. * And add a test. * Support more dtypes. * Polish the code. * Use the slow implementation on cuda. * Add a todo for the cuda kernel.	2023-09-05 14:20:23 +01:00
Laurent Mazare	9c61b0fc9b	Proper log buckets for t5. (#727) * Proper log buckets for t5. * Properly pass the position bias.	2023-09-03 20:33:50 +01:00
Laurent Mazare	26cd266e65	Musicgen text embeddings. (#726) * Musicgen text embeddings. * Bugfix for layer norm. * Proper position bias. * Expose the weights.	2023-09-03 18:27:48 +01:00
Laurent Mazare	bbec527bb9	Fix the musicgen example. (#724) * Fix the musicgen example. * Retrieve the weights from the hub.	2023-09-03 14:50:39 +01:00
Laurent Mazare	2c1df6bba1	Add a repeat penality to the llama2-c command line example. (#713) * Add a repeat penality to the llama2-c command line example. * Another fix attempt.	2023-09-01 20:38:58 +01:00
Laurent Mazare	19042962d5	Whisper fix (#711) * Remove unnecessary file. * Whisper fix.	2023-09-01 20:04:07 +01:00
Laurent Mazare	7529531056	Add the optimizer trait. (#702)	2023-09-01 12:55:39 +01:00
Laurent Mazare	7cef35c84d	Tweak some quantized args (#692) * Print the args + change the default temp/repeat penalty. * Minor formatting tweak.	2023-08-31 17:25:21 +01:00
Laurent Mazare	7509c98970	Interactive mode for the quantized model. (#690)	2023-08-31 10:52:42 +01:00
Laurent Mazare	9874d843f1	Fix the accelerate build (#678) * Cosmetic changes. * Fix the accelerate build for tanh.	2023-08-30 18:31:14 +02:00
Laurent Mazare	7d753d3acd	Mnist training dropout (#677) * Use dropout in the mnist training. * Fix.	2023-08-30 16:41:01 +01:00
Laurent Mazare	618f4e4c78	Add some documentation. (#673) * Add some documentation. * Bump the crate version.	2023-08-30 11:54:00 +01:00
Laurent Mazare	a1a5ab8b0a	Neon optimized vecdot (#666) * Q5k vecdot. * Add the q3k vecdot. * Q2k vecdot. * Move the quantized model to its own file.	2023-08-29 22:28:46 +01:00
Laurent Mazare	2d3fcad267	Simplify usage of the pool functions. (#662) * Simplify usage of the pool functions. * Small tweak. * Attempt at using apply to simplify the convnet definition.	2023-08-29 19:12:16 +01:00
Laurent Mazare	b31d41e26a	Add a convnet training example. (#661) * Add a convnet example. * Dataset fix. * Randomize batches.	2023-08-29 18:23:01 +01:00
Laurent Mazare	a044907ffc	Dilated convolutions (#657) * Add the dilation parameter. * Restore the basic optimizer example. * Dilation support in cudnn. * Use the dilation parameter in the cpu backend. * More dilation support. * No support for dilation in transposed convolutions. * Add dilation to a test. * Remove a print. * Helper function.	2023-08-29 16:12:11 +01:00
Nicolas Patry	1aca6fa291	Upgrading hf-hub.	2023-08-29 14:18:54 +02:00
Nicolas Patry	14b4d456e8	Merge pull request #439 from huggingface/training_hub_dataset [Book] Add small error management + start training (with generic dataset inclusion).	2023-08-29 13:10:05 +02:00
Laurent Mazare	62ef494dc1	Use multiple transformer layer in the same cross-attn blocks. (#653) * Use multiple transformer layer in the same cross-attn blocks. * Make the context contiguous if required.	2023-08-29 11:13:43 +01:00
Laurent Mazare	33c23c19b6	Preliminary support for SDXL. (#647) * Preliminary support for SDXL. * More SDXL support. * More SDXL. * Use the proper clip config. * Querying for existing tensors. * More robust test.	2023-08-29 09:00:04 +01:00
Nicolas Patry	d726484a6d	Re-enable local dir for mnist.	2023-08-28 15:15:27 +02:00
Nicolas Patry	d7a273be51	Training: - Removed a lot of surface (SerializedFileReader ownership is really painful). - Moved example + vision to hf.co version. - Removed feature gate.	2023-08-28 15:15:01 +02:00
Laurent Mazare	26e1b40992	Repeat-penalty in the falcon example. (#634)	2023-08-28 08:13:40 +01:00
Laurent Mazare	72ebb12bca	Remove some dead-code annotations. (#629) * Remove some dead-code annotations. * More dead code removal. * One more. * CI fix.	2023-08-27 18:52:33 +01:00
Laurent Mazare	4c338b0cd9	VarBuilder cleanup (#627) * VarBuilder cleanup. * Implement the basic varbuilders. * Add the sharded code. * Proper support for tensor sharding.	2023-08-27 18:03:26 +01:00
Laurent Mazare	6e485f2deb	Add some optional repeat penalty. (#623) * Add some optional repeat penalty. * Add the missing files.	2023-08-27 10:48:45 +01:00
Nicolas Patry	aa67e5107d	Merge pull request #600 from huggingface/codellama_gpu_support Adding support for codellama in examples.	2023-08-25 18:25:26 +02:00
Nicolas Patry	c105550405	s/panic/bail/	2023-08-25 18:05:07 +02:00
Laurent Mazare	ca6c050b04	Cleanup the pose reporting code. (#605)	2023-08-25 16:49:21 +01:00
Laurent Mazare	0afbc435df	Add some configurable legend for yolo detection. (#603) * Add some configurable legend for yolo detection. * Clippyness.	2023-08-25 13:50:31 +01:00
Laurent Mazare	97909e5068	Move the yolo model bits in a separate file. (#602) * Move the yolo model bits in a separate file. * Improve the drawing. * Bugfix.	2023-08-25 12:47:55 +01:00
Laurent Mazare	8bc5fffa45	More support for pose estimation in yolo-v8. (#599) * More support for pose estimation in yolo-v8. * Support both object detection and pose-estimation in the yolo-v8 example.	2023-08-25 11:21:11 +01:00
Nicolas Patry	4826a4212e	Adding support for codellama in examples. Codellama requires bf16 for now (error to convert from bf16 to f16). Multiprocess demo not functional for it because flash-attn only supports f16 for now.	2023-08-25 09:56:11 +00:00
Laurent Mazare	c093b03d51	Generic implementation of vecdot for q80. (#596) * Generic implementation of vecdot for q80. * Add support for code-llama 7b. * Support more code-llama.	2023-08-25 09:04:05 +01:00


317 Commits