a910ec5993
CustomOp for einsum.
2023-09-08 20:46:30 +01:00
acf8f10ae1
Get the comparison operation to work on scalar values. ( #780 )
* Get the comparison operation to work on scalar values.
* Add some time measurement.
2023-09-08 20:13:29 +01:00
0906acab91
Automatic mask generation ( #779 )
* A few more contiguous fixes for cuda.
* Mask generation.
* Generic bbox.
* Generate all the masks.
2023-09-08 19:11:34 +01:00
158ff3c609
Add tracing to segment-anything ( #777 )
* Tracing support for segment-anything.
* More tracing.
* Handle the empty slice case.
2023-09-08 15:31:29 +01:00
e5703d2f56
Draw the mask on a merged image. ( #775 )
* Draw the mask on a merged image.
* Clippy fix.
* Enable the target point by default.
* Add to the readme.
2023-09-08 14:04:34 +01:00
98172d46fa
Fix some errors about BlockQ8_1 ( #776 )
* Use int8 instead of uint8 for BlockQ8_1.qs
The uint8 type for BlockQ8_1.qs caused a large precision loss for negative weights
Ref: ebc96086af/ggml.c (L904)
Signed-off-by: Zhang Miaolei <zmlcc@outlook.com>
* fix sum error in vec_dot of BlockQ4_1
Ref: ebc96086af/ggml.c (L2840)
Signed-off-by: Zhang Miaolei <zmlcc@outlook.com>
* fix sum error in vec_dot of BlockQ5_1
Ref: ebc96086af/ggml.c (L3490)
Signed-off-by: Zhang Miaolei <zmlcc@outlook.com>
---------
Signed-off-by: Zhang Miaolei <zmlcc@outlook.com>
2023-09-08 13:29:40 +01:00
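The BlockQ8_1 fix above hinges on a sign issue: quantized weights are centered around zero, so storing them in a uint8 field silently reinterprets every negative value as a large positive one. A minimal plain-Rust sketch of the wrap-around (not the ggml/candle code itself):

```rust
fn main() {
    // A quantized weight of -3 stored as i8 keeps its sign...
    let q: i8 = -3;
    // ...but reinterpreted as u8 it wraps to 253, a large positive weight.
    let as_unsigned = q as u8;
    assert_eq!(as_unsigned, 253);
    // Casting back through i8 recovers the original negative value.
    assert_eq!(as_unsigned as i8, -3);
}
```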
28c87f6a34
Automatic mask generator + point base mask ( #773 )
* Add more to the automatic mask generator.
* Add the target point.
* Fix.
* Remove the allow-unused.
* Mask post-processing.
2023-09-08 12:26:56 +01:00
c1453f00b1
Improve the safetensor loading in the segment-anything example. ( #772 )
* Improve the safetensor loading in the segment-anything example.
* Properly handle the labels when embedding the point prompts.
2023-09-08 09:39:10 +01:00
989a4807b1
Use shape with holes. ( #771 )
2023-09-08 08:50:27 +01:00
0e250aee4f
Shape with holes ( #770 )
* Shape with holes.
* rustfmt.
2023-09-08 08:38:13 +01:00
cfcbec9fc7
Add small customization to the build ( #768 )
* Add ability to override the compiler used by NVCC from an environment variable
* Allow relative paths in CANDLE_FLASH_ATTN_BUILD_DIR
* Add the compilation failure to the readme, with a possible solution
* Adjust the error message, and remove the special handling of the relative paths
2023-09-08 08:15:14 +01:00
3898e500de
Generate a mask image + the scaled input image. ( #769 )
* Also round-trip the original image.
* Make it possible to use a safetensors input.
2023-09-08 05:53:08 +01:00
79c27fc489
Segment-anything fixes: avoid normalizing twice. ( #767 )
* Segment-anything fixes: avoid normalizing twice.
* More fixes for the image aspect ratio.
2023-09-07 21:45:16 +01:00
7396b8ed1a
Segment Anything - process images ( #766 )
* Start processing images.
* Add LayerNorm2d.
* Properly use LayerNorm2d.
* Tweak eps.
* Use LayerNorm on inputs with a rank different from 3.
* Window partitioning.
* Fix a couple todos.
* More todos.
* Hard-code the einsums.
* More padding support.
* Some sizes tweaks.
* Use the hub to get the weights.
* Use a batch matmul.
* Tweaks.
* More fixes.
* Get some predictions to be generated.
2023-09-07 19:22:45 +01:00
7b50f3e106
More segment-anything again. ( #764 )
* More segment-anything again.
* Transformer block forward.
* Two-ways transformer.
* Position embeddings.
* Sketch the prompt encoder.
* More prompt-encoder.
* More prompt-encoder.
* Add the main sam module.
* Embed the transformer.
* And hook the transformer forward step.
* Build the model.
* Handle the global attn indexes.
* Get the model to load.
2023-09-07 12:06:55 +01:00
8c991df394
More segment-anything. ( #763 )
* More segment-anything.
* Split the model in multiple files.
* Start adding the transformer.
* Add the attention block.
* Move the MLP Block.
2023-09-07 07:28:30 +01:00
000fa00e31
Expose the conv2d-transpose layers. ( #761 )
2023-09-07 06:04:52 +01:00
a17a7c42c1
Add a nn layer for conv-transpose2d. ( #760 )
2023-09-07 05:47:28 +01:00
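For the conv-transpose2d layer, the spatial output size follows the standard transposed-convolution arithmetic. A hedged helper illustrating that formula (names are illustrative, not candle's API; the layer computes this internally):

```rust
// Output size of a transposed 2d convolution along one spatial axis:
// out = (in - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1
fn conv_transpose2d_out_dim(
    in_dim: usize,
    kernel: usize,
    stride: usize,
    padding: usize,
    output_padding: usize,
    dilation: usize,
) -> usize {
    (in_dim - 1) * stride + dilation * (kernel - 1) + output_padding + 1 - 2 * padding
}

fn main() {
    // A stride-2 transposed conv with k=3, padding=1, output_padding=1
    // doubles the spatial size: 4 -> 8.
    assert_eq!(conv_transpose2d_out_dim(4, 3, 2, 1, 1, 1), 8);
}
```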
6527ab81a3
Sketch the segment anything model. ( #759 )
* Sketch the segment anything model.
* Fix some clippy lint.
* Add the mask decoder.
2023-09-07 05:34:05 +01:00
7b1f2da828
Cudnn fix. ( #758 )
2023-09-06 17:39:39 +01:00
bdc9d46fe3
Use an arc in the varbuilder rather than rc. ( #757 )
* Use an arc in the varbuilder rather than rc.
* Require the backends to be send.
* Request send and sync.
2023-09-06 15:29:09 +01:00
dcf708559d
Fix for cudnn to work with img2img. ( #753 )
2023-09-06 07:49:28 +01:00
7299a68353
img2img pipeline for stable diffusion. ( #752 )
* img2img pipeline for stable diffusion.
* Rename the arguments + fix.
* Fix for zero strength.
* Another fix.
* Another fix.
* Revert.
* Include the backtrace.
* Noise scaling.
* Fix the height/width.
2023-09-06 07:06:49 +01:00
16bf44f6e9
Force model cache. ( #751 )
2023-09-06 05:53:31 +02:00
a4f40f3dc8
Use rayon directly rather than constraining the number of threads. ( #749 )
2023-09-05 20:26:15 +01:00
6a40decc76
Minor WASM UI improvements ( #748 )
* add stats
* random seed button
* minor UI improvements
2023-09-05 19:24:43 +01:00
a0d65585db
Softmax implementation for cuda. ( #747 )
2023-09-05 18:38:03 +01:00
94c6a8d3d3
Add a dedicated cuda kernel for softmax. ( #746 )
2023-09-05 17:53:20 +02:00
6615daf242
Tweaks to softmax. ( #745 )
2023-09-05 15:22:27 +01:00
1c9e5394a5
Add a custom softmax implementation. ( #744 )
* Add a custom softmax implementation.
* Add softmaxlastdim to the benchmarks.
* And add a test.
* Support more dtypes.
* Polish the code.
* Use the slow implementation on cuda.
* Add a todo for the cuda kernel.
2023-09-05 14:20:23 +01:00
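The custom softmax above operates on the last dimension. A plain-Rust sketch of the usual numerically stable formulation for one row (the per-row math only, not candle's dtype-generic or CUDA code): subtract the row max before exponentiating so large logits cannot overflow, then normalize.

```rust
// Numerically stable softmax over a single row (the last dim of a tensor).
fn softmax_last_dim(row: &[f32]) -> Vec<f32> {
    // Subtracting the max leaves the result unchanged but keeps exp() bounded.
    let max = row.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = row.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    let probs = softmax_last_dim(&[1.0, 2.0, 3.0]);
    // Probabilities sum to 1 and are monotone in the logits.
    let total: f32 = probs.iter().sum();
    assert!((total - 1.0).abs() < 1e-6);
    assert!(probs[0] < probs[1] && probs[1] < probs[2]);
}
```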
a8410bf35e
Add some documentation. ( #743 )
2023-09-05 09:51:12 +01:00
cda45a7443
Let outside CustomOp2 implementations use binary_map/binary_map_vec ( #741 )
2023-09-05 09:27:32 +01:00
4698eb5cb6
Fix typo in the nll function document ( #742 )
2023-09-05 09:25:11 +01:00
000487c36f
Add a python function to save as safetensors. ( #740 )
2023-09-04 20:32:14 +01:00
ab0d9fbdd1
Properly set the is_bf16 flag. ( #738 )
2023-09-04 16:45:26 +01:00
f80fd44201
BF16 support for flash-attn. ( #737 )
2023-09-04 16:35:43 +01:00
0d00c06a83
Fix clippy lint. ( #736 )
2023-09-04 16:09:19 +01:00
8395152d20
Llama2c WASM UI improvements ( #732 )
* pass seed, expose model seq_len
* wip new llama2.c ui
* final new UI example
* small copy fix
* copy
2023-09-04 15:59:22 +01:00
e2f9f60ac2
Avoid some redundant clone. ( #731 )
2023-09-04 09:18:32 +02:00
d0cdea95a5
Add back the bf16 flash-attn kernels. ( #730 )
2023-09-04 07:50:52 +01:00
20512ba408
Return the metadata in the gguf pyo3 bindings. ( #729 )
* Return the metadata in the gguf pyo3 bindings.
* Read the metadata in the quantized llama example.
* Get inference to work on gguf files.
2023-09-04 07:07:00 +01:00
9c61b0fc9b
Proper log buckets for t5. ( #727 )
* Proper log buckets for t5.
* Properly pass the position bias.
2023-09-03 20:33:50 +01:00
26cd266e65
Musicgen text embeddings. ( #726 )
* Musicgen text embeddings.
* Bugfix for layer norm.
* Proper position bias.
* Expose the weights.
2023-09-03 18:27:48 +01:00
bbec527bb9
Fix the musicgen example. ( #724 )
* Fix the musicgen example.
* Retrieve the weights from the hub.
2023-09-03 14:50:39 +01:00
f7980e07e0
Add ggufv2 support ( #725 )
2023-09-03 14:41:57 +01:00
74a82c358a
Add the mse loss. ( #723 )
2023-09-03 10:51:40 +01:00
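The mse loss added here is the mean of squared differences between predictions and targets. A scalar sketch of that definition in plain Rust (in candle the loss operates on tensors; this only shows the arithmetic):

```rust
// Mean squared error over two equal-length slices.
fn mse(pred: &[f32], target: &[f32]) -> f32 {
    assert_eq!(pred.len(), target.len());
    let sum: f32 = pred
        .iter()
        .zip(target)
        .map(|(p, t)| (p - t) * (p - t))
        .sum();
    sum / pred.len() as f32
}

fn main() {
    // ((1 - 0)^2 + (2 - 2)^2) / 2 = 0.5
    assert!((mse(&[1.0, 2.0], &[0.0, 2.0]) - 0.5).abs() < 1e-6);
}
```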
84d003ff53
Handle arbitrary shapes in Tensor::new. ( #718 )
2023-09-02 19:59:21 +01:00
21109e1983
Recommend using maturin. ( #717 )
2023-09-02 16:19:35 +01:00
ad796eb4be
More quantized llama in python. ( #716 )
* More quantized llama in python.
* Expose a couple more functions.
* Apply the last layer.
* Use the vocab from the ggml files.
2023-09-02 13:41:48 +01:00
e8e33752f4
Sketch a quantized llama using the pyo3 api. ( #715 )
* Sketch a quantized llama using the pyo3 api.
* Add more ops.
* Expose a few more functions to use in the quantized model.
* Rope embeddings.
* Get the forward pass to work.
2023-09-02 11:26:05 +01:00