candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 18:48:51 +00:00

Author	SHA1	Message	Date
Laurent Mazare	6e485f2deb	Add some optional repeat penalty. (#623 ) * Add some optional repeat penalty. * Add the missing files.	2023-08-27 10:48:45 +01:00
Nicolas Patry	aa67e5107d	Merge pull request #600 from huggingface/codellama_gpu_support Adding support for codellama in examples.	2023-08-25 18:25:26 +02:00
Nicolas Patry	c105550405	s/panic/bail/	2023-08-25 18:05:07 +02:00
Laurent Mazare	ca6c050b04	Cleanup the pose reporting code. (#605 )	2023-08-25 16:49:21 +01:00
Laurent Mazare	0afbc435df	Add some configurable legend for yolo detection. (#603 ) * Add some configurable legend for yolo detection. * Clippyness.	2023-08-25 13:50:31 +01:00
Laurent Mazare	97909e5068	Move the yolo model bits in a separate file. (#602 ) * Move the yolo model bits in a separate file. * Improve the drawing. * Bugfix.	2023-08-25 12:47:55 +01:00
Laurent Mazare	8bc5fffa45	More support for pose estimation in yolo-v8. (#599 ) * More support for pose estimation in yolo-v8. * Support both object detection and pose-estimation in the yolo-v8 example.	2023-08-25 11:21:11 +01:00
Nicolas Patry	4826a4212e	Adding support for codellama in examples. Codellama requires bf16 for now (error to convert from bf16 to f16). Multiprocess demo not functional for it because flash-attn only supports f16 for now.	2023-08-25 09:56:11 +00:00
Laurent Mazare	c093b03d51	Generic implementation of vecdot for q80. (#596 ) * Generic implementation of vecdot for q80. * Add support for code-llama 7b. * Support more code-llama.	2023-08-25 09:04:05 +01:00
Laurent Mazare	189442a0fa	Add the pose estimation head for yolo. (#589 ) * Add the pose estimation head for yolo. * Properly handle the added position dimensions. * Integrate the pose estimation head in the forward pass. * Renaming. * Fix for pose estimation.	2023-08-24 22:12:34 +01:00
Laurent Mazare	79916c2edb	Use the hub weights for efficientnet. (#573 )	2023-08-23 18:20:21 +01:00
Laurent Mazare	431051cc32	Add Efficientnet (#572 ) * EfficientNet. * Complete the efficientnet implementation. * Improve group handling. * Get the efficientnet to work.	2023-08-23 18:02:58 +01:00
Laurent Mazare	eedd85ffa7	Move the imagenet specific bits to a separate file. (#571 )	2023-08-23 16:42:09 +01:00
Laurent Mazare	329f661d9b	Trace softmax (#568 ) * Trace the softmax op. * Inline the sum. * Add min/max vec operations.	2023-08-23 15:25:50 +01:00
Laurent Mazare	aba1e90797	Add some group parameter to convolutions. (#566 ) * Add some group parameter to convolutions. * Avoid some unnecessary groups checks. * Move the tensor convolution bits. * Properh handling of groups. * Bump the crate version. * And add a changelog.	2023-08-23 12:58:55 +01:00
Laurent Mazare	4ee1cf038a	Get the rms epsilon from GGUF. (#565 )	2023-08-23 11:40:20 +01:00
Laurent Mazare	0f4ff8a739	Fix the quantized example. (#564 )	2023-08-23 11:09:55 +01:00
cksac	89a00b56cc	add chat models in quantized example (#551 ) * add chat models in quantized example * cargo fmt	2023-08-23 11:05:33 +01:00
Laurent Mazare	508d34daf2	GGUF support in the quantized model. (#559 ) * GGUF support in the quantized model. * Get the GGUF support to work on llama.	2023-08-23 09:20:57 +01:00
Laurent Mazare	f9ecc84477	GQA support in the quantized model. (#555 ) * GQA support in the quantized model. * Fix the reshaping. * Fix the main llama model. * Infer the proper gqa from the model kind.	2023-08-22 19:41:10 +01:00
Laurent Mazare	cc22d4db20	Put the transcribe token before the language one. (#553 )	2023-08-22 16:46:34 +01:00
Laurent Mazare	9bc811a247	Improve the aspect ratio handling on yolo-v8. (#549 ) * Fix the aspect ratio handling in yolo-v8. * Typo.	2023-08-22 14:55:33 +01:00
Laurent Mazare	bb69d89e28	Move the yolo shared bits to a common place. (#548 ) * Move the yolo shared bits to a common place. * Share more code. * Configurable thresholds.	2023-08-22 13:03:07 +01:00
Laurent Mazare	20ce3e9f39	Sketch the yolo wasm example. (#546 ) * Sketch the yolo wasm example. * Web ui. * Get the web ui to work. * UI tweaks. * More UI tweaks. * Use the natural width/height. * Add a link to the hf space in the readme.	2023-08-22 11:56:43 +01:00
Laurent Mazare	44420d8ae1	Add some llama-v2 variants. (#545 )	2023-08-22 08:35:15 +01:00
Laurent Mazare	f16bb97401	Use the yolo-v8 weights from the hub. (#544 ) * Use the weights from the hub. * Add to the readme.	2023-08-21 22:07:36 +01:00
Laurent Mazare	3507e14c0c	Yolo v8 fixes (#542 ) * Fixes for the yolo-v8 layout. * Bugfixes. * Another silly bugfix. * Remove the hf-hub dependency. * Remove the transformers dependency.	2023-08-21 21:05:40 +01:00
Laurent Mazare	de50e66af1	Add yolo v8 as an example (#541 ) * Sketching yolo-v8. * Get the model to load. * yolo-v8 forward pass. * Complete(?) the forward pass. * Fix some shape issues. * Add the missing padding. * Process the predictions.	2023-08-21 18:40:09 +01:00
Laurent Mazare	cc2d6cf2e0	Improve the timestamps support in whisper (#539 ) * Timestamp support for whisper. * Properly display the timestamps. * Bugfix for the timestamp units.	2023-08-21 12:26:59 +01:00
Laurent Mazare	e3b71851e6	Retrieve the yolo-v3 weights from the hub. (#537 )	2023-08-21 10:55:09 +01:00
Laurent Mazare	4300864ce9	Add some optional repeat penalty. (#535 )	2023-08-21 09:59:13 +01:00
Laurent Mazare	11c7e7bd67	Some fixes for yolo-v3. (#529 ) * Some fixes for yolo-v3. * Use the running stats for inference in the batch-norm layer. * Get some proper predictions for yolo. * Avoid the quadratic insertion.	2023-08-20 23:19:15 +01:00
Laurent Mazare	a1812f934f	Add a yolo-v3 example. (#528 ) * Add a couple functions required for yolo. * Add the yolo-v3 example. * Add minimum and maximum. * Use the newly introduced maximum. * Cuda support for min/max + add some testing. * Allow for more tests to work with accelerate. * Fix a typo.	2023-08-20 18:19:37 +01:00
Laurent Mazare	a8f61e66cc	Bump the crates version to 0.1.2. (#522 )	2023-08-20 08:07:07 +01:00
Laurent Mazare	aa207f2dd9	Print some per-step timings in stable-diffusion. (#520 ) * Skeleton files for neon support of quantization. * SIMD version for q4 vecdot. * Also simdify the q6k multiplication. * Add some timings to stable-diffusion.	2023-08-20 05:45:12 +01:00
Laurent Mazare	d73ca3d28e	Line up the llama.cpp implementation with the candle one. (#518 ) * Separate the prompt stats from the post-prompt ones in the quantized example. * Slightly nicer output printing. * Line up with the llama.cpp implementation.	2023-08-19 20:12:07 +01:00
Laurent Mazare	b64e782c2d	Use the hub to retrieve dinov2 model weights. (#507 )	2023-08-18 18:27:31 +01:00
Laurent Mazare	e5dd5fd1b3	Print the recognized categories in dino-v2. (#506 )	2023-08-18 17:32:58 +01:00
Laurent Mazare	cb069d6063	Add the permute op (similar to pytorch). (#504 ) * Add the permute op (similar to pytorch). * Add the backprop for dimension permutation.	2023-08-18 16:30:53 +01:00
Laurent Mazare	4f1541526c	dinov2 - read images from disk and compute the class probabilities (#503 ) * Load the image from disk and convert it to a tensor. * Tweak the function name.	2023-08-18 15:50:33 +01:00
Laurent Mazare	95462c6a2e	Add a vision transformer example (dino-v2). (#502 ) * Add a vision transformer example (dino-v2). * Add some documentation + test. * CI fix. * Another fix (still unable to replicate the errors locally :( )	2023-08-18 11:58:06 +01:00
Laurent Mazare	b9661a1c25	Enable the image crate by default in examples (#501 ) * Enable the image crate by default so that it's easier to compile the stable diffusion example. * Also update the readme.	2023-08-18 10:00:05 +01:00
Laurent Mazare	c78ce76501	Add a simple Module trait and implement it for the various nn layers (#500 ) * Start adding the module trait. * Use the module trait. * Implement module for qmatmul.	2023-08-18 09:38:22 +01:00
Laurent Mazare	13401df4d1	Add an abstract type for RmsNorm. (#499 )	2023-08-18 08:52:14 +01:00
Laurent Mazare	26fd37b348	Use the main branch of the HF repo where possible. (#498 ) * Use the main branch of the HF repo where possible. * And add the large model.	2023-08-18 08:18:30 +01:00
Franco Lucchini	f056dcab21	Add medium model (#497 )	2023-08-18 08:08:59 +01:00
Laurent Mazare	557b2c28dd	Q6K quantization (#495 ) * Print the detected arch options. * Add the q6k quantization. * Add a currently broken test. * Bugfix. * Bugfix. * Another bugfix. * Another bugfix + get the test to work.	2023-08-17 22:22:57 +01:00
Laurent Mazare	3164cd24fa	Replicate the sot-token logic from the Python implementation more acc… (#491 ) * Replicate the sot-token logic from the Python implementation more accurately. * Add a flag to control the timestamp mode.	2023-08-17 16:59:36 +01:00
Laurent Mazare	5f30c1e1e0	Add the whisper small model. (#490 )	2023-08-17 15:48:34 +01:00
Laurent Mazare	ad7c53953b	Add a verbose-prompt mode, similar to llama.cpp. (#489 )	2023-08-17 15:26:44 +01:00

... 9 10 11 12 13 ...

795 Commits