44420d8ae1
Add some llama-v2 variants. ( #545 )
2023-08-22 08:35:15 +01:00
f16bb97401
Use the yolo-v8 weights from the hub. ( #544 )
...
* Use the weights from the hub.
* Add to the readme.
2023-08-21 22:07:36 +01:00
3507e14c0c
Yolo v8 fixes ( #542 )
...
* Fixes for the yolo-v8 layout.
* Bugfixes.
* Another silly bugfix.
* Remove the hf-hub dependency.
* Remove the transformers dependency.
2023-08-21 21:05:40 +01:00
de50e66af1
Add yolo v8 as an example ( #541 )
...
* Sketching yolo-v8.
* Get the model to load.
* yolo-v8 forward pass.
* Complete(?) the forward pass.
* Fix some shape issues.
* Add the missing padding.
* Process the predictions.
2023-08-21 18:40:09 +01:00
cc2d6cf2e0
Improve the timestamps support in whisper ( #539 )
...
* Timestamp support for whisper.
* Properly display the timestamps.
* Bugfix for the timestamp units.
2023-08-21 12:26:59 +01:00
e3b71851e6
Retrieve the yolo-v3 weights from the hub. ( #537 )
2023-08-21 10:55:09 +01:00
4300864ce9
Add some optional repeat penalty. ( #535 )
2023-08-21 09:59:13 +01:00
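A minimal sketch of what an optional repeat penalty does, in the llama.cpp spirit: logits of recently generated tokens are nudged towards "less likely" before sampling. The function name and the exact scaling rule below are illustrative assumptions, not the code added in #535.

```rust
// Hypothetical repeat penalty applied to raw logits before sampling.
// Tokens already present in the recent context get their scores reduced.
fn apply_repeat_penalty(logits: &mut [f32], penalty: f32, recent_tokens: &[u32]) {
    for &token in recent_tokens {
        if let Some(logit) = logits.get_mut(token as usize) {
            // Positive logits are divided, negative ones multiplied, so the
            // adjustment always pushes the score towards "less likely".
            if *logit >= 0.0 {
                *logit /= penalty;
            } else {
                *logit *= penalty;
            }
        }
    }
}

fn main() {
    let mut logits = vec![2.0, -1.0, 0.5, 3.0];
    apply_repeat_penalty(&mut logits, 1.1, &[0, 3]);
    println!("{logits:?}"); // tokens 0 and 3 are now slightly less likely
}
```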
d70cffdab6
Fix the minimum/maximum gradient computations. ( #534 )
2023-08-21 08:28:41 +01:00
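For context on the gradient fix: the backward rule for an elementwise max routes the incoming gradient to whichever input held the larger value. The plain-Rust sketch below illustrates that rule (the tie-break towards the left operand is an assumption); it is not the candle kernel itself.

```rust
// Elementwise-max backward: the gradient flows to the winning input.
fn max_backward(a: &[f32], b: &[f32], grad_out: &[f32]) -> (Vec<f32>, Vec<f32>) {
    let mut ga = vec![0.0; a.len()];
    let mut gb = vec![0.0; b.len()];
    for i in 0..a.len() {
        if a[i] >= b[i] {
            ga[i] = grad_out[i];
        } else {
            gb[i] = grad_out[i];
        }
    }
    (ga, gb)
}

fn main() {
    let (ga, gb) = max_backward(&[1.0, 5.0], &[2.0, 3.0], &[1.0, 1.0]);
    assert_eq!((ga, gb), (vec![0.0, 1.0], vec![1.0, 0.0]));
}
```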
912561614f
Better handling of zero temperatures. ( #532 )
2023-08-21 07:51:46 +01:00
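Handling a zero temperature typically means falling back to greedy argmax rather than dividing the logits by zero. A hypothetical sketch of that branch (the actual candle sampling code is not reproduced here):

```rust
// With a zero (or near-zero) temperature, pick the top token greedily.
fn sample_token(logits: &[f32], temperature: f64) -> usize {
    if temperature <= 0.0 {
        logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(i, _)| i)
            .unwrap_or(0)
    } else {
        // Real code would softmax `logits / temperature` and sample from the
        // resulting distribution; elided in this sketch.
        unimplemented!("temperature sampling")
    }
}

fn main() {
    let logits = [0.1_f32, 2.3, -0.5];
    assert_eq!(sample_token(&logits, 0.0), 1);
}
```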
8c232d706b
Small tweaks to the pickle handling to be able to use libtorch files. ( #530 )
...
* Small tweaks to the pickle handling to be able to use libtorch files.
* Move the pytorch specific bits in a different function.
2023-08-20 23:25:34 +01:00
11c7e7bd67
Some fixes for yolo-v3. ( #529 )
...
* Some fixes for yolo-v3.
* Use the running stats for inference in the batch-norm layer.
* Get some proper predictions for yolo.
* Avoid the quadratic insertion.
2023-08-20 23:19:15 +01:00
a1812f934f
Add a yolo-v3 example. ( #528 )
...
* Add a couple functions required for yolo.
* Add the yolo-v3 example.
* Add minimum and maximum.
* Use the newly introduced maximum.
* Cuda support for min/max + add some testing.
* Allow for more tests to work with accelerate.
* Fix a typo.
2023-08-20 18:19:37 +01:00
e3d2786ffb
Add a couple functions required for yolo. ( #527 )
2023-08-20 17:02:05 +01:00
372f8912c5
Minor readme tweaks. ( #526 )
2023-08-20 14:33:21 +01:00
d2622a8160
Move the VarMap to a separate file ( #525 )
...
* Move the var-map struct in a separate file.
* Fix some typos.
2023-08-20 14:25:07 +01:00
2fcb386f17
Add a broadcast variant to matmul. ( #523 )
...
* Add a broadcast variant to matmul.
* Get the test to pass.
2023-08-20 13:20:42 +01:00
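A hedged usage sketch of a broadcasting matmul: the right-hand side is broadcast over the batch dimension of the left-hand side. The method name `broadcast_matmul` and the `randn`/`dims` signatures follow candle's public API as I understand it; treat them as assumptions for this exact revision.

```rust
use candle_core::{Device, Result, Tensor};

fn main() -> Result<()> {
    let dev = Device::Cpu;
    // A batch of 4 matrices multiplied by a single shared weight matrix:
    // (4, 2, 3) x (1, 3, 5) broadcasts the right-hand side over the batch dim.
    let lhs = Tensor::randn(0f32, 1., (4, 2, 3), &dev)?;
    let rhs = Tensor::randn(0f32, 1., (1, 3, 5), &dev)?;
    let out = lhs.broadcast_matmul(&rhs)?;
    assert_eq!(out.dims(), &[4, 2, 5]);
    Ok(())
}
```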
a8f61e66cc
Bump the crates version to 0.1.2. ( #522 )
2023-08-20 08:07:07 +01:00
aa207f2dd9
Print some per-step timings in stable-diffusion. ( #520 )
...
* Skeleton files for neon support of quantization.
* SIMD version for q4 vecdot.
* Also simdify the q6k multiplication.
* Add some timings to stable-diffusion.
2023-08-20 05:45:12 +01:00
82410995a2
Neon support for quantization. ( #519 )
...
* Skeleton files for neon support of quantization.
* SIMD version for q4 vecdot.
* Also simdify the q6k multiplication.
2023-08-19 22:07:29 +01:00
d73ca3d28e
Line up the llama.cpp implementation with the candle one. ( #518 )
...
* Separate the prompt stats from the post-prompt ones in the quantized example.
* Slightly nicer output printing.
* Line up with the llama.cpp implementation.
2023-08-19 20:12:07 +01:00
551409092e
Small tweaks to tensor-tools. ( #517 )
2023-08-19 16:50:26 +01:00
6431140250
Retrieve tensor data from PyTorch files. ( #516 )
2023-08-19 15:57:18 +01:00
607ffb9f1e
Retrieve more information from PyTorch checkpoints. ( #515 )
...
* Retrieve more information from PyTorch checkpoints.
* Add enough support to load dino-v2 backbone weights.
2023-08-19 15:05:34 +01:00
f861a9df6e
Add ggml support to tensor-tools ( #512 )
...
* Pickle work-in-progress.
* More unpickling.
* More pickling.
* Proper handling of setitems.
* Clippy.
* Again more pickling.
* Restore the example.
* Add enough pickle support to get the list of tensors.
* Read the data from zip files.
* Retrieve the tensor shape.
* Extract the size and dtype.
* More storage types.
* Improve the destructuring.
* Also support ggml files.
2023-08-19 11:45:22 +01:00
ad33715c61
Preliminary support for importing PyTorch weights. ( #511 )
...
* Pickle work-in-progress.
* More unpickling.
* More pickling.
* Proper handling of setitems.
* Clippy.
* Again more pickling.
* Restore the example.
* Add enough pickle support to get the list of tensors.
* Read the data from zip files.
* Retrieve the tensor shape.
* Extract the size and dtype.
* More storage types.
* Improve the destructuring.
2023-08-19 11:26:32 +01:00
90ff04e77e
Add the tensor-tools binary. ( #510 )
2023-08-19 09:06:44 +01:00
42e1cc8062
Add a batch normalization layer ( #508 )
...
* Add BatchNormalization.
* More batch-norm.
* Add some validation of the inputs.
* More validation.
2023-08-18 20:05:56 +01:00
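For reference, batch normalization at inference time uses the stored running statistics per channel (the same behaviour is referenced in the yolo-v3 fixes above). A plain-Rust sketch of that computation, independent of the candle layer:

```rust
// Batch-norm inference for one channel: normalize with the *running* mean and
// variance recorded during training, then apply the learned scale and shift.
fn batch_norm_inference(
    x: &[f32],
    running_mean: f32,
    running_var: f32,
    weight: f32, // gamma
    bias: f32,   // beta
    eps: f32,
) -> Vec<f32> {
    let inv_std = 1.0 / (running_var + eps).sqrt();
    x.iter()
        .map(|&v| (v - running_mean) * inv_std * weight + bias)
        .collect()
}

fn main() {
    let y = batch_norm_inference(&[1.0, 2.0, 3.0], 2.0, 1.0, 1.0, 0.0, 1e-5);
    println!("{y:?}"); // roughly [-1.0, 0.0, 1.0]
}
```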
b64e782c2d
Use the hub to retrieve dinov2 model weights. ( #507 )
2023-08-18 18:27:31 +01:00
e5dd5fd1b3
Print the recognized categories in dino-v2. ( #506 )
2023-08-18 17:32:58 +01:00
cb069d6063
Add the permute op (similar to pytorch). ( #504 )
...
* Add the permute op (similar to pytorch).
* Add the backprop for dimension permutation.
2023-08-18 16:30:53 +01:00
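A hedged usage sketch of the permute op: like PyTorch's `permute`, it reorders tensor dimensions. The `Tensor::zeros`/`permute` signatures below follow candle's public API as I recall it and should be treated as assumptions:

```rust
use candle_core::{DType, Device, Result, Tensor};

fn main() -> Result<()> {
    let t = Tensor::zeros((2, 3, 4), DType::F32, &Device::Cpu)?;
    // Reorder the dimensions: (2, 3, 4) -> (4, 2, 3).
    let p = t.permute((2, 0, 1))?;
    assert_eq!(p.dims(), &[4, 2, 3]);
    Ok(())
}
```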
4f1541526c
dinov2 - read images from disk and compute the class probabilities ( #503 )
...
* Load the image from disk and convert it to a tensor.
* Tweak the function name.
2023-08-18 15:50:33 +01:00
95462c6a2e
Add a vision transformer example (dino-v2). ( #502 )
...
* Add a vision transformer example (dino-v2).
* Add some documentation + test.
* CI fix.
* Another fix (still unable to replicate the errors locally :( )
2023-08-18 11:58:06 +01:00
b9661a1c25
Enable the image crate by default in examples ( #501 )
...
* Enable the image crate by default so that it's easier to compile the stable diffusion example.
* Also update the readme.
2023-08-18 10:00:05 +01:00
109e95b189
Basic qmatmul parallelization ( #492 )
...
* Basic `par_iter` parallelization
* Pass errors up
* Disable `avx` for x86 macs
2023-08-18 09:45:37 +01:00
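To illustrate the `par_iter` approach named in the commit body: each output element is computed on a Rayon worker and errors are propagated up instead of panicking. This is a hypothetical matrix-vector sketch (it assumes the `rayon` crate), not the actual qmatmul kernel:

```rust
use rayon::prelude::*;

// Each output row is computed in parallel; a mismatched row surfaces as an error.
fn matvec(rows: &[Vec<f32>], v: &[f32]) -> Result<Vec<f32>, String> {
    rows.par_iter()
        .map(|row| {
            if row.len() != v.len() {
                return Err("row/vector length mismatch".to_string());
            }
            let dot: f32 = row.iter().zip(v.iter()).map(|(a, b)| a * b).sum();
            Ok(dot)
        })
        .collect()
}

fn main() {
    let rows = vec![vec![1.0, 2.0], vec![3.0, 4.0]];
    println!("{:?}", matvec(&rows, &[1.0, 1.0])); // Ok([3.0, 7.0])
}
```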
c78ce76501
Add a simple Module trait and implement it for the various nn layers ( #500 )
...
* Start adding the module trait.
* Use the module trait.
* Implement module for qmatmul.
2023-08-18 09:38:22 +01:00
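Sketch of the idea behind a shared Module trait: every layer exposes a uniform `forward`, so code can be written generically over layers. The trait below mirrors the shape of candle_nn's trait but is a local approximation, not the actual definition:

```rust
use candle_core::{Device, Result, Tensor};

trait Module {
    fn forward(&self, xs: &Tensor) -> Result<Tensor>;
}

// A toy layer that just scales its input.
struct Scale(f64);

impl Module for Scale {
    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        xs.affine(self.0, 0.)
    }
}

// A generic helper that works with any layer implementing the trait.
fn run<M: Module>(layer: &M, xs: &Tensor) -> Result<Tensor> {
    layer.forward(xs)
}

fn main() -> Result<()> {
    let xs = Tensor::new(&[1f32, 2., 3.], &Device::Cpu)?;
    let ys = run(&Scale(2.0), &xs)?;
    println!("{ys}");
    Ok(())
}
```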
13401df4d1
Add an abstract type for RmsNorm. ( #499 )
2023-08-18 08:52:14 +01:00
a22b1bed7b
Tensor -> QTensor conversion ( #496 )
...
* Sketch some qmatmul test.
* Add the quantization function.
* More testing.
* Make the test smaller and faster.
* Add some shape checking.
2023-08-18 08:19:20 +01:00
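For intuition about the Tensor -> QTensor conversion: ggml-style formats group values into blocks and store a per-block scale plus small integers. The plain-Rust sketch below shows an 8-bit variant of that idea; the real Q4/Q6K formats pack bits more aggressively and are not reproduced here.

```rust
// Illustrative 8-bit block quantization (in the spirit of ggml's Q8_0).
const BLOCK: usize = 32;

fn quantize_block(xs: &[f32]) -> (f32, Vec<i8>) {
    // One scale per block, chosen so the largest magnitude maps to 127.
    let amax = xs.iter().fold(0f32, |m, &v| m.max(v.abs()));
    let scale = if amax == 0.0 { 1.0 } else { amax / 127.0 };
    let qs = xs.iter().map(|&v| (v / scale).round() as i8).collect();
    (scale, qs)
}

fn dequantize_block(scale: f32, qs: &[i8]) -> Vec<f32> {
    qs.iter().map(|&q| q as f32 * scale).collect()
}

fn main() {
    let xs: Vec<f32> = (0..BLOCK).map(|i| (i as f32) / 10.0).collect();
    let (scale, qs) = quantize_block(&xs);
    let ys = dequantize_block(scale, &qs);
    // The round-trip error stays within one quantization step.
    assert!(xs.iter().zip(&ys).all(|(a, b)| (a - b).abs() <= scale));
}
```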
26fd37b348
Use the main branch of the HF repo where possible. ( #498 )
...
* Use the main branch of the HF repo where possible.
* And add the large model.
2023-08-18 08:18:30 +01:00
f056dcab21
Add medium model ( #497 )
2023-08-18 08:08:59 +01:00
557b2c28dd
Q6K quantization ( #495 )
...
* Print the detected arch options.
* Add the q6k quantization.
* Add a currently broken test.
* Bugfix.
* Bugfix.
* Another bugfix.
* Another bugfix + get the test to work.
2023-08-17 22:22:57 +01:00
fc81af1712
AVX version of the q6k vec-dot. ( #493 )
...
* AVX version of the q6k vec-dot.
* Use the avx sum.
2023-08-17 20:13:18 +01:00
3164cd24fa
Replicate the sot-token logic from the Python implementation more accurately ( #491 )
...
* Replicate the sot-token logic from the Python implementation more accurately.
* Add a flag to control the timestamp mode.
2023-08-17 16:59:36 +01:00
5f30c1e1e0
Add the whisper small model. ( #490 )
2023-08-17 15:48:34 +01:00
ad7c53953b
Add a verbose-prompt mode, similar to llama.cpp. ( #489 )
2023-08-17 15:26:44 +01:00
5d99026fd2
F16 support for stable diffusion ( #488 )
...
* F16 support for stable diffusion.
* Keep the attention bits in F32.
* Keep more of the attention bits in F32.
* More mixed precision support.
2023-08-17 13:48:56 +01:00
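The "keep the attention bits in F32" part follows a common mixed-precision pattern: run the model in F16 but upcast the numerically sensitive softmax to F32, then cast back. A hedged sketch assuming candle's `to_dtype` and `candle_nn::ops::softmax` (the actual stable-diffusion code is not reproduced):

```rust
use candle_core::{DType, Result, Tensor, D};

// Compute the softmax of attention scores in F32 and return the result in the
// original (possibly F16) dtype.
fn softmax_in_f32(scores: &Tensor) -> Result<Tensor> {
    let original = scores.dtype();
    let probs = candle_nn::ops::softmax(&scores.to_dtype(DType::F32)?, D::Minus1)?;
    probs.to_dtype(original)
}
```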
c3176f0dfb
Flash-attention support in stable diffusion ( #487 )
...
* Add flash-attention for the stable-diffusion example.
* Change the dtype.
* Silly fix.
* Another fix.
* Revert the dtype back to the query dtype after applying flash-attn.
2023-08-17 12:16:40 +01:00
03be33eea4
Relax the requirements on CustomOp. ( #486 )
...
* Relax the requirements on CustomOp.
* Simplify the custom-ops when no backward is required.
2023-08-17 11:12:05 +01:00
d32e8199cd
Layer norm tweaks ( #482 )
...
* Add some options to make layer-norm more configurable.
* Add the rms-norm variant.
* Replace the RmsNorm with the shared bits.
2023-08-17 10:07:13 +01:00
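On the layer-norm/rms-norm relationship: rms-norm is layer-norm without the mean subtraction, which is why a single configurable implementation can cover both variants. A plain-Rust sketch of the shared computation (the affine weight/bias is omitted):

```rust
// One implementation, one switch: remove the mean (layer-norm) or not (rms-norm).
fn norm(xs: &[f32], remove_mean: bool, eps: f32) -> Vec<f32> {
    let mean = if remove_mean {
        xs.iter().sum::<f32>() / xs.len() as f32
    } else {
        0.0
    };
    let var = xs.iter().map(|&v| (v - mean) * (v - mean)).sum::<f32>() / xs.len() as f32;
    let inv = 1.0 / (var + eps).sqrt();
    xs.iter().map(|&v| (v - mean) * inv).collect()
}

fn main() {
    let xs = [1.0, 2.0, 3.0, 4.0];
    println!("layer-norm: {:?}", norm(&xs, true, 1e-5));
    println!("rms-norm:   {:?}", norm(&xs, false, 1e-5));
}
```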
d99cac3ec3
Move the avx specific bits to a separate file. ( #481 )
2023-08-17 09:01:06 +01:00
f708efb19c
Add some accelerate details on the readme. ( #480 )
2023-08-17 08:26:02 +01:00