Commit Graph

  • 9a5c7db91a Add support for i64 (#563) Laurent Mazare 2023-08-23 10:42:19 +01:00
  • 649202024c fix code snippets Patrick von Platen 2023-08-23 09:05:07 +00:00
  • 283f6c048d fix code snippets Patrick von Platen 2023-08-23 09:04:36 +00:00
  • c8211fc474 fix code snippets Patrick von Platen 2023-08-23 09:04:08 +00:00
  • 7732bf6238 correct Patrick von Platen 2023-08-23 08:54:48 +00:00
  • 7c0ca80d3a move installation to book Patrick von Platen 2023-08-23 08:52:53 +00:00
  • b558d08b85 improve Patrick von Platen 2023-08-23 08:42:47 +00:00
  • 34cb9f924f improve Patrick von Platen 2023-08-23 08:40:23 +00:00
  • d4968295a0 improve Patrick von Platen 2023-08-23 08:37:08 +00:00
  • 65e146c72d Add installation section Patrick von Platen 2023-08-23 08:32:59 +00:00
  • 3743bed2d7 Fix the ? operator cannot be applied to type Device of example (#560) Patrick von Platen 2023-08-23 10:29:50 +02:00
  • 508d34daf2 GGUF support in the quantized model. (#559) Laurent Mazare 2023-08-23 09:20:57 +01:00
  • 0764741cc4 Handle GGUF files in tensor-tools. (#558) Laurent Mazare 2023-08-23 06:32:07 +01:00
  • 6a30ecefad Preliminary GGUF support. (#557) Laurent Mazare 2023-08-23 00:14:10 +01:00
  • 7687a0f453 Also fix the aspect ratio in the wasm example. (#556) Laurent Mazare 2023-08-22 22:20:08 +01:00
  • f9ecc84477 GQA support in the quantized model. (#555) Laurent Mazare 2023-08-22 19:41:10 +01:00
  • 07067b01dc Avoid some mutable variables (take 2). (#554) Laurent Mazare 2023-08-22 18:51:20 +01:00
  • cc22d4db20 Put the transcribe token before the language one. (#553) Laurent Mazare 2023-08-22 16:46:34 +01:00
  • ec665acad7 Revert "Avoid some mut in quantized functions. (#550)" (#552) Laurent Mazare 2023-08-22 15:57:46 +01:00
  • cf27b9b636 Avoid some mut in quantized functions. (#550) Laurent Mazare 2023-08-22 15:44:26 +01:00
  • 352383cbc3 Add quantization support for q2k, q3k, q4k and q5k (#524) Lukas Kreussel 2023-08-22 16:04:55 +02:00
  • 9bc811a247 Improve the aspect ratio handling on yolo-v8. (#549) Laurent Mazare 2023-08-22 14:55:33 +01:00
  • bb69d89e28 Move the yolo shared bits to a common place. (#548) Laurent Mazare 2023-08-22 13:03:07 +01:00
  • 20ce3e9f39 Sketch the yolo wasm example. (#546) Laurent Mazare 2023-08-22 11:56:43 +01:00
  • 44420d8ae1 Add some llama-v2 variants. (#545) Laurent Mazare 2023-08-22 08:35:15 +01:00
  • f16bb97401 Use the yolo-v8 weights from the hub. (#544) Laurent Mazare 2023-08-21 22:07:36 +01:00
  • 3507e14c0c Yolo v8 fixes (#542) Laurent Mazare 2023-08-21 21:05:40 +01:00
  • de50e66af1 Add yolo v8 as an example (#541) Laurent Mazare 2023-08-21 18:40:09 +01:00
  • cc2d6cf2e0 Improve the timestamps support in whisper (#539) Laurent Mazare 2023-08-21 12:26:59 +01:00
  • e3b71851e6 Retrieve the yolo-v3 weights from the hub. (#537) Laurent Mazare 2023-08-21 10:55:09 +01:00
  • 4300864ce9 Add some optional repeat penalty. (#535) Laurent Mazare 2023-08-21 09:59:13 +01:00
  • d70cffdab6 Fix the minimum/maximum gradient computations. (#534) Laurent Mazare 2023-08-21 08:28:41 +01:00
  • 912561614f Better handling of zero temperatures. (#532) Laurent Mazare 2023-08-21 07:51:46 +01:00
  • 8c232d706b Small tweaks to the pickle handling to be able to use libtorch files. (#530) Laurent Mazare 2023-08-20 23:25:34 +01:00
  • 11c7e7bd67 Some fixes for yolo-v3. (#529) Laurent Mazare 2023-08-20 23:19:15 +01:00
  • a1812f934f Add a yolo-v3 example. (#528) Laurent Mazare 2023-08-20 18:19:37 +01:00
  • e3d2786ffb Add a couple functions required for yolo. (#527) Laurent Mazare 2023-08-20 17:02:05 +01:00
  • 372f8912c5 Minor readme tweaks. (#526) Laurent Mazare 2023-08-20 14:33:21 +01:00
  • d2622a8160 Move the VarMap to a separate file (#525) Laurent Mazare 2023-08-20 14:25:07 +01:00
  • 2fcb386f17 Add a broadcast variant to matmul. (#523) Laurent Mazare 2023-08-20 13:20:42 +01:00
  • a8f61e66cc Bump the crates version to 0.1.2. (#522) Laurent Mazare 2023-08-20 08:07:07 +01:00
  • aa207f2dd9 Print some per-step timings in stable-diffusion. (#520) Laurent Mazare 2023-08-20 05:45:12 +01:00
  • 82410995a2 Neon support for quantization. (#519) Laurent Mazare 2023-08-19 22:07:29 +01:00
  • d73ca3d28e Line up the llama.cpp implementation with the candle one. (#518) Laurent Mazare 2023-08-19 20:12:07 +01:00
  • 551409092e Small tweaks to tensor-tools. (#517) Laurent Mazare 2023-08-19 16:50:26 +01:00
  • 6431140250 Retrieve tensor data from PyTorch files. (#516) Laurent Mazare 2023-08-19 15:57:18 +01:00
  • 607ffb9f1e Retrieve more information from PyTorch checkpoints. (#515) Laurent Mazare 2023-08-19 15:05:34 +01:00
  • f861a9df6e Add ggml support to tensor-tools (#512) Laurent Mazare 2023-08-19 11:45:22 +01:00
  • ad33715c61 Preliminary support for importing PyTorch weights. (#511) Laurent Mazare 2023-08-19 11:26:32 +01:00
  • 90ff04e77e Add the tensor-tools binary. (#510) Laurent Mazare 2023-08-19 09:06:44 +01:00
  • 42e1cc8062 Add a batch normalization layer (#508) Laurent Mazare 2023-08-18 20:05:56 +01:00
  • b64e782c2d Use the hub to retrieve dinov2 model weights. (#507) Laurent Mazare 2023-08-18 18:27:31 +01:00
  • e5dd5fd1b3 Print the recognized categories in dino-v2. (#506) Laurent Mazare 2023-08-18 17:32:58 +01:00
  • cb069d6063 Add the permute op (similar to pytorch). (#504) Laurent Mazare 2023-08-18 16:30:53 +01:00
  • 4f1541526c dinov2 - read images from disk and compute the class probabilities (#503) Laurent Mazare 2023-08-18 15:50:33 +01:00
  • 95462c6a2e Add a vision transformer example (dino-v2). (#502) Laurent Mazare 2023-08-18 11:58:06 +01:00
  • b9661a1c25 Enable the image crate by default in examples (#501) Laurent Mazare 2023-08-18 10:00:05 +01:00
  • 109e95b189 Basic qmatmul parallelization (#492) Lukas Kreussel 2023-08-18 10:45:37 +02:00
  • c78ce76501 Add a simple Module trait and implement it for the various nn layers (#500) Laurent Mazare 2023-08-18 09:38:22 +01:00
  • 13401df4d1 Add an abstract type for RmsNorm. (#499) Laurent Mazare 2023-08-18 08:52:14 +01:00
  • a22b1bed7b Tensor -> QTensor conversion (#496) Laurent Mazare 2023-08-18 08:19:20 +01:00
  • 26fd37b348 Use the main branch of the HF repo where possible. (#498) Laurent Mazare 2023-08-18 08:18:30 +01:00
  • f056dcab21 Add medium model (#497) Franco Lucchini 2023-08-18 09:08:59 +02:00
  • 557b2c28dd Q6K quantization (#495) Laurent Mazare 2023-08-17 22:22:57 +01:00
  • fc81af1712 AVX version of the q6k vec-dot. (#493) Laurent Mazare 2023-08-17 20:13:18 +01:00
  • 3164cd24fa Replicate the sot-token logic from the Python implementation more acc… (#491) Laurent Mazare 2023-08-17 16:59:36 +01:00
  • 5f30c1e1e0 Add the whisper small model. (#490) Laurent Mazare 2023-08-17 15:48:34 +01:00
  • ad7c53953b Add a verbose-prompt mode, similar to llama.cpp. (#489) Laurent Mazare 2023-08-17 15:26:44 +01:00
  • 5d99026fd2 F16 support for stable diffusion (#488) Laurent Mazare 2023-08-17 13:48:56 +01:00
  • c3176f0dfb Flash-attention support in stable diffusion (#487) Laurent Mazare 2023-08-17 12:16:40 +01:00
  • 03be33eea4 Relax the requirements on CustomOp. (#486) Laurent Mazare 2023-08-17 11:12:05 +01:00
  • d32e8199cd Layer norm tweaks (#482) Laurent Mazare 2023-08-17 10:07:13 +01:00
  • d99cac3ec3 Move the avx specific bits to a separate file. (#481) Laurent Mazare 2023-08-17 09:01:06 +01:00
  • f708efb19c Add some accelerate details on the readme. (#480) Laurent Mazare 2023-08-17 08:26:02 +01:00
  • 306c8eee7a AVX version of the vecdot for q4_0. (#474) Laurent Mazare 2023-08-17 07:03:32 +01:00
  • 098909de40 Add vecdot for q6k-q8k. (#476) Laurent Mazare 2023-08-16 20:59:40 +01:00
  • 3bedba1fce Use a zipped iterator. (#475) Laurent Mazare 2023-08-16 20:15:11 +01:00
  • c5f45887dc Add some tracing to the quantized example. (#473) Laurent Mazare 2023-08-16 18:49:08 +01:00
  • fa4590d7fd Merge pull request #469 from huggingface/fix_llama_v1 Nicolas Patry 2023-08-16 17:47:40 +02:00
  • 2e206e269d Add the model argument. (#471) Laurent Mazare 2023-08-16 16:41:06 +01:00
  • 575e88a999 Add a quantized test that use negative values. (#470) Laurent Mazare 2023-08-16 16:32:58 +01:00
  • a9101700b6 Add a kv-cache to the quantized llama example. (#466) Laurent Mazare 2023-08-16 14:28:42 +01:00
  • 102fa4c2e3 Fixing llamav1 Nicolas Patry 2023-08-16 14:53:29 +02:00
  • 0bb344f798 [RFC] Start removal of VarBuilder. initializer Nicolas Patry 2023-08-16 14:39:36 +02:00
  • 3071134788 Get the ggml based llama to generate some text. (#464) Laurent Mazare 2023-08-16 12:41:07 +01:00
  • fec87e86f5 Merge pull request #465 from huggingface/llama_hub_config Nicolas Patry 2023-08-16 13:28:59 +02:00
  • 33c882ea74 Clippy. Nicolas Patry 2023-08-16 10:41:00 +02:00
  • 76804730c6 Using the real config from the hub when available. Nicolas Patry 2023-08-16 10:29:46 +02:00
  • 965597a873 Add a test for qmatmul. (#459) Laurent Mazare 2023-08-16 06:36:27 +01:00
  • ca449f9ee1 Add quantized tensors. (#458) Laurent Mazare 2023-08-15 22:45:53 +01:00
  • b8263aa15c Quantized support for f16 and f32 (#457) Laurent Mazare 2023-08-15 21:09:37 +01:00
  • e68b2accb4 Split out the quantized file. (#456) Laurent Mazare 2023-08-15 20:26:27 +01:00
  • 08effe3762 More quantization support (#455) Laurent Mazare 2023-08-15 18:58:04 +01:00
  • 8ad4a21ffc Add a basic optimizer example. (#454) Laurent Mazare 2023-08-15 17:19:18 +01:00
  • 5e49922be2 Basic quantization support (#453) Laurent Mazare 2023-08-15 15:53:19 +01:00
  • ebcfd96d94 add c++17 flags (#452) Chengxu Yang 2023-08-15 22:29:34 +08:00
  • 5b1690fffa Tweak the llama example. (#450) Laurent Mazare 2023-08-15 12:18:20 +01:00
  • 3cc87058b7 Support local weights & dynamic outputs (#447) Guoqing Bao 2023-08-15 18:51:57 +08:00
  • 531f23b4d0 Rename vec-dot to vec-ops. (#449) Laurent Mazare 2023-08-15 10:48:57 +01:00
  • 495e0b7580 Simd support (#448) Laurent Mazare 2023-08-15 09:50:38 +01:00