Commit Graph

  • 90374097dc Cudnn support (#445) Laurent Mazare 2023-08-14 21:30:41 +01:00
  • c84883ecf2 Add a cuda kernel for upsampling. (#441) Laurent Mazare 2023-08-14 13:12:17 +01:00
  • a094dc503d Add a cuda kernel for avg-pool2d. (#440) Laurent Mazare 2023-08-14 12:32:05 +01:00
  • 34f4b3187e Add a naive conv2d cuda kernel. (#438) Laurent Mazare 2023-08-14 10:34:42 +01:00
  • eab54e4490 Fix the tests for mkl. (#437) Laurent Mazare 2023-08-14 08:09:27 +01:00
  • 9e7e6e0288 Add dequantization for ggmls q4_0, q4_1, q5_0, q5_1 and q8_0 (#407) Lukas Kreussel 2023-08-14 00:22:57 +02:00
  • 8bd2b22b33 Optimize the logit computations in the whisper example. (#434) Laurent Mazare 2023-08-13 23:00:13 +02:00
  • d379a76a9e Add a softmax bench. (#433) Laurent Mazare 2023-08-13 21:09:18 +02:00
  • 9af438ac1b Track the conv2d operations in stable-diffusion. (#431) Laurent Mazare 2023-08-13 16:58:26 +02:00
  • b1ff78f762 Allow using accelerate with stable-diffusion. (#430) Laurent Mazare 2023-08-13 15:14:20 +02:00
  • 5a63b51f14 Add a matmul benchmark. (#429) Laurent Mazare 2023-08-13 14:41:03 +02:00
  • 6d694554b8 Support longer sequences in language detection. (#428) Laurent Mazare 2023-08-13 14:16:15 +02:00
  • 9aca398a4f More accelerate optimizations (#427) Laurent Mazare 2023-08-13 13:53:34 +02:00
  • 60cd1551ca Add a KV cache to whisper. (#426) Laurent Mazare 2023-08-12 22:17:08 +02:00
  • a0908d212c Add a -language argument. (#425) Laurent Mazare 2023-08-12 18:08:40 +02:00
  • 972078e1ae Update the readme with the discord server and common errors. (#423) Laurent Mazare 2023-08-12 17:45:58 +02:00
  • 16b89f5b83 fix: can directly save the loaded weights (#421) Yumin Wu 2023-08-12 23:33:29 +08:00
  • 0741ebbd51 More multilingual support for whisper. (#419) Laurent Mazare 2023-08-12 16:32:52 +02:00
  • 0c3f109faa Basic multilingual support for whisper (#417) Laurent Mazare 2023-08-12 12:23:04 +02:00
  • 2ba6b2826f Fix the readme instructions for stable-diffusion. (#415) Laurent Mazare 2023-08-11 19:59:04 +02:00
  • 1d0157bbc4 Stable diffusion: retrieve the model files from the HF hub. (#414) Laurent Mazare 2023-08-11 19:57:06 +02:00
  • 91dbf907d3 Add more whisper variants. (#413) Laurent Mazare 2023-08-11 18:33:55 +02:00
  • e12372021b Expose the tensor write-bytes function. (#412) Laurent Mazare 2023-08-11 18:13:42 +02:00
  • 55e428c8ae Expose the varmap inner data. (#411) Laurent Mazare 2023-08-11 17:58:56 +02:00
  • 01ea57da8c Fix the conv tests. (#409) Laurent Mazare 2023-08-11 15:59:54 +02:00
  • 662db45fc3 Use zero padding in conv1d and conv2d (same as pytorch). (#408) Laurent Mazare 2023-08-11 15:53:05 +02:00
  • 906c0f3eb5 Remove the checkpoint conversion script. (#405) Laurent Mazare 2023-08-11 06:59:48 +02:00
  • e29c7809ec Parallelise the CPU kernels for the conv ops. (#401) Laurent Mazare 2023-08-11 06:51:58 +02:00
  • a325c1aa50 Upsample test + bugfix. (#399) Laurent Mazare 2023-08-10 21:02:35 +02:00
  • b6cf26e48e Merge pull request #393 from huggingface/older_gpus Nicolas Patry 2023-08-10 20:49:23 +02:00
  • 379eadc68e Working now. Nicolas Patry 2023-08-10 19:43:25 +02:00
  • 7e4fbc1e17 [DO NOT MERGE] temporary PR so users can try out on older GPUs. Nicolas Patry 2023-08-10 16:30:44 +02:00
  • 80f0482f26 Fix the stable-diffusion vae. (#398) Laurent Mazare 2023-08-10 19:24:31 +02:00
  • 94eff56aee Optimize the cpu conv2d kernel (#396) Laurent Mazare 2023-08-10 18:40:09 +02:00
  • a55133effd Merge pull request #395 from huggingface/fix_compat_windows Nicolas Patry 2023-08-10 18:05:12 +02:00
  • ff53f38467 Small example for benchmarking some cpu ops (#394) Laurent Mazare 2023-08-10 18:00:17 +02:00
  • 4a95d34c83 Compat windows. Nicolas Patry 2023-08-10 17:46:47 +02:00
  • 7f710a573d Merge pull request #374 from Rocketknight1/readme_fixes Nicolas Patry 2023-08-10 16:34:19 +02:00
  • c8039579a5 Conv1d optimize (#392) Laurent Mazare 2023-08-10 16:23:52 +02:00
  • 0b0fa56978 Merge pull request #386 from huggingface/enabling_61_maybe Nicolas Patry 2023-08-10 16:23:17 +02:00
  • 385f0d261c Normalize embeddings in the bert example. (#390) Laurent Mazare 2023-08-10 14:05:55 +02:00
  • b765f2c37f Update the wasm build instructions. (#389) Laurent Mazare 2023-08-10 12:29:43 +02:00
  • 66d1c093e0 This is duplicated code on Cuda 12.2. Nicolas Patry 2023-08-10 09:20:18 +02:00
  • de7c31bfe9 Merge pull request #368 from huggingface/add_cuda_ci Nicolas Patry 2023-08-10 08:49:39 +02:00
  • 8e7ef96588 Fix CI cuda. Nicolas Patry 2023-08-10 08:47:15 +02:00
  • f3fe730a30 Npy tweaks & error with path (#384) Laurent Mazare 2023-08-10 07:21:58 +02:00
  • c7f92f985e Further randn tweaks: use the appropriate rng rather than the f64 one, some cleanup. (#383) Laurent Mazare 2023-08-10 06:48:19 +02:00
  • 3bbc08a8df Fix randn cpu (#382) Lei 2023-08-10 00:33:44 -04:00
  • 6a2137af4f Update README.md Matt 2023-08-10 00:19:58 +01:00
  • 0dc1e5f387 Merge branch 'main' into readme_fixes Matt 2023-08-10 00:19:20 +01:00
  • bd2fb6216b Testing in release mode because debug is too slow. Nicolas Patry 2023-08-09 23:19:55 +02:00
  • 3542b26143 ssl update. Nicolas Patry 2023-08-09 23:11:45 +02:00
  • a690f14a77 Fix by hardcoding paths Nicolas Patry 2023-08-09 23:08:50 +02:00
  • 90d778c059 ? Nicolas Patry 2023-08-09 23:02:11 +02:00
  • 171fcbe539 CI ssh in the meantime. Nicolas Patry 2023-08-09 22:58:47 +02:00
  • 07e83c55c0 Attempt nb2 Nicolas Patry 2023-08-09 22:47:01 +02:00
  • 25ec2d9f6b fix: remove incorrect unwrap (#379) Ciarán Curley 2023-08-09 21:45:24 +01:00
  • da26e2832c Update gemm to 0.15.6. (#378) Laurent Mazare 2023-08-09 22:04:28 +02:00
  • fcfdcbd337 Add a conv1d benchmark based on the whisper sizes. (#377) Laurent Mazare 2023-08-09 21:27:03 +02:00
  • 653ec5abc1 Update README.md (#376) Philipp Parzer 2023-08-09 21:09:21 +02:00
  • c3a0761e62 Add some tracing to the whisper example. (#375) Laurent Mazare 2023-08-09 20:58:36 +02:00
  • 0cef3998fd README.md typos and grammar fixes Matt 2023-08-09 19:36:03 +01:00
  • e5f510d209 SSH to debug. Nicolas Patry 2023-08-09 19:54:40 +02:00
  • 0dd94eff4c Merge pull request #367 from eltociear/eltociear-patch-1 Nicolas Patry 2023-08-09 19:48:31 +02:00
  • a3b1699409 Embed the mel filters in the whisper binary. (#373) Laurent Mazare 2023-08-09 19:27:26 +02:00
  • 5b79b38bc7 Remove extra square bracket (#372) Gabriel Martín Blázquez 2023-08-09 19:24:28 +02:00
  • a5c5a893aa add max_pool2d (#371) LeeeSe 2023-08-10 01:05:26 +08:00
  • e6ce47f9e0 ? Nicolas Patry 2023-08-09 19:00:25 +02:00
  • 1892bd139c Extract the strides in the conv ops. (#370) Laurent Mazare 2023-08-09 18:57:05 +02:00
  • 749c8c7f51 Better rust GH action. Nicolas Patry 2023-08-09 18:42:53 +02:00
  • d9b4fef189 Chnage name Nicolas Patry 2023-08-09 18:14:29 +02:00
  • 8fa329aca2 Adding cuda CI Nicolas Patry 2023-08-09 18:13:27 +02:00
  • cd225bd3b1 More testing for avg-pool2d. (#366) Laurent Mazare 2023-08-09 17:12:23 +02:00
  • a4f6977087 Update README.md Ikko Eltociear Ashimine 2023-08-10 00:11:11 +09:00
  • dece0b8a76 Merge pull request #263 from huggingface/book_3 Nicolas Patry 2023-08-09 16:50:11 +02:00
  • b80348d22f Bugfix for avg-pool + add some test. (#365) Laurent Mazare 2023-08-09 16:44:16 +02:00
  • 3a62aee91f Write the generated images using the image crate. (#363) Laurent Mazare 2023-08-09 16:26:44 +02:00
  • be21d7e75a Fix the padding used in stable diffusion. (#362) Laurent Mazare 2023-08-09 14:23:59 +02:00
  • 9c4cf6804b Merge pull request #355 from cksac/fix_book Nicolas Patry 2023-08-09 09:08:16 +02:00
  • dbc6f281c9 Conv1d test with padding. (#356) Laurent Mazare 2023-08-09 06:45:38 +02:00
  • 47a5bee249 fix repo link choisi 2023-08-09 11:29:48 +08:00
  • cf965ecaa8 Simplify the conv1d and conv2d code. (#352) Laurent Mazare 2023-08-08 23:10:59 +02:00
  • b9864e1357 Fix size-in-bytes for u8. (#351) Laurent Mazare 2023-08-08 22:15:18 +02:00
  • 608b2358c6 Add some conv1d test + bugfix using padding. (#349) Laurent Mazare 2023-08-08 21:50:20 +02:00
  • 1e6dbeac01 Add some conv2d tests. (#347) Laurent Mazare 2023-08-08 20:02:42 +02:00
  • 13ce68ff9b Bugfix for conv2d. (#343) Laurent Mazare 2023-08-08 16:20:00 +02:00
  • 89d3926c9b Fixes for the stable diffusion example. (#342) Laurent Mazare 2023-08-08 15:57:09 +02:00
  • ab35684326 Naive implementation for conv2d. (#341) Laurent Mazare 2023-08-08 07:34:36 +02:00
  • b5bb5e056d Add more conv2d support. (#340) Laurent Mazare 2023-08-08 07:04:32 +02:00
  • d0d7010682 CPU implementation for upsample-nearest2d. (#339) Laurent Mazare 2023-08-07 21:07:10 +02:00
  • fc265d9dcf Some CLIP fixes for stable diffusion. (#338) Laurent Mazare 2023-08-07 19:31:45 +02:00
  • 2345b8ce3f Skeleton for the avg-pool2d and upsample-nearest2d ops. (#337) Laurent Mazare 2023-08-07 17:15:38 +02:00
  • f53a333ea9 Simple pad support. (#336) Laurent Mazare 2023-08-07 16:24:56 +02:00
  • e72ba0b9e7 Add the license files. (#335) Laurent Mazare 2023-08-07 15:11:27 +02:00
  • 5bb2fce998 Implement group-norm. (#334) Laurent Mazare 2023-08-07 07:53:05 +02:00
  • 2c9f605976 Add rand-like/randn-like. (#333) Laurent Mazare 2023-08-06 22:51:08 +02:00
  • 141df4ad2b Main diffusion loop for the SD example. (#332) Laurent Mazare 2023-08-06 22:39:53 +02:00
  • 166bfd5847 Add the recip op + use it in stable-diffusion. (#331) Laurent Mazare 2023-08-06 22:14:52 +02:00
  • 1c062bf06b Add the ddim scheduler. (#330) Laurent Mazare 2023-08-06 21:44:00 +02:00
  • d34039e352 Add a stable diffusion example (#328) Laurent Mazare 2023-08-06 18:49:43 +02:00