Commit Graph

  • 93cfe5642f Pyo3 dtype (#327) Laurent Mazare 2023-08-06 10:17:43 +01:00
  • 88bd3b604a Add some tensor creation functions to the pyo3 bindings. (#326) Laurent Mazare 2023-08-06 06:50:33 +01:00
  • b278834267 Support the Accelerate BLAS on macOS. (#325) Laurent Mazare 2023-08-05 17:25:24 +01:00
  • 0b175fcbbd Fix the pyo3 build for macos. (#324) Laurent Mazare 2023-08-05 14:53:57 +01:00
  • c6ae9f565e Merge remote-tracking branch 'origin/main' into faster-gemv faster-gemv laurent 2023-08-05 10:47:10 +01:00
  • 620f83cf66 Add the candle-datasets crate (#322) Laurent Mazare 2023-08-05 08:56:50 +01:00
  • 3fa3623135 Faster matmul when we can fall back to gemv. laurent 2023-08-04 22:44:30 +01:00
  • f7b2a0391d Transpose the weight matrixes for llama2.c. (#321) Laurent Mazare 2023-08-04 13:32:20 +01:00
  • 8b6f5be1cc Support q5k quantized data. (#320) Laurent Mazare 2023-08-04 09:51:30 +01:00
  • a35a935118 Transpose the rhs in linear. linear-transpose laurent 2023-08-03 16:51:24 +01:00
  • df6667ba88 Add some tracing to llama. (#318) Laurent Mazare 2023-08-03 13:52:22 +01:00
  • a79286885c Support safetensors weights in llama2.c inference. (#317) Laurent Mazare 2023-08-03 11:10:58 +01:00
  • 74845a4dcd Use the assert! function as it turns out to be const. (#316) Laurent Mazare 2023-08-03 10:03:43 +01:00
  • aa76b783eb Q6K dequantization. (#315) Laurent Mazare 2023-08-03 09:31:20 +01:00
  • 25564357f7 Support some ggml quantized types (#314) Laurent Mazare 2023-08-03 09:16:26 +01:00
  • 634700d84a Use some consts for ggml values. (#312) Laurent Mazare 2023-08-02 22:03:05 +01:00
  • e635f18eda Initial support for reading ggml files. (#311) Laurent Mazare 2023-08-02 21:59:02 +01:00
  • dba31473d4 Typos and format and CD only when PR lands. Nicolas Patry 2023-08-02 19:18:43 +02:00
  • 1b2b32e58d Remove dead page.t Nicolas Patry 2023-08-02 18:59:36 +02:00
  • 166f4d1101 s/candle/candle_core/g Nicolas Patry 2023-08-02 18:35:31 +02:00
  • ae68635af9 Add small error management. Nicolas Patry 2023-08-02 18:16:50 +02:00
  • c11e78b334 Odd rebase artifact. Nicolas Patry 2023-08-02 09:22:27 +02:00
  • 1b705a426f Remove duplicate. Nicolas Patry 2023-08-02 09:21:44 +02:00
  • a70b95f9e7 Marking unwritten chapters as Draft (disables the link). Nicolas Patry 2023-08-01 16:49:35 +02:00
  • a44471a305 Adding more details on how to load things. Nicolas Patry 2023-08-01 16:36:53 +02:00
  • 45642a8530 Fixing examples. Nicolas Patry 2023-08-01 15:04:41 +02:00
  • 82464166e4 3rd phase. Nicolas Patry 2023-07-28 12:07:39 +02:00
  • 52414ba5c8 Bugfix for the llama2 wasm example. (#310) Laurent Mazare 2023-08-02 17:32:36 +01:00
  • 186c308d51 Wasm llama2 tweaks (#309) Laurent Mazare 2023-08-02 15:49:43 +01:00
  • b2e4beb4f3 Add a prompt. wasm-llama2-tweaks laurent 2023-08-02 15:48:32 +01:00
  • d48bddbe01 Use a proper tokenizer. laurent 2023-08-02 15:30:21 +01:00
  • 145706f8df Clean-up the llama2.c wasm example. laurent 2023-08-02 15:20:03 +01:00
  • 4f17290ce0 Use AdamW in the llama2 training. (#308) Laurent Mazare 2023-08-02 14:14:02 +01:00
  • 0902846f25 Add the AdamW optimizer. (#307) Laurent Mazare 2023-08-02 14:03:49 +01:00
  • e2acbe1e72 Update the wasm example locations in the readme. (#306) Laurent Mazare 2023-08-02 11:36:43 +01:00
  • 4fe8a02f88 Update the repo location. (#305) Laurent Mazare 2023-08-02 11:12:18 +01:00
  • 03a421f714 Add some missing readme files. (#304) Laurent Mazare 2023-08-02 10:57:12 +01:00
  • d38943aadc Add version numbers for all the candle crates (#303) Laurent Mazare 2023-08-02 10:52:13 +01:00
  • 51e51da896 Rename the candle crate to candle-core (#301) Laurent Mazare 2023-08-02 08:20:22 +01:00
  • 6e33ff62d6 Update cudarc now that it includes the cublas-f16 and nccl changes. (#300) Laurent Mazare 2023-08-02 05:54:28 +01:00
  • 4b3bd79fbd Remove the embedding ops in favor of index-select. (#299) Laurent Mazare 2023-08-02 05:42:11 +01:00
  • cc76c63202 Use index-select for the embeddings as it supports backprop. (#298) Laurent Mazare 2023-08-01 20:44:43 +01:00
  • ff876c2103 Llama more training (#297) Laurent Mazare 2023-08-01 19:53:41 +01:00
  • a27239f3d9 Add training for the llama2.c example (#296) Laurent Mazare 2023-08-01 17:23:07 +01:00
  • babee9f011 Merge pull request #259 from LaurentMazare/book_2 Nicolas Patry 2023-08-01 17:26:57 +02:00
  • afb5e24a63 Remove map ownership from save. Nicolas Patry 2023-08-01 17:19:22 +02:00
  • 89d1fd03e5 Adding new surface for savetensors (global load, global save). Nicolas Patry 2023-07-27 15:32:42 +02:00
  • 310094310b Modifying safetensors export to get simple load and save. Nicolas Patry 2023-07-27 15:20:13 +02:00
  • 836ba3e090 Merge pull request #258 from LaurentMazare/start_book Nicolas Patry 2023-08-01 14:59:34 +02:00
  • 091e781977 Grammarly pass. Nicolas Patry 2023-08-01 14:08:13 +02:00
  • 5cead227ef Adressed comments. Nicolas Patry 2023-08-01 14:02:21 +02:00
  • ebd0315623 Typo. Nicolas Patry 2023-07-27 16:36:36 +02:00
  • ad9d8fe400 Complexifying our hello world Nicolas Patry 2023-07-27 16:35:40 +02:00
  • 5bc5716b85 Revert "Making sure the CI actually works" Nicolas Patry 2023-07-27 14:48:32 +02:00
  • ba37de94d4 Making sure the CI actually works Nicolas Patry 2023-07-27 12:45:25 +02:00
  • 6242a1470e Starting the book. Nicolas Patry 2023-07-27 12:41:15 +02:00
  • 75e0448114 Move the weight bits in a separate module. (#295) Laurent Mazare 2023-08-01 10:37:06 +01:00
  • 614f911e9e Add some batcher variants that handle errors. (#294) Laurent Mazare 2023-08-01 09:40:34 +01:00
  • e1e8127f15 Add the batcher. (#293) Laurent Mazare 2023-08-01 09:16:10 +01:00
  • fa98ca0c35 Use subcommands in llama2. (#292) Laurent Mazare 2023-08-01 05:57:41 +01:00
  • 1a07ff8d17 Pre-tokenized evaluation mode for llama2.c. (#291) Laurent Mazare 2023-08-01 05:36:25 +01:00
  • f28558d0b7 Evaluate on the pre-tokenized file. (#290) Laurent Mazare 2023-07-31 21:31:38 +01:00
  • 6b98b66eb3 Remove the end of text tokens. (#289) Laurent Mazare 2023-07-31 20:43:57 +01:00
  • 9ae1f6afee Add an eval mode to llama2-c (#288) Laurent Mazare 2023-07-31 17:22:14 +01:00
  • 1064b9b031 Add the cross-entropy loss. (#287) Laurent Mazare 2023-07-31 14:26:36 +01:00
  • ffeafbfc43 Make the nll op closer to the pytorch version + add a test. (#286) Laurent Mazare 2023-07-31 14:14:01 +01:00
  • b3ea96b62b Add a prompt and support more models in llama2-c. (#285) Laurent Mazare 2023-07-31 13:09:30 +01:00
  • 94a43faaca Use the hub models for llama2.c (#284) Laurent Mazare 2023-07-31 12:51:14 +01:00
  • 62a9b03715 Add a flag to set the number of epochs in the mnist training (#283) Laurent Mazare 2023-07-31 10:32:14 +01:00
  • 67834119fc Fix the flash-attention function names. (#282) Laurent Mazare 2023-07-31 10:04:39 +01:00
  • 0ace420e66 Flash attention without padding (varlen). (#281) Laurent Mazare 2023-07-31 09:45:39 +01:00
  • a8d8f9f206 Load a trained checkpoint in the mnist example. (#280) Laurent Mazare 2023-07-30 17:01:45 +01:00
  • 38ff693af0 Add a flag to save the trained weights. (#279) Laurent Mazare 2023-07-30 15:41:42 +01:00
  • ba2254556c Display the temperature being used for text generation. (#278) Laurent Mazare 2023-07-30 09:53:05 +01:00
  • c950a5c6b1 Cuda support for the mnist training. (#277) Laurent Mazare 2023-07-29 19:48:04 +01:00
  • 16c33383eb Improve the mnist training example. (#276) Laurent Mazare 2023-07-29 16:28:22 +01:00
  • bedcef64dc Merge pull request #262 from LaurentMazare/update_multiprocess Nicolas Patry 2023-07-29 16:40:39 +02:00
  • 40c80bfbb2 Merge branch 'main' into update_multiprocess Nicolas Patry 2023-07-29 16:38:35 +02:00
  • 07eb899729 More mnist training. (#275) Laurent Mazare 2023-07-29 13:29:31 +01:00
  • c0a8ed19eb Support for where-cond on cuda for u8 and u32. (#274) Laurent Mazare 2023-07-29 11:48:58 +01:00
  • 4bf2ebf836 Use u8 tensors for masks. (#273) Laurent Mazare 2023-07-29 11:32:58 +01:00
  • 97d8712ba5 Remove single function. Nicolas Patry 2023-07-28 10:26:41 +00:00
  • 97181a77c0 Making multiprocess require flash-attn. Nicolas Patry 2023-07-28 07:52:24 +00:00
  • 50d8273ae4 Support both llama v1 and llama v2. (#272) Laurent Mazare 2023-07-28 18:40:59 +01:00
  • 7513a5e005 Line-up the llama implementation with the python-transformers one. (#271) Laurent Mazare 2023-07-28 18:31:28 +01:00
  • cb8dd5cd53 Back to using the main branch now that the PR has been merged. (#270) Laurent Mazare 2023-07-28 16:22:44 +01:00
  • a0e47aba98 Fix the revision used in starcoder to use the safetensors PR. (#269) Laurent Mazare 2023-07-28 14:02:31 +01:00
  • fb84ead8f7 Add the starcoder example to the readme. (#268) Laurent Mazare 2023-07-28 13:26:23 +01:00
  • 3eb2bc6d07 Softmax numerical stability. (#267) Laurent Mazare 2023-07-28 13:13:01 +01:00
  • 68eab38de6 Cuda fix for starcoder. (#266) Laurent Mazare 2023-07-28 12:13:41 +01:00
  • 54ccf94472 Merge pull request #265 from LaurentMazare/fix_nccl Nicolas Patry 2023-07-28 11:37:58 +01:00
  • 4002968cf5 Put back `"dep:half"` Nicolas Patry 2023-07-28 10:34:21 +00:00
  • be256a6ba6 Fixing. Nicolas Patry 2023-07-28 10:23:05 +00:00
  • d2dea11ef6 Fixing nccl feature. Nicolas Patry 2023-07-28 12:19:20 +02:00
  • 3e89df938c Starcoder fix (#264) Laurent Mazare 2023-07-28 11:17:49 +01:00
  • 6a54ca115e Add some Bigcode model (#260) Laurent Mazare 2023-07-28 09:57:32 +01:00
  • 4f260ef025 Merge pull request #216 from LaurentMazare/llama_multiprocess2 Nicolas Patry 2023-07-28 08:06:13 +01:00
  • 0b97987b21 Merge pull request #261 from LaurentMazare/upgrade_hf_hub Nicolas Patry 2023-07-28 07:03:30 +01:00
  • 8435a99edd Added comment about offsets. Nicolas Patry 2023-07-27 20:11:57 +02:00
  • ca479a873e Upgrading hf-hub to 0.2.0 (Modified API to not pass the Repo around all the time) Nicolas Patry 2023-07-27 20:05:02 +02:00