Commit Graph

536 Commits

SHA1 Message Date
78871ffe38 Add dtype support. 2023-07-02 20:12:26 +01:00
65e069384c Merge pull request #53 from LaurentMazare/more-pyo3: Add more pyo3 wrapping 2023-07-02 07:50:49 +01:00
d38897461b Add to the example. 2023-07-02 07:37:17 +01:00
5b8c6764b0 Add matmul/where_cond. 2023-07-02 07:34:14 +01:00
9a9858bbe0 Expose a couple more ops. 2023-07-02 07:30:00 +01:00
dfe197f791 Handle more input types to create tensors. 2023-07-02 07:19:46 +01:00
4a28dcf828 Rename the method. 2023-07-02 07:08:11 +01:00
c62cb73a7f Support higher order shapes for conversions. 2023-07-02 07:07:22 +01:00
fa58c7643d Add a trait to avoid repeating the dtype matching. 2023-07-02 06:58:10 +01:00
2596821a08 Merge pull request #52 from LaurentMazare/pyo3: Preliminary python api via pyo3 2023-07-02 06:35:31 +01:00
2370b1675d More pyo3. 2023-07-01 22:15:58 +01:00
86df4ad79c Get shape to return a tuple. 2023-07-01 21:34:38 +01:00
fbbde5b02c Add some binary operators. 2023-07-01 21:27:35 +01:00
42d1a52d01 Add two methods. 2023-07-01 20:55:15 +01:00
52db2a6849 Apply rustfmt. 2023-07-01 20:37:28 +01:00
ebb0fedf14 Very simple pyo3 bindings for candle. 2023-07-01 20:36:44 +01:00
dd879f5b67 Merge pull request #51 from LaurentMazare/custom-prompt: Add a flag for custom prompt. 2023-07-01 06:40:36 +01:00
7c65e2d187 Add a flag for custom prompt. 2023-07-01 06:36:22 +01:00
2c04bff12f Merge pull request #50 from LaurentMazare/rayon1: Do not use rayon for a single thread 2023-06-30 18:56:26 +01:00
bbe0c5fbaa Do not use rayon for a single thread (bis). 2023-06-30 18:47:22 +01:00
6b67d25d9f Do not use rayon for a single thread. 2023-06-30 18:46:32 +01:00
b8b175c01e Merge pull request #49 from LaurentMazare/llama-dtype: Early conversion for the llama weights. 2023-06-30 16:43:56 +01:00
679b6987b6 Early conversion for the llama weights. 2023-06-30 16:42:53 +01:00
dbd7d5b3fd Merge pull request #47 from LaurentMazare/llama-f32: Add a const to easily tweak the dtype used by llama 2023-06-30 15:04:33 +01:00
ed4d0959d3 Add a const to easily tweak the dtype used for llama internal computations. 2023-06-30 15:01:39 +01:00
a243504f53 Merge pull request #46 from LaurentMazare/bugfix-cuda-u8-bf16: Bugfix: remove the u8/bf16 conversion kernel as it is ambiguous. 2023-06-30 10:45:48 +01:00
313fa022a5 Bugfix: remove the u8/bf16 conversion kernel as it is ambiguous. 2023-06-30 10:43:32 +01:00
d2ab4f86bf Merge pull request #45 from LaurentMazare/u8: Add support for u8 2023-06-30 10:35:51 +01:00
fbc329ed85 Add the verbose cpu cast operations. 2023-06-30 10:33:29 +01:00
8ad47907f3 Add the kernels. 2023-06-30 10:26:56 +01:00
a7b16cbb98 Merge pull request #44 from LaurentMazare/check-dim: Improve how we check that the dims are in bounds. 2023-06-30 09:14:45 +01:00
19cbbc5212 Improve how we check that the dims are in bounds. 2023-06-30 09:11:00 +01:00
00476d37f8 Merge pull request #43 from LaurentMazare/bf16: Support for bf16 in cuda kernels 2023-06-30 05:48:58 +01:00
6486a6d7b2 Avoid some cast kernels. 2023-06-29 23:23:44 +01:00
ec79fc43f2 Add the bf16 cuda kernels. 2023-06-29 23:12:02 +01:00
018e017e7e Merge pull request #42 from LaurentMazare/kv-cache-enable: Enable the KV cache 2023-06-29 22:22:11 +01:00
f6152e74b6 Tweak the kv-cache flag. 2023-06-29 22:16:40 +01:00
ae3f202f3b Add a flag. 2023-06-29 22:12:15 +01:00
23389b1bd7 Enable the KV cache after fixing the caching length and the rope bits. 2023-06-29 22:00:57 +01:00
e87a99d16e Merge pull request #41 from LaurentMazare/kv-cache: Kv cache 2023-06-29 19:11:52 +01:00
af66f0829e Revert the new profile. 2023-06-29 19:08:50 +01:00
b50bd880ce Only narrow when needed + deactivate the kv cache. 2023-06-29 19:07:52 +01:00
4b148b5414 Merge pull request #40 from LaurentMazare/fix_kernel_cache: Fixing kernel cache (a bit brutal for now, but if build triggers, rebuild ALL kernels). 2023-06-29 18:02:06 +02:00
1ea08a19cb Rerun on new files. 2023-06-29 15:59:58 +00:00
b5bdbef53a Fixing kernel cache (a bit brutal for now, but if build triggers, rebuild ALL kernels). 2023-06-29 15:51:08 +00:00
3232df9458 Add some KV cache to llama. 2023-06-29 15:29:40 +01:00
889f7e0971 Merge pull request #39 from LaurentMazare/anyhow-backtrace: Add backtraces. 2023-06-29 13:17:53 +01:00
e27ee98d3f Add backtraces. 2023-06-29 13:17:20 +01:00
e90f4aad26 Merge pull request #38 from LaurentMazare/llama_f16: Moving llama to f16. 2023-06-29 14:12:31 +02:00
78ec40b077 Typo. 2023-06-29 12:09:53 +00:00