Commit Graph

254 Commits

SHA1        Message                                     Date
ca6aa8ff12  Use num-cpus to enable parallelism.         2023-06-27 14:42:26 +01:00
318503cd38  Cache the causal mask in llama.             2023-06-27 12:21:08 +01:00
380d61e990  Fix two cuda bugs (matmul and where_cond).  2023-06-27 11:31:04 +01:00
d7f729fb8f  Refactor the hierarchy.                     2023-06-27 11:57:27 +02:00