huggingface/candle
Mirror of https://github.com/huggingface/candle.git, synced 2025-06-20 12:06:35 +00:00
candle/candle-transformers/src at commit 07849aa595c65309ed9230a4c97035f471c6afb1
Latest change: a2e9d41b20 "use softmax_last_dim (metal and cuda kernel) in llama attention layer" (#2572) by Zack Angelo, 2024-10-23 20:07:09 +02:00
Name                      Last commit message                                                             Date
generation                Include topk sampling in the quantized example. (#2005)                        2024-04-04 09:27:54 +02:00
models                    use softmax_last_dim (metal and cuda kernel) in llama attention layer (#2572)  2024-10-23 20:07:09 +02:00
pipelines                 Sketch the candle-transformers crate. (#147)                                   2023-07-12 13:49:31 +01:00
lib.rs                    Move the common quantized-nn code to a shared module. (#1063)                  2023-10-09 06:22:22 +01:00
object_detection.rs       Soft Non-Maximum Suppression (#2400)                                           2024-08-10 07:57:52 +02:00
quantized_nn.rs           Use the fast RmsNorm in the quantized model. (#1904)                           2024-03-21 18:49:35 +01:00
quantized_var_builder.rs  Add a quantized version of recurrent-gemma. (#2054)                            2024-04-13 20:07:01 +02:00
utils.rs                  Use cat for faster MQA computation. (#2043)                                    2024-04-12 09:15:10 +02:00