candle

mirror of https://github.com/huggingface/candle.git synced 2025-06-16 10:38:54 +00:00

Files

Laurent Mazare 3071134788 Get the ggml based llama to generate some text. (#464)

* Add more stats to the ggml example.

* Build a quantized model from the file content.

* Move the tensor retrieval in the main crate.

* Start adding the forward pass.

* Add more to the forward pass of the quantized llama.

* Apply the attention layers.

* Add the sampling loop.

* Get the sampling loop to work.

* Minor tweak.

* Add a quantize/dequantize test.

* Bugfix.

* Add a comment + swap the order.

* Bugfixes.

2023-08-16 12:41:07 +01:00

main.rs

Using the real config from the hub when available.

2023-08-16 10:36:01 +02:00

model.rs

Get the ggml based llama to generate some text. (#464)

2023-08-16 12:41:07 +01:00