af955f260c | Make the falcon model cloneable. (#2067) | 2024-04-15 09:39:03 +02:00
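Making the model `Clone` is mostly useful when the same weights should serve several independent generation streams; a minimal sketch of that pattern, assuming the `Falcon` type in candle-transformers is the one made cloneable here (the helper name is illustrative):

```rust
use candle_transformers::models::falcon::Falcon;

/// A minimal sketch: with `Falcon: Clone`, one loaded model can be duplicated
/// so that each generation stream owns its own state while the underlying
/// weight tensors stay shared (cloning a candle `Tensor` is a cheap,
/// reference-counted copy).
fn per_stream_models(base: &Falcon, streams: usize) -> Vec<Falcon> {
    (0..streams).map(|_| base.clone()).collect()
}
```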
8ad822a983 | Add a function to clear the KV cache in falcon. (#2066) | 2024-04-15 09:29:25 +02:00
    * Add a function to clear the KV cache in falcon.
    * Clippy.
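A resettable KV cache is what lets one loaded model serve unrelated prompts back to back; the sketch below shows the intended call pattern, where the method name `clear_kv_cache` mirrors this commit's title (exact name assumed) and the `step` closure stands in for a real tokenize/forward/sample loop:

```rust
use candle_core::Result;
use candle_transformers::models::falcon::Falcon;

/// Run several unrelated prompts through the same model instance, resetting
/// the cached keys/values in between so each prompt starts from position 0
/// instead of attending to the previous sequence.
fn run_prompts<F>(model: &mut Falcon, prompts: &[String], mut step: F) -> Result<()>
where
    F: FnMut(&mut Falcon, &str) -> Result<()>,
{
    for prompt in prompts {
        model.clear_kv_cache();
        step(model, prompt)?;
    }
    Ok(())
}
```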
b81ecf712d | Support alternative dtypes for mamba (#2036) | 2024-04-10 18:10:01 +02:00
    * Allow different dtypes in mamba.
    * Add a dtype flag.
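The dtype flag is the usual way of letting an example binary pick the compute precision at run time; a minimal sketch, assuming a clap-based CLI (the flag name, accepted values, and the f32 default are illustrative, not necessarily what the mamba example uses):

```rust
use candle_core::DType;
use clap::Parser;

#[derive(Parser, Debug)]
struct Args {
    /// Precision to run the model in: f32, f16 or bf16.
    #[arg(long, default_value = "f32")]
    dtype: String,
}

fn main() -> anyhow::Result<()> {
    let args = Args::parse();
    // Map the textual flag onto a candle DType before loading the weights.
    let dtype = match args.dtype.as_str() {
        "f32" => DType::F32,
        "f16" => DType::F16,
        "bf16" => DType::BF16,
        other => anyhow::bail!("unsupported dtype {other}"),
    };
    println!("running mamba in {dtype:?}");
    Ok(())
}
```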
fb918a23c8 | first commit (#1994) | 2024-04-02 16:31:05 +02:00
d3a8d291d5 | Avoid the attention mask where possible. (#1933) | 2024-03-25 15:31:04 +01:00
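The common form of this optimization is to skip the causal mask when a single new token is decoded, since a length-1 query can see every cached position anyway; a sketch of that check (the mask layout and helper name are illustrative, not the exact code from this commit):

```rust
use candle_core::{Device, Result, Tensor};

/// Build a causal mask only when it can mask something: with one query token
/// (the usual incremental-decoding case) nothing lies in the future, so `None`
/// is returned and the attention code can skip the masked addition entirely.
fn causal_mask(seq_len: usize, device: &Device) -> Result<Option<Tensor>> {
    if seq_len <= 1 {
        return Ok(None);
    }
    // 1 above the diagonal marks positions that must not be attended to.
    let mask: Vec<u8> = (0..seq_len)
        .flat_map(|i| (0..seq_len).map(move |j| u8::from(j > i)))
        .collect();
    let mask = Tensor::from_slice(&mask, (seq_len, seq_len), device)?;
    Ok(Some(mask))
}
```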
c753f72c85 | Support for attention bias in gemma + refactor things a bit. (#1744) | 2024-02-22 09:35:28 +01:00
    * Support for attention bias in gemma + refactor things a bit.
    * Fix the cuda tests.
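"Attention bias" here means the q/k/v/o projections optionally carry a bias term, driven by the model config; a minimal sketch of that switch using the stock candle_nn layer constructors (the `attention_bias` name follows the usual gemma config field, the rest is illustrative):

```rust
use candle_core::Result;
use candle_nn::{linear, linear_no_bias, Linear, VarBuilder};

/// Build an attention projection whose bias is controlled by the config,
/// mirroring an `attention_bias`-style switch in gemma-like configs.
fn attn_proj(
    in_dim: usize,
    out_dim: usize,
    attention_bias: bool,
    vb: VarBuilder,
) -> Result<Linear> {
    if attention_bias {
        linear(in_dim, out_dim, vb)
    } else {
        linear_no_bias(in_dim, out_dim, vb)
    }
}
```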
63944714f2 | Use candle_nn::embedding instead of local copies in a few models. (#1562) | 2024-01-10 21:36:27 +01:00
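`candle_nn::embedding` builds an `Embedding` layer straight from a `VarBuilder`, which is what the per-model copies were re-implementing; a short usage sketch (the `embed_tokens` weight prefix is only an example):

```rust
use candle_core::Result;
use candle_nn::{embedding, Embedding, VarBuilder};

/// Load the token-embedding table through the shared candle_nn helper instead
/// of a model-local copy; each model passes whatever prefix its checkpoint
/// stores the table under.
fn load_embeddings(vocab_size: usize, hidden_size: usize, vb: VarBuilder) -> Result<Embedding> {
    embedding(vocab_size, hidden_size, vb.pp("embed_tokens"))
}
```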
d3f05eae8c | Move some models to candle-transformers so that it's easier to re-use. (#794) | 2023-09-10 09:40:27 +01:00
    * Move some models to candle-transformers so that they can be shared.
    * Also move falcon.
    * Move Llama.
    * Move whisper (partial).
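After this move the model definitions are consumed as a library instead of being copied into each example; a sketch of the corresponding imports (the renames are illustrative, and item names may differ slightly per model):

```rust
// Downstream crates and examples now pull the shared definitions from
// candle-transformers rather than keeping local copies (whisper moved only
// partially at this point).
use candle_transformers::models::falcon::{Config as FalconConfig, Falcon};
use candle_transformers::models::llama::Llama;
```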