candle/candle-examples/examples/qwen
Kyle Birnbaum 1fdfb58de5 Updating Add qwen3 (PR 2903) to use HF weights (#2930)
* add Qwen3.rs

* fixed compile error

* attempting to get PR 2903 working with qwen weights

* different qwen variants working

* added moe model

* clippy

* added additional eos token

* translated the Korean comments to English as best I could

* removed the specialized Qwen3RmsNorm and replaced it with the generic candle RmsNorm

* replaced custom repeat_kv implementation with candle's repeat_kv implementation

* replaced linear with linear_b in attention initialization

* replaced the custom kv_cache implementation with candle's kv_cache

* style

* replaced explicit broadcast add with normal add in decoder layer

* stopped keeping the rotary embedding layer in the model struct

* used the tie_word_embeddings bool from the config instead of relying on the existence of lm_head weights in CausalLM

* removed duplicate code from qwen3_moe

* removed sliding window from qwen3 attention

* removed MoE code

* removed unused option

* Fixed Typo

Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>

* fixed tied word embeddings to use the correct embedding weights instead of the opposite ones

---------

Co-authored-by: Max <naturale@hufs.ac.kr>
Co-authored-by: Laurent Mazare <laurent.mazare@gmail.com>
2025-05-02 06:05:53 +02:00
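Several of the bullets above swap hand-rolled layers for stock candle_nn building blocks. As a rough illustration of what that swap looks like (the struct and field names here are illustrative, not the PR's actual code), candle_nn's linear_b takes an explicit bias flag, and rms_norm builds the generic RmsNorm the commit mentions:

use candle_core::Result;
use candle_nn::{linear_b, rms_norm, Linear, RmsNorm, VarBuilder};

// Illustrative Qwen3-style attention pieces. `linear_b` covers both biased
// and bias-free projections with one constructor; `rms_norm` is candle's
// generic RmsNorm rather than a model-specific reimplementation.
struct Attention {
    q_proj: Linear,
    q_norm: RmsNorm,
}

impl Attention {
    fn new(hidden: usize, head_dim: usize, eps: f64, vb: VarBuilder) -> Result<Self> {
        Ok(Self {
            // Qwen3 attention projections carry no bias, hence `false`.
            q_proj: linear_b(hidden, hidden, false, vb.pp("q_proj"))?,
            // Per-head RMS norm over the head dimension.
            q_norm: rms_norm(head_dim, eps, vb.pp("q_norm"))?,
        })
    }
}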

candle-qwen: large language model series from Alibaba Cloud

Qwen 1.5 is a series of large language models that provide strong performance on both English and Chinese. The example also supports the newer Qwen3 variants added in the commit above.

Running the example

$ cargo run --example qwen --release -- --prompt "Hello there "
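
Sampling can be tuned with the usual candle example flags; the flag names below follow candle's common CLI conventions, so check the example's --help output for the exact set:

$ cargo run --example qwen --release -- --prompt "Hello there " --temperature 0.8 --sample-len 150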

Various model sizes are available via the --model argument, including the MoE variant.

$ cargo run --example qwen --release -- --model moe-a2.7b --prompt 'def print_prime(n: int): '
def print_prime(n: int):  # n is the number of primes to be printed
    for i in range(2, n + 1):
        if all(i % j != 0 for j in range(2, i)):
            print(i)
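
The CLI is the easiest entry point, but the same models can be driven programmatically. Below is a minimal sketch of that flow, assuming the qwen2 module from candle_transformers that this example builds on; the model id, file names, and token ids are illustrative placeholders, and anyhow is assumed for error handling:

use candle_core::{DType, Device, Tensor};
use candle_nn::VarBuilder;
use candle_transformers::models::qwen2::{Config, ModelForCausalLM};
use hf_hub::api::sync::Api;

fn main() -> anyhow::Result<()> {
    // Fetch config and weights from the Hugging Face hub (cached locally
    // after the first download). "Qwen/Qwen1.5-0.5B" is an illustrative id.
    let repo = Api::new()?.model("Qwen/Qwen1.5-0.5B".to_string());
    let config: Config = serde_json::from_slice(&std::fs::read(repo.get("config.json")?)?)?;
    let device = Device::Cpu;
    // Memory-map the safetensors weights; `unsafe` because the mapped file
    // must not change underneath us.
    let vb = unsafe {
        VarBuilder::from_mmaped_safetensors(&[repo.get("model.safetensors")?], DType::F32, &device)?
    };
    let mut model = ModelForCausalLM::new(&config, vb)?;
    // One forward pass over a couple of placeholder token ids; offset 0
    // means the KV cache starts empty.
    let input = Tensor::new(&[[9707u32, 1052]], &device)?;
    let logits = model.forward(&input, 0)?;
    println!("logits shape: {:?}", logits.shape());
    Ok(())
}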