708e422456
Qwen MoE model. (#1960)
* Qwen MoE model.
* Add the MoE model to the example.
* Fix the scaling.
* Readme updates.
* Readme tweaks.
2024-03-28 23:10:57 +01:00
74497e6bf7
Fixing the qwen tokenizer location. (#1693)
Using the chatglm tokenizer causes a bug where the "<|endoftext|>" token is not found.
2024-02-11 08:52:36 +01:00
1c8d61f051
ChatGLM custom tokenizer. (#1687)
2024-02-10 10:47:04 +01:00
40ce16001b
Use the proper endoftext token for qwen. (#1685)
2024-02-09 17:02:03 +01:00
5657e596cd
Add the Qwen2 model. (#1684)
* Initial check-in for the qwen2 model.
* More qwen2 inference.
* Polish the qwen example.
* Fix the rope basis.
* Get the inference to work.
* Support different model sizes.
2024-02-09 15:02:49 +01:00