9a62c91643
Proper support for phi-4 ( #2960 )
...
* Add phi-4 support.
* Long-rope support.
* Get clippy to be happy.
2025-05-21 10:18:33 +02:00
add3a714aa
phi-4-mini ( #2790 )
2025-03-01 10:07:29 +01:00
957d604a78
Enable BF16 on metal. ( #2380 )
2024-08-01 11:05:07 +02:00
74e9e41911
Make up for the missing last-token output of the phi2 example ( #2299 )
2024-06-29 21:34:42 +02:00
77ea479a18
Add Phi-3 Medium ( #2205 )
2024-05-23 13:33:17 +02:00
3b429f3023
Make the dtype configurable for phi. ( #2133 )
2024-04-27 21:32:49 +02:00
cfab6e7616
Mention phi-v3 in the readmes. ( #2122 )
2024-04-24 20:54:24 +02:00
11d4a3c588
Add the phi-3 model. ( #2120 )
...
* Add the phi-3 model.
* Faster rope.
* Bugfix.
* Fix the detokenization.
2024-04-24 09:48:13 +02:00
403680f17d
Quantized GGUF style ( #1523 )
...
* Metal quantized modifications proposal.
- Add a device param, wherever needed.
- Create new QMetal storage thing that implements QuantizedType.
- Update everywhere needed.
Fix Python.
Fixing examples.
Fix: fmt + clippy + stub.
Moving everything around.
Only missing the actual implems.
Fixing everything + adding dequantized kernels.
More work.
Fixing matmul.
Fmt + Clippy
Some clippy fixes.
Working state.
Q2K Metal -> Bugged (also present in GGML).
Q4K CPU -> Bugged (present previously, new test catches it).
Q5K CPU -> Bugged (present previously).
Q8_1 Both -> Never really implemented, it seems.
Q8K Metal -> Never implemented in Metal.
Fixing Q2K bug (present in GGML).
* Cleanup.
* Fix the rebase.
* Removing the fences speeds everything up and *is* correct this time...
* Cleanup the fence.
* After rebase.
* Bad code removal.
* Rebase after phi2 merge + fix replit default to CPU.
* Making the CI happy.
* More happy tests.
---------
Co-authored-by: Nicolas Patry <nicolas@Nicolass-MacBook-Pro.local>
2024-01-17 10:27:58 +01:00
ea36f3b11f
Use the new phi model by default. ( #1589 )
2024-01-15 12:30:27 +01:00
539ead927a
Update the Phi model to use the updated architecture. ( #1580 )
...
* Update the Phi model to use the updated architecture.
* Add more of the phi model.
* Repeat KV + caching.
* Apply the rotary embeddings.
* Add support for the new phi model in the phi example.
* Fix a couple glitches.
* Fix a couple more glitches.
2024-01-13 17:38:27 +01:00
6242276c09
Pin the revision used for phi-v2 + make it the default. ( #1572 )
...
* Pin the revision used for phi-v2 + make it the default.
* Tweak the custom-ops build.
2024-01-12 09:19:30 +01:00
37c539f2b7
Helper function to load sharded safetensors files ( #1481 )
...
* Fix the quantized mistral example.
* Add a helper function to load sharded safetensors weights.
* Use the sharded loader.
2023-12-25 21:49:21 +01:00
5b35fd0fcf
MMLU evaluation for Phi. ( #1474 )
...
* MMLU evaluation for Phi.
* Improve the evaluation.
2023-12-23 15:28:36 +01:00
c4cfcf1539
Tweak the readme for phi and the default sample length. ( #1450 )
2023-12-16 18:11:36 -06:00
79eab519fd
Fix phi example ( #1436 )
...
* Fix phi example
* Remove the cuda mention.
---------
Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-15 07:01:10 -06:00
5e33c85c8f
Quantized version for phi-v2. ( #1430 )
...
* Quantized version for phi-v2.
* More quantized support.
2023-12-13 21:16:34 -06:00
2b3a018be7
Support for phi-2. ( #1429 )
...
* Support for phi-2.
* Use the v2 naming scheme.
2023-12-13 20:59:29 -06:00
70d06ab4b0
Add support for the phi-hermes finetuned model. ( #1192 )
2023-10-27 05:57:08 +01:00
00948eb656
Formatting tweak. ( #1111 )
2023-10-16 21:02:53 +01:00
af67672207
Add support for Puffin-Phi-v2. ( #1110 )
...
* Add support for Puffin-Phi-v2.
* Tweak the file name.
* Support the config for puffin-phi-v2.
* Update the readme.
2023-10-16 20:54:21 +01:00
588ad4835a
Fix the verbose prompt for phi. ( #1097 )
2023-10-15 10:53:25 +01:00
b73c35cc57
Improve the reshape error messages. ( #1096 )
...
* Improve the reshape error messages.
* Add the verbose-prompt flag to the phi example.
2023-10-15 10:43:10 +01:00
8921d5027c
Add support for phi-1.0 ( #1093 )
...
* Add support for phi-1.0
* Update the readme.
2023-10-14 20:15:43 +01:00
87e3a4e175
Mistral: exit on eos token. ( #1001 )
...
* Mistral: exit on eos token.
* Print the proper stats.
* Also add a short flag.
2023-09-30 07:07:06 +01:00
2dd43d6cdd
add eos token to phi example ( #965 )
...
* add eos token to phi example
* rustfmt + get the token directly.
---------
Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-09-26 09:21:22 +01:00
c78a294323
Add some repeat penalty to the phi example. ( #961 )
2023-09-25 20:53:30 +01:00
1ce7fe2543
Add more examples to the phi readme. ( #956 )
2023-09-24 18:19:05 +01:00
f5069dd354
Use the repo for the quantized phi model. ( #954 )
2023-09-24 16:30:26 +01:00
0007ae9c11
Add the quantized mixformer model. ( #953 )
...
* Add the quantized mixformer model.
* Add the quantized option in the phi example.
2023-09-24 15:03:48 +01:00
bb3471ea31
Adapt more examples to the updated safetensor api. ( #947 )
...
* Simplify the safetensor usage.
* Convert more examples.
* Move more examples.
* Adapt stable-diffusion.
2023-09-23 21:26:03 +01:00
b54acfa3d0
Tracing for the phi model ( #936 )
...
* Add some tracing bits to mixformers.
* Add the missing file.
* Add the conv2d layer to with-tracing.
* Improve the tracing usage.
2023-09-23 09:19:34 +01:00
912a3d63b0
Use the proper block size for quantizing models. ( #933 )
...
* Use the proper block size for quantizing models.
* Use the proper dimension.
2023-09-22 21:36:56 +01:00
df6f5240ba
Complete the mixformer implementation. ( #930 )
...
* Complete the mixformers implementation.
* Tweak the attention.
* Add the phi-1.5 example.
* Improve the phi example.
* Bugfix.
* Get the phi example to work.
2023-09-22 20:03:16 +01:00