Commit Graph

1637 Commits

Author SHA1 Message Date
77197379cc More cleanup. 2023-12-15 11:17:05 +01:00
916a8c5464 Revert candle-transformers. 2023-12-15 11:15:21 +01:00
243e83f2b9 Adding a bunch of docs !
Co-authored-by: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-12-15 11:03:05 +01:00
cf27868b57 More cleanup. 2023-12-15 01:44:22 +01:00
40c3e1bd5a cleanup. 2023-12-15 01:41:14 +01:00
ece4c69a68 Fixing softmax. 2023-12-15 01:35:08 +01:00
4eeaf205d6 Fix softmax for long sequences (missing barrier). 2023-12-14 19:37:03 +01:00
f419a38e1a Fix use resource. 2023-12-14 16:52:37 +01:00
361f2ad2af Working with merging encoders and using fences. 2023-12-14 16:05:33 +01:00
e60f9b5dfc Speedup ShardedSafeTensors to load Tensors with default hints (#1384)
* Speedup ShardedSafeTensors to load Tensors with default hints

* Tweaks.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-14 08:08:56 -06:00
7be982f6f7 Mention phi-2 in the readme. (#1434) 2023-12-14 08:02:27 -06:00
104e196d46 Phi 2 wasm (#1432)
* add phi 2.0 quantized model wasm

* cols

* spell

* bug
2023-12-14 06:04:17 -06:00
5e33c85c8f Quantized version for phi-v2. (#1430)
* Quantized version for phi-v2.

* More quantized support.
2023-12-13 21:16:34 -06:00
2b3a018be7 Support for phi-2. (#1429)
* Support for phi-2.

* Use the v2 naming scheme.
2023-12-13 20:59:29 -06:00
931432ed55 Fixing tests + matmul from MFA 2023-12-13 16:58:36 +01:00
0404a3eb5b Removed MPSMatrix entirely (buggy). 2023-12-13 16:21:48 +01:00
a9d0657432 Better version ? 2023-12-13 12:09:20 +01:00
4cb443d00a Fix the logsumexp test. (#1426) 2023-12-12 10:56:11 -06:00
87dc559817 Lots of updates including some stack of command buffers. 2023-12-12 17:41:56 +01:00
77252ffb82 Add logsumexp function (#1424) 2023-12-12 10:32:17 -06:00
18eb87f25f Upsample grad (#1420)
* encode size of upsample in enum

* working convolution method for limited 2d kernels

* add test for sf 3 interpolation

* add higher dimensional tests, fix to work with multichannel input

* Remove commented out line.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-10 08:43:24 +01:00
da0af3cb3e Merge pull request #1408 from jbochi/metal_gelu2
Fix NaN errors for Gelu in Metal
2023-12-09 19:46:36 +01:00
9bd94c1ffa Speed up bert with approx gelu (#1410) 2023-12-06 17:46:37 +01:00
803ac8405b Put back affine strided tests
Co-Authored-By: Ivar Flakstad <69173633+ivarflakstad@users.noreply.github.com>
2023-12-06 17:04:15 +01:00
6e25822d4f Fix gelu for large x 2023-12-06 09:59:44 -05:00
236b820e28 Another prelu bugfix. (#1407) 2023-12-06 09:54:41 +01:00
2648e797c2 Use the proper broadcasting for prelu. (#1406) 2023-12-05 07:09:31 +01:00
b5c283e86f Add the prelu layer. (#1402) 2023-12-03 16:06:09 +00:00
8418154ee0 Add nvcc ccbin support to examples (#1401) 2023-12-03 16:01:16 +00:00
99b7273b03 Add compute cap env support to examples (#1400) 2023-12-03 16:00:24 +00:00
16161145ae Add the leo models to the quantized examples. (#1398) 2023-12-03 12:30:41 +00:00
0738df5290 Add more mentions to SDXL Turbo in the readme. (#1397) 2023-12-03 10:41:21 +00:00
37bf1ed012 Stable Diffusion Turbo Support (#1395)
* Add support for SD Turbo

* Set Leading as default in euler_ancestral discrete

* Use the appropriate default values for n_steps and guidance_scale.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>
2023-12-03 08:37:10 +01:00
dd40edfe73 Add Euler Ancestral Discrete Scheduler (#1390)
* Add Euler Ancestral Discrete Scheduler

* Fix a bug of init_noise_sigma generation

* minor fixes

* use partition_point instead of custom bsearch

* Fix some clippy lints.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-12-02 19:59:23 +00:00
5aa1a65dab Add quantized Starling, fix open-chat prompt (#1393)
* Add quantized Starling, fix open-chat prompt

* Fix open-chat and starling prompts
2023-12-02 16:47:19 +00:00
2ca086939f Put back affine strided tests 2023-11-30 11:40:39 +01:00
4349ff1fc2 Starting to fix some tests.
Few fixes.

Going back on remote metal-rs.

Reusing a single buffer (for now) to speed things up.

Adding some half kernels.

All tests are panicking instead of random failure.

Putting back f16 index select.

Add erf.

Working version for llama2-c.

Fixes + cache compute_pipeline_state.

BF16 metal fix.

Remove some prints.

new_owned -> new()..to_owned().

Better batched matmul.

Metal operational.

Reuse buffers on our own reference counts.

Tmp gemm.

Revert "Tmp gemm."

This reverts commit c65f68e988.

Interleave committing.

Speeding up copies using blit.

Fmt.

Fmt.

Remove the assert!

Fmt all.

Fixes after big rebase.

Add softmax for half and bfloat + tests

Fixing Llama example + accumulate softmax in float.
2023-11-30 11:30:31 +01:00
7c3cfd1086 Use the llama weight names for the Yi example. (#1381) 2023-11-27 20:42:52 +00:00
e2eb6590ed Merge pull request #1323 from huggingface/metal3
Adding the test scaffolding.
2023-11-27 13:06:01 +01:00
481c45d78d Add a basic implementation for slice-assign. (#1377) 2023-11-26 17:31:22 +00:00
14a2bdc062 Small tweak: remove the macro usage for the range indexing trait. (#1376) 2023-11-26 16:30:59 +00:00
bfa7c8fc01 Implement the module trait directly for QMatMul. (#1372) 2023-11-25 10:09:45 +00:00
762e996ce6 Distibert (#1366)
* add bce with logit loss

* add bce with logit loss

* remove imports

* fix tiny bug

* add test documentation and refactor function

* fix test cases and formatting

* distilbet files

* Apply various cleanups.

* More cleanups.

* More polish.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-24 15:09:14 +00:00
ca19a9af62 Fix linspace implementation (#1358)
* Fix linspace implementation

`steps` should be strictly greater than 1 to make it consistent with the context.

* Handle steps == 0 and steps == 1.

* Fix rustfmt.

---------

Co-authored-by: laurent <laurent.mazare@gmail.com>
2023-11-23 07:35:13 +00:00
ec23427d60 Ensure to copy data to cpu before iterating. (#1360) 2023-11-23 07:24:25 +00:00
f83e14f68d Add candle-lora transformers to readme? (#1356)
* Demonstrate lora transformers in readme

* Shorten readme
2023-11-21 17:54:24 +00:00
c7e613ab5e Update the readme. (#1354) 2023-11-21 09:38:27 +00:00
8f63f68289 Fix the kalosm link (#1353) 2023-11-21 06:18:14 +01:00
1edc3ddf24 Allowing feature metal to compile. 2023-11-20 20:17:16 +01:00
b380657bfe Merge pull request #1309 from huggingface/metal2
Adding the actual backend
2023-11-20 17:24:01 +01:00