d728e646c2
Use resolver 2 explicitly. ( #597 )
2023-08-25 09:35:40 +01:00
c093b03d51
Generic implementation of vecdot for q80. ( #596 )
...
* Generic implementation of vecdot for q80.
* Add support for code-llama 7b.
* Support more code-llama.
2023-08-25 09:04:05 +01:00
d8ba0452dc
Fail on bf16. ( #594 )
2023-08-25 06:10:38 +01:00
189442a0fa
Add the pose estimation head for yolo. ( #589 )
...
* Add the pose estimation head for yolo.
* Properly handle the added position dimensions.
* Integrate the pose estimation head in the forward pass.
* Renaming.
* Fix for pose estimation.
2023-08-24 22:12:34 +01:00
2cde0cb74b
More pickle support. ( #588 )
...
* More pickle support.
* Be more verbose.
2023-08-24 18:45:10 +01:00
e21c686cdc
Fixes for clippy 1.72. ( #587 )
2023-08-24 17:46:17 +01:00
c265ac50fa
Add a function to write gguf files. ( #585 )
...
* Add a function to write gguf files.
* More GGUF file writing.
* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
a87c6f7652
Merge pull request #561 from patrickvonplaten/add_installation
...
Improve installation section and "get started"
2023-08-24 16:25:52 +02:00
afd965f77c
More non square testing ( #582 )
...
* Add more non square testing.
* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086
Reference implementations of `q2k` and `q3k` vec-dot functions ( #580 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
ca318a6ec7
Add to the cuda example a reproduction of the issue. ( #579 )
...
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899
Add a test for conv2d with padding + bugfix the random number generation on cuda. ( #578 )
...
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
79916c2edb
Use the hub weights for efficientnet. ( #573 )
2023-08-23 18:20:21 +01:00
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
eedd85ffa7
Move the imagenet specific bits to a separate file. ( #571 )
2023-08-23 16:42:09 +01:00
7478dda255
Cosmetic tweaks. ( #570 )
2023-08-23 15:45:40 +01:00
329f661d9b
Trace softmax ( #568 )
...
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
075b505480
Mirror GGML's unit tests ( #569 )
...
* Add ggml unit tests
* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797
Add some group parameter to convolutions. ( #566 )
...
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
2023-08-23 12:58:55 +01:00
1f58bdbb1d
Apply suggestions from code review
2023-08-23 13:33:45 +02:00
c98d3cfd8b
Update candle-book/src/guide/installation.md
2023-08-23 13:31:54 +02:00
c5e43ad0ab
Apply suggestions from code review
2023-08-23 13:27:29 +02:00
2c280007e8
Apply suggestions from code review
2023-08-23 13:26:21 +02:00
4ee1cf038a
Get the rms epsilon from GGUF. ( #565 )
2023-08-23 11:40:20 +01:00
0f4ff8a739
Fix the quantized example. ( #564 )
2023-08-23 11:09:55 +01:00
89a00b56cc
add chat models in quantized example ( #551 )
...
* add chat models in quantized example
* cargo fmt
2023-08-23 11:05:33 +01:00
9a5c7db91a
Add support for i64 ( #563 )
...
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
649202024c
fix code snippets
2023-08-23 09:05:07 +00:00
283f6c048d
fix code snippets
2023-08-23 09:04:36 +00:00
c8211fc474
fix code snippets
2023-08-23 09:04:08 +00:00
7732bf6238
correct
2023-08-23 08:54:48 +00:00
7c0ca80d3a
move installation to book
2023-08-23 08:52:53 +00:00
b558d08b85
improve
2023-08-23 08:42:47 +00:00
34cb9f924f
improve
2023-08-23 08:40:23 +00:00
d4968295a0
improve
2023-08-23 08:37:08 +00:00
65e146c72d
Add installation section
2023-08-23 08:32:59 +00:00
3743bed2d7
Fix the `?` operator cannot be applied to type `Device` of example ( #560 )
...
According to the API:
```rust
inp = inp.to_device(&Device::Cuda(0)?)?;
```
cannot work as `Cuda(...)` expects a type `Device` not an integer.
I'd recommend using `new_cuda(...)` instead.
2023-08-23 09:29:50 +01:00
508d34daf2
GGUF support in the quantized model. ( #559 )
...
* GGUF support in the quantized model.
* Get the GGUF support to work on llama.
2023-08-23 09:20:57 +01:00
0764741cc4
Handle GGUF files in tensor-tools. ( #558 )
2023-08-23 06:32:07 +01:00
6a30ecefad
Preliminary GGUF support. ( #557 )
...
* Preliminary GGUF support.
* Tensor reading.
2023-08-23 00:14:10 +01:00
7687a0f453
Also fix the aspect ratio in the wasm example. ( #556 )
...
* Also fix the aspect ratio in the wasm example.
* Add the yolo lib.
* Update the build script.
2023-08-22 22:20:08 +01:00
f9ecc84477
GQA support in the quantized model. ( #555 )
...
* GQA support in the quantized model.
* Fix the reshaping.
* Fix the main llama model.
* Infer the proper gqa from the model kind.
2023-08-22 19:41:10 +01:00
07067b01dc
Avoid some mutable variables (take 2). ( #554 )
...
* Avoid some mutable variables (take 2).
* Fix.
2023-08-22 18:51:20 +01:00
cc22d4db20
Put the transcribe token before the language one. ( #553 )
2023-08-22 16:46:34 +01:00
ec665acad7
Revert "Avoid some mut in quantized functions. ( #550 )" ( #552 )
...
This reverts commit cf27b9b636.
2023-08-22 15:57:46 +01:00
cf27b9b636
Avoid some mut in quantized functions. ( #550 )
...
* Avoid a couple more 'let mut'.
* Tweaks.
2023-08-22 15:44:26 +01:00
352383cbc3
Add quantization support for `q2k`, `q3k`, `q4k` and `q5k` ( #524 )
...
* first q2 implementation
* First Q4K and Q5K implementations
* fix `q2k` and `q5k`
* Some first cleanups
* run `clippy` on tests
* finally implement `q3k`
* deactivate `q3k` test on macos
* also disable the test on linux
* Fix floating bits in `q3k` dequantization
* Refactoring pass + reorder quants in file
* `fmt`
* Re-add `src` asserts and redefine `dst`
2023-08-22 15:04:55 +01:00
9bc811a247
Improve the aspect ratio handling on yolo-v8. ( #549 )
...
* Fix the aspect ratio handling in yolo-v8.
* Typo.
2023-08-22 14:55:33 +01:00
bb69d89e28
Move the yolo shared bits to a common place. ( #548 )
...
* Move the yolo shared bits to a common place.
* Share more code.
* Configurable thresholds.
2023-08-22 13:03:07 +01:00
20ce3e9f39
Sketch the yolo wasm example. ( #546 )
...
* Sketch the yolo wasm example.
* Web ui.
* Get the web ui to work.
* UI tweaks.
* More UI tweaks.
* Use the natural width/height.
* Add a link to the hf space in the readme.
2023-08-22 11:56:43 +01:00