e21c686cdc
Fixes for clippy 1.72. ( #587 )
2023-08-24 17:46:17 +01:00
c265ac50fa
Add a function to write gguf files. ( #585 )
...
* Add a function to write gguf files.
* More GGUF file writing.
* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
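The GGUF writing work in #585 above targets the GGUF container format. As a hedged, std-only sketch (not candle's actual writer, which also serializes metadata key/values and tensor descriptors), the fixed header at the start of every GGUF file can be emitted like this; the field layout follows the public GGUF spec, little-endian throughout:

```rust
use std::io::Write;

// Minimal GGUF header writer sketch: magic bytes, format version,
// tensor count, and metadata key/value count, all little-endian.
fn write_gguf_header(
    out: &mut impl Write,
    version: u32,
    tensor_count: u64,
    metadata_kv_count: u64,
) -> std::io::Result<()> {
    out.write_all(b"GGUF")?; // magic: the u32 0x46554747 read LE
    out.write_all(&version.to_le_bytes())?;
    out.write_all(&tensor_count.to_le_bytes())?;
    out.write_all(&metadata_kv_count.to_le_bytes())?;
    Ok(())
}

fn main() {
    let mut buf: Vec<u8> = Vec::new();
    write_gguf_header(&mut buf, 2, 1, 0).unwrap();
    assert_eq!(&buf[..4], b"GGUF");
    assert_eq!(buf.len(), 4 + 4 + 8 + 8); // 24 fixed header bytes
    println!("{} header bytes", buf.len());
}
```

After this header a real writer emits the metadata entries and tensor infos, then the aligned tensor data.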
a87c6f7652
Merge pull request #561 from patrickvonplaten/add_installation
...
Improve installation section and "get started"
2023-08-24 16:25:52 +02:00
afd965f77c
More non square testing ( #582 )
...
* Add more non square testing.
* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086
Reference implementations of `q2k` and `q3k` vec-dot functions ( #580 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
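The vec-dot kernels above share a common structure: a dot product over quantized blocks reduces to an integer dot product scaled by the two block scales. A minimal sketch of that pattern, with an invented `Block` type standing in for the real packed q2k/q3k layouts (which use 2/3-bit quants and per-sub-block scales):

```rust
// Illustrative block vec-dot: each block stores one scale and 8-bit
// quants; the dot product is the integer dot scaled by both scales.
struct Block {
    scale: f32,
    quants: [i8; 32],
}

fn vec_dot(a: &[Block], b: &[Block]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let int_dot: i32 = x
                .quants
                .iter()
                .zip(y.quants.iter())
                .map(|(&p, &q)| p as i32 * q as i32)
                .sum();
            x.scale * y.scale * int_dot as f32
        })
        .sum()
}

fn main() {
    let a = Block { scale: 0.5, quants: [2; 32] };
    let b = Block { scale: 0.25, quants: [3; 32] };
    // Integer dot is 32 * (2 * 3) = 192, scaled by 0.5 * 0.25.
    assert_eq!(vec_dot(&[a], &[b]), 24.0);
    println!("ok");
}
```

Keeping the accumulation in integers and applying the scales once per block is what makes these kernels cheap.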
ca318a6ec7
Add to the cuda example a reproduction of the issue. ( #579 )
...
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899
Add a test for conv2d with padding + bugfix the random number generation on cuda. ( #578 )
...
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
79916c2edb
Use the hub weights for efficientnet. ( #573 )
2023-08-23 18:20:21 +01:00
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
eedd85ffa7
Move the imagenet specific bits to a separate file. ( #571 )
2023-08-23 16:42:09 +01:00
7478dda255
Cosmetic tweaks. ( #570 )
2023-08-23 15:45:40 +01:00
329f661d9b
Trace softmax ( #568 )
...
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
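The max/sum decomposition this trace exposes is the standard numerically stable softmax; a minimal plain-Rust sketch of it (not the candle op itself), where the max/sum vector reductions are exactly the building blocks the commit mentions:

```rust
// Numerically stable softmax: subtract the max before exponentiating
// (so exp never overflows), then normalize by the sum.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    let probs = softmax(&[1.0, 2.0, 3.0]);
    let total: f32 = probs.iter().sum();
    assert!((total - 1.0).abs() < 1e-6); // probabilities sum to 1
    assert!(probs[2] > probs[1] && probs[1] > probs[0]);
    println!("{probs:?}");
}
```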
075b505480
Mirror GGML's unit tests ( #569 )
...
* Add ggml unit tests
* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797
Add some group parameter to convolutions. ( #566 )
...
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
2023-08-23 12:58:55 +01:00
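To illustrate what the new `groups` parameter does (candle exposes it on `Tensor::conv2d`), here is a hedged, naive CPU sketch with stride 1 and no padding; the function name and flat layouts are invented for the example:

```rust
// Naive grouped 2D convolution: input channels are split into
// `groups` chunks, and each chunk of output channels only sees its
// own chunk of input channels. groups == 1 is a regular conv;
// groups == c_in (with c_out == c_in) is a depthwise conv.
// Shapes: input [c_in, h, w], kernel [c_out, c_in/groups, kh, kw].
fn conv2d_grouped(
    input: &[f32], (c_in, h, w): (usize, usize, usize),
    kernel: &[f32], (c_out, kh, kw): (usize, usize, usize),
    groups: usize,
) -> Vec<f32> {
    let (ho, wo) = (h - kh + 1, w - kw + 1);
    let (cig, cog) = (c_in / groups, c_out / groups);
    let mut out = vec![0f32; c_out * ho * wo];
    for co in 0..c_out {
        let g = co / cog; // group this output channel belongs to
        for y in 0..ho {
            for x in 0..wo {
                let mut acc = 0f32;
                for ci in 0..cig {
                    for ky in 0..kh {
                        for kx in 0..kw {
                            let iv = input[(g * cig + ci) * h * w + (y + ky) * w + (x + kx)];
                            let kv = kernel[co * cig * kh * kw + ci * kh * kw + ky * kw + kx];
                            acc += iv * kv;
                        }
                    }
                }
                out[co * ho * wo + y * wo + x] = acc;
            }
        }
    }
    out
}

fn main() {
    // 2 in-channels, 2 out-channels, groups = 2, 1x1 kernels:
    // out channel 0 = 2 * in channel 0, out channel 1 = 3 * in channel 1.
    let out = conv2d_grouped(&[1.0f32; 8], (2, 2, 2), &[2.0f32, 3.0], (2, 1, 1), 2);
    assert_eq!(out, vec![2.0, 2.0, 2.0, 2.0, 3.0, 3.0, 3.0, 3.0]);
    println!("{out:?}");
}
```

Grouped convolutions are what EfficientNet's depthwise blocks (added just below) rely on.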
1f58bdbb1d
Apply suggestions from code review
2023-08-23 13:33:45 +02:00
c98d3cfd8b
Update candle-book/src/guide/installation.md
2023-08-23 13:31:54 +02:00
c5e43ad0ab
Apply suggestions from code review
2023-08-23 13:27:29 +02:00
2c280007e8
Apply suggestions from code review
2023-08-23 13:26:21 +02:00
4ee1cf038a
Get the rms epsilon from GGUF. ( #565 )
2023-08-23 11:40:20 +01:00
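The epsilon read from GGUF metadata feeds the llama-style RMSNorm; a minimal plain-Rust sketch of that normalization (function name invented for illustration), where `eps` guards the division when the vector is near zero:

```rust
// RMSNorm: divide each element by the root-mean-square of the
// vector, with a small eps (read from model metadata in practice).
fn rms_norm(xs: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = xs.iter().map(|&x| x * x).sum::<f32>() / xs.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    xs.iter().map(|&x| x * inv_rms).collect()
}

fn main() {
    let out = rms_norm(&[2.0, 2.0, 2.0, 2.0], 1e-5);
    // rms is ~2, so every element normalizes to ~1.
    assert!(out.iter().all(|&v| (v - 1.0).abs() < 1e-3));
    println!("{out:?}");
}
```

Using the wrong eps (e.g. a default instead of the model's own value) subtly degrades generation quality, which is why it is worth reading from the file.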
0f4ff8a739
Fix the quantized example. ( #564 )
2023-08-23 11:09:55 +01:00
89a00b56cc
add chat models in quantized example ( #551 )
...
* add chat models in quantized example
* cargo fmt
2023-08-23 11:05:33 +01:00
9a5c7db91a
Add support for i64 ( #563 )
...
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
649202024c
fix code snippets
2023-08-23 09:05:07 +00:00
283f6c048d
fix code snippets
2023-08-23 09:04:36 +00:00
c8211fc474
fix code snippets
2023-08-23 09:04:08 +00:00
7732bf6238
correct
2023-08-23 08:54:48 +00:00
7c0ca80d3a
move installation to book
2023-08-23 08:52:53 +00:00
b558d08b85
improve
2023-08-23 08:42:47 +00:00
34cb9f924f
improve
2023-08-23 08:40:23 +00:00
d4968295a0
improve
2023-08-23 08:37:08 +00:00
65e146c72d
Add installation section
2023-08-23 08:32:59 +00:00
3743bed2d7
Fix the ?
operator cannot be applied to type Device
of example ( #560 )
...
According to the API:
```rust
inp = inp.to_device(&Device::Cuda(0)?)?;
```
cannot work, as the `Cuda(...)` variant wraps a `CudaDevice`, not an integer.
I'd recommend using `new_cuda(...)` instead.
2023-08-23 09:29:50 +01:00
508d34daf2
GGUF support in the quantized model. ( #559 )
...
* GGUF support in the quantized model.
* Get the GGUF support to work on llama.
2023-08-23 09:20:57 +01:00
0764741cc4
Handle GGUF files in tensor-tools. ( #558 )
2023-08-23 06:32:07 +01:00
6a30ecefad
Preliminary GGUF support. ( #557 )
...
* Preliminary GGUF support.
* Tensor reading.
2023-08-23 00:14:10 +01:00
7687a0f453
Also fix the aspect ratio in the wasm example. ( #556 )
...
* Also fix the aspect ratio in the wasm example.
* Add the yolo lib.
* Update the build script.
2023-08-22 22:20:08 +01:00
f9ecc84477
GQA support in the quantized model. ( #555 )
...
* GQA support in the quantized model.
* Fix the reshaping.
* Fix the main llama model.
* Infer the proper gqa from the model kind.
2023-08-22 19:41:10 +01:00
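GQA lets `n_heads` query heads share a smaller set of `n_kv_heads` key/value heads, which is what "infer the proper gqa from the model kind" configures. A hedged helper showing the query-head to kv-head mapping (helper name invented; in the model the kv tensors are repeated or indexed accordingly):

```rust
// Grouped-query attention: each kv head serves a contiguous run of
// n_heads / n_kv_heads query heads.
fn kv_head_for(q_head: usize, n_heads: usize, n_kv_heads: usize) -> usize {
    assert!(n_heads % n_kv_heads == 0);
    q_head / (n_heads / n_kv_heads)
}

fn main() {
    // llama-2-70b style: 64 query heads, 8 kv heads -> groups of 8.
    assert_eq!(kv_head_for(0, 64, 8), 0);
    assert_eq!(kv_head_for(7, 64, 8), 0);
    assert_eq!(kv_head_for(8, 64, 8), 1);
    assert_eq!(kv_head_for(63, 64, 8), 7);
    println!("ok");
}
```

With `n_kv_heads == n_heads` this degenerates to standard multi-head attention, which is why the quantized llama model can handle both through one code path.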
07067b01dc
Avoid some mutable variables (take 2). ( #554 )
...
* Avoid some mutable variables (take 2).
* Fix.
2023-08-22 18:51:20 +01:00
cc22d4db20
Put the transcribe token before the language one. ( #553 )
2023-08-22 16:46:34 +01:00
ec665acad7
Revert "Avoid some mut in quantized functions. ( #550 )" ( #552 )
...
This reverts commit cf27b9b636.
2023-08-22 15:57:46 +01:00
cf27b9b636
Avoid some mut in quantized functions. ( #550 )
...
* Avoid a couple more 'let mut'.
* Tweaks.
2023-08-22 15:44:26 +01:00
352383cbc3
Add quantization support for q2k
, q3k
, q4k
and q5k
( #524 )
...
* first q2 implementation
* First Q4K and Q5K implementations
* fix `q2k` and `q5k`
* Some first cleanups
* run `clippy` on tests
* finally implement `q3k`
* deactivate `q3k` test on macos
* also disable the test on linux
* Fix floating bits in `q3k` dequantization
* Refactoring pass + reorder quants in file
* `fmt`
* Re-add `src` asserts and redefine `dst`
2023-08-22 15:04:55 +01:00
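The k-quants above pack 2-5 bit values with hierarchical scales into 256-element super-blocks; the sketch below shows only the general quantize/dequantize roundtrip with a simple symmetric 8-bit scheme, not the real layouts:

```rust
// Symmetric per-block quantization sketch: one f32 scale plus 8-bit
// quants per block; dequantization is quant * scale.
fn quantize(block: &[f32]) -> (f32, Vec<i8>) {
    let amax = block.iter().fold(0f32, |m, &x| m.max(x.abs()));
    let scale = if amax == 0.0 { 1.0 } else { amax / 127.0 };
    let quants = block.iter().map(|&x| (x / scale).round() as i8).collect();
    (scale, quants)
}

fn dequantize(scale: f32, quants: &[i8]) -> Vec<f32> {
    quants.iter().map(|&q| q as f32 * scale).collect()
}

fn main() {
    let block = [0.5f32, -1.0, 0.25, 1.0];
    let (scale, quants) = quantize(&block);
    let back = dequantize(scale, &quants);
    // Rounding error is bounded by half a quantization step.
    for (a, b) in block.iter().zip(&back) {
        assert!((a - b).abs() <= scale / 2.0 + 1e-6);
    }
    println!("scale={scale}, quants={quants:?}");
}
```

The "floating bits" bugfix in the list above is typical of this territory: a wrong mask or shift when unpacking sub-2^8 quants corrupts only some values, so roundtrip tests like the one here are the main safety net.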
9bc811a247
Improve the aspect ratio handling on yolo-v8. ( #549 )
...
* Fix the aspect ratio handling in yolo-v8.
* Typo.
2023-08-22 14:55:33 +01:00
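The aspect-ratio fix boils down to scaling the image so its longer side matches the model input size while keeping width and height proportional, then padding the rest. A hedged sketch (function name invented; real yolo-v8 preprocessing typically also rounds dimensions to a multiple of the model stride):

```rust
// Aspect-ratio-preserving fit into a square model input: scale so
// the longer side equals `target`; the remainder gets padded.
fn fit_dims(w: usize, h: usize, target: usize) -> (usize, usize) {
    if w >= h {
        (target, (h * target) / w)
    } else {
        ((w * target) / h, target)
    }
}

fn main() {
    // A 1280x720 frame scaled into a 640x640 input keeps its ratio.
    assert_eq!(fit_dims(1280, 720, 640), (640, 360));
    assert_eq!(fit_dims(720, 1280, 640), (360, 640));
    assert_eq!(fit_dims(640, 640, 640), (640, 640));
    println!("ok");
}
```

Stretching to the square directly (the bug being fixed) distorts objects and shifts predicted boxes, which is why both the native and the wasm example needed the same correction.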
bb69d89e28
Move the yolo shared bits to a common place. ( #548 )
...
* Move the yolo shared bits to a common place.
* Share more code.
* Configurable thresholds.
2023-08-22 13:03:07 +01:00
20ce3e9f39
Sketch the yolo wasm example. ( #546 )
...
* Sketch the yolo wasm example.
* Web ui.
* Get the web ui to work.
* UI tweaks.
* More UI tweaks.
* Use the natural width/height.
* Add a link to the hf space in the readme.
2023-08-22 11:56:43 +01:00
44420d8ae1
Add some llama-v2 variants. ( #545 )
2023-08-22 08:35:15 +01:00
f16bb97401
Use the yolo-v8 weights from the hub. ( #544 )
...
* Use the weights from the hub.
* Add to the readme.
2023-08-21 22:07:36 +01:00
3507e14c0c
Yolo v8 fixes ( #542 )
...
* Fixes for the yolo-v8 layout.
* Bugfixes.
* Another silly bugfix.
* Remove the hf-hub dependency.
* Remove the transformers dependency.
2023-08-21 21:05:40 +01:00
de50e66af1
Add yolo v8 as an example ( #541 )
...
* Sketching yolo-v8.
* Get the model to load.
* yolo-v8 forward pass.
* Complete(?) the forward pass.
* Fix some shape issues.
* Add the missing padding.
* Process the predictions.
2023-08-21 18:40:09 +01:00
cc2d6cf2e0
Improve the timestamps support in whisper ( #539 )
...
* Timestamp support for whisper.
* Properly display the timestamps.
* Bugfix for the timestamp units.
2023-08-21 12:26:59 +01:00
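Whisper's timestamp tokens encode time as multiples of 20 ms past the timestamp-token base id, so mapping a token offset back to seconds (the "timestamp units" bugfix above concerns exactly this scaling) is a single multiply; the offsets below are illustrative, not real token ids:

```rust
// Whisper timestamp tokens step in 20 ms increments, so
// seconds = offset * 0.02.
fn timestamp_secs(token_offset: u32) -> f64 {
    token_offset as f64 * 0.02
}

fn main() {
    assert_eq!(timestamp_secs(0), 0.0);
    // Whisper processes audio in 30 s windows, the max timestamp.
    assert!((timestamp_secs(1500) - 30.0).abs() < 1e-9);
    println!("ok");
}
```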