Commit Graph

981 Commits

46eb225ba5 Add some missing entries to the changelog. (#606) 2023-08-25 18:01:38 +01:00
aa67e5107d Merge pull request #600 from huggingface/codellama_gpu_support
Adding support for codellama in examples.
2023-08-25 18:25:26 +02:00
c105550405 s/panic/bail/ 2023-08-25 18:05:07 +02:00
ca6c050b04 Cleanup the pose reporting code. (#605) 2023-08-25 16:49:21 +01:00
9c8d6dbc2a Neon intrinsics for the q8_0 vecdot. (#604)
* Neon intrinsics for the q8_0 vecdot.

* Get the tests to run with accelerate (with some numerical error failures).
2023-08-25 14:42:18 +01:00
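For context, a minimal sketch of what a NEON q8_0 vecdot can look like. Assumptions: a simplified block layout with an f32 scale (the real q8_0 format stores the scale as f16), and the intrinsics from `std::arch::aarch64`, so this only compiles on aarch64 targets.

```rust
use std::arch::aarch64::*;

// Simplified q8_0 block: 32 signed-byte quants plus a per-block scale.
// (Assumption: the real format stores the scale as an f16.)
#[repr(C)]
pub struct BlockQ8_0 {
    d: f32,
    qs: [i8; 32],
}

/// Dot product over q8_0 blocks using NEON intrinsics.
///
/// # Safety
/// Must only be called on aarch64 targets with NEON available.
pub unsafe fn vec_dot_q8_0_neon(xs: &[BlockQ8_0], ys: &[BlockQ8_0]) -> f32 {
    let mut sumf = 0f32;
    for (x, y) in xs.iter().zip(ys.iter()) {
        // Load the two 16-byte halves of each 32-quant block.
        let x0 = vld1q_s8(x.qs.as_ptr());
        let x1 = vld1q_s8(x.qs.as_ptr().add(16));
        let y0 = vld1q_s8(y.qs.as_ptr());
        let y1 = vld1q_s8(y.qs.as_ptr().add(16));
        // Widening multiplies: i8 * i8 -> i16 lanes.
        let p0 = vmull_s8(vget_low_s8(x0), vget_low_s8(y0));
        let p1 = vmull_s8(vget_high_s8(x0), vget_high_s8(y0));
        let p2 = vmull_s8(vget_low_s8(x1), vget_low_s8(y1));
        let p3 = vmull_s8(vget_high_s8(x1), vget_high_s8(y1));
        // Pairwise widen to i32 before the horizontal sum, to avoid
        // overflowing the i16 lanes.
        let s0 = vaddq_s32(vpaddlq_s16(p0), vpaddlq_s16(p1));
        let s1 = vaddq_s32(vpaddlq_s16(p2), vpaddlq_s16(p3));
        let sum = vaddvq_s32(vaddq_s32(s0, s1));
        sumf += x.d * y.d * sum as f32;
    }
    sumf
}
```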
0afbc435df Add some configurable legend for yolo detection. (#603)
* Add some configurable legend for yolo detection.

* Clippyness.
2023-08-25 13:50:31 +01:00
97909e5068 Move the yolo model bits in a separate file. (#602)
* Move the yolo model bits in a separate file.

* Improve the drawing.

* Bugfix.
2023-08-25 12:47:55 +01:00
8bc5fffa45 More support for pose estimation in yolo-v8. (#599)
* More support for pose estimation in yolo-v8.

* Support both object detection and pose-estimation in the yolo-v8 example.
2023-08-25 11:21:11 +01:00
4826a4212e Adding support for codellama in examples.
Codellama requires bf16 for now (converting from bf16 to f16 results in an error).
The multiprocess demo is not functional for it because flash-attn only supports
f16 for now.
2023-08-25 09:56:11 +00:00
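As an illustration of the dtype constraint above, a minimal candle sketch that keeps weights in bf16 instead of converting to f16. Assumptions: the `candle_core` crate name and a placeholder weight shape; `Device::cuda_if_available` falls back to CPU when no GPU is present.

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let device = Device::cuda_if_available(0)?;
    // Keep the weights in bf16: converting codellama weights from bf16
    // to f16 currently errors out.
    let w = Tensor::zeros((4096, 4096), DType::BF16, &device)?;
    assert_eq!(w.dtype(), DType::BF16);
    Ok(())
}
```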
afc10a3232 AVX version for the q8-0 multiplications. (#598) 2023-08-25 10:14:49 +01:00
d728e646c2 Use resolver 2 explicitly. (#597) 2023-08-25 09:35:40 +01:00
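For reference, opting in to the version 2 feature resolver is a one-line change at the workspace root. A sketch; the member list here is illustrative, not the actual workspace contents:

```toml
# Cargo.toml at the workspace root.
[workspace]
members = ["candle-core", "candle-examples"]
# Spell out resolver 2 rather than relying on the edition to imply it.
resolver = "2"
```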
c093b03d51 Generic implementation of vecdot for q80. (#596)
* Generic implementation of vecdot for q80.

* Add support for code-llama 7b.

* Support more code-llama.
2023-08-25 09:04:05 +01:00
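A generic (non-SIMD) q8_0 vecdot reduces to a per-block scaled integer dot product. A minimal sketch, again assuming a simplified block layout with an f32 scale in place of the real f16:

```rust
// Simplified q8_0 block: 32 signed-byte quants plus a per-block scale.
pub struct BlockQ8_0 {
    d: f32,
    qs: [i8; 32],
}

/// Portable fallback: accumulate each block's integer dot product in
/// i32, then scale by both block scales.
pub fn vec_dot_q8_0(xs: &[BlockQ8_0], ys: &[BlockQ8_0]) -> f32 {
    xs.iter()
        .zip(ys.iter())
        .map(|(x, y)| {
            let sum: i32 = x
                .qs
                .iter()
                .zip(y.qs.iter())
                .map(|(&a, &b)| a as i32 * b as i32)
                .sum();
            x.d * y.d * sum as f32
        })
        .sum()
}
```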
d8ba0452dc Fail on bf16. (#594) 2023-08-25 06:10:38 +01:00
189442a0fa Add the pose estimation head for yolo. (#589)
* Add the pose estimation head for yolo.

* Properly handle the added position dimensions.

* Integrate the pose estimation head in the forward pass.

* Renaming.

* Fix for pose estimation.
2023-08-24 22:12:34 +01:00
2cde0cb74b More pickle support. (#588)
* More pickle support.

* Be more verbose.
2023-08-24 18:45:10 +01:00
e21c686cdc Fixes for clippy 1.72. (#587) 2023-08-24 17:46:17 +01:00
c265ac50fa Add a function to write gguf files. (#585)
* Add a function to write gguf files.

* More GGUF file writing.

* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
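A rough sketch of the start of such a writer. Assumption: the GGUF v1 header layout, i.e. a 4-byte magic followed by little-endian u32 version, tensor count, and metadata key/value count; the real writer also serializes the metadata table, tensor infos, and aligned tensor data.

```rust
use std::io::Write;

// Write the fixed-size GGUF header that precedes the metadata table.
fn write_gguf_header<W: Write>(
    w: &mut W,
    tensor_count: u32,
    metadata_kv_count: u32,
) -> std::io::Result<()> {
    w.write_all(b"GGUF")?;
    w.write_all(&1u32.to_le_bytes())?; // format version
    w.write_all(&tensor_count.to_le_bytes())?;
    w.write_all(&metadata_kv_count.to_le_bytes())?;
    Ok(())
}
```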
a87c6f7652 Merge pull request #561 from patrickvonplaten/add_installation
Improve installation section and "get started"
2023-08-24 16:25:52 +02:00
afd965f77c More non square testing (#582)
* Add more non square testing.

* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086 Reference implementations of q2k and q3k vec-dot functions (#580)
* add `q2k` vec-dot

* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
ca318a6ec7 Add to the cuda example a reproduction of the issue. (#579)
* Add to the cuda example a reproduction of the issue.

* Tweak.

* Add a test using non-square matrices.

* Fix the conv2d kernel.

* Display the error.

* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899 Add a test for conv2d with padding + bugfix the random number generation on cuda. (#578)
* Add a test for conv2d with padding.

* Cosmetic changes.

* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
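A padding test along those lines, as a sketch. Assumption: candle's `Tensor::conv2d(kernel, padding, stride, dilation, groups)` signature; the shapes are illustrative.

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // NCHW input, OIHW kernel: padding 1 with a 3x3 kernel keeps the
    // 8x8 spatial size (stride 1, dilation 1, a single group).
    let t = Tensor::randn(0f32, 1f32, (1, 4, 8, 8), &dev)?;
    let k = Tensor::randn(0f32, 1f32, (2, 4, 3, 3), &dev)?;
    let res = t.conv2d(&k, 1, 1, 1, 1)?;
    assert_eq!(res.dims(), &[1, 2, 8, 8]);
    Ok(())
}
```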
79916c2edb Use the hub weights for efficientnet. (#573) 2023-08-23 18:20:21 +01:00
431051cc32 Add Efficientnet (#572)
* EfficientNet.

* Complete the efficientnet implementation.

* Improve group handling.

* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
eedd85ffa7 Move the imagenet specific bits to a separate file. (#571) 2023-08-23 16:42:09 +01:00
7478dda255 Cosmetic tweaks. (#570) 2023-08-23 15:45:40 +01:00
329f661d9b Trace softmax (#568)
* Trace the softmax op.

* Inline the sum.

* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
075b505480 Mirror GGML's unit tests (#569)
* Add ggml unit tests

* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797 Add some group parameter to convolutions. (#566)
* Add some group parameter to convolutions.

* Avoid some unnecessary groups checks.

* Move the tensor convolution bits.

* Proper handling of groups.

* Bump the crate version.

* And add a changelog.
2023-08-23 12:58:55 +01:00
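Groups split the channels into independent convolutions; with groups equal to the channel count this gives a depthwise convolution. A minimal sketch under the same assumed `conv2d` signature as above:

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    let dev = Device::Cpu;
    // Depthwise: 8 groups over 8 channels, one 3x3 filter per channel.
    let t = Tensor::randn(0f32, 1f32, (1, 8, 16, 16), &dev)?;
    let k = Tensor::randn(0f32, 1f32, (8, 1, 3, 3), &dev)?;
    let res = t.conv2d(&k, 1, 1, 1, 8)?;
    assert_eq!(res.dims(), &[1, 8, 16, 16]);
    Ok(())
}
```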
1f58bdbb1d Apply suggestions from code review 2023-08-23 13:33:45 +02:00
c98d3cfd8b Update candle-book/src/guide/installation.md 2023-08-23 13:31:54 +02:00
c5e43ad0ab Apply suggestions from code review 2023-08-23 13:27:29 +02:00
2c280007e8 Apply suggestions from code review 2023-08-23 13:26:21 +02:00
4ee1cf038a Get the rms epsilon from GGUF. (#565) 2023-08-23 11:40:20 +01:00
0f4ff8a739 Fix the quantized example. (#564) 2023-08-23 11:09:55 +01:00
89a00b56cc add chat models in quantized example (#551)
* add chat models in quantized example

* cargo fmt
2023-08-23 11:05:33 +01:00
9a5c7db91a Add support for i64 (#563)
* Add the i64 dtype.

* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
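A quick sketch of the new dtype in use; token ids and index tensors are the typical consumers:

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    // An i64 slice yields an i64 tensor directly.
    let ids = Tensor::new(&[0i64, 2, 5], &Device::Cpu)?;
    assert_eq!(ids.dtype(), DType::I64);
    Ok(())
}
```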
649202024c fix code snippets 2023-08-23 09:05:07 +00:00
283f6c048d fix code snippets 2023-08-23 09:04:36 +00:00
c8211fc474 fix code snippets 2023-08-23 09:04:08 +00:00
7732bf6238 correct 2023-08-23 08:54:48 +00:00
7c0ca80d3a move installation to book 2023-08-23 08:52:53 +00:00
b558d08b85 improve 2023-08-23 08:42:47 +00:00
34cb9f924f improve 2023-08-23 08:40:23 +00:00
d4968295a0 improve 2023-08-23 08:37:08 +00:00
65e146c72d Add installation section 2023-08-23 08:32:59 +00:00
3743bed2d7 Fix the `?` operator cannot be applied to type `Device` error in the example (#560)
According to the API:

```rust
inp = inp.to_device(&Device::Cuda(0)?)?;
```

cannot work, as `Cuda(...)` expects a `Device`, not an integer.

I'd recommend using `new_cuda(...)` instead.
2023-08-23 09:29:50 +01:00
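For reference, a corrected version of that snippet as a sketch; `Device::new_cuda` takes the GPU ordinal and returns a `Device`, which is what `to_device` expects. The input tensor here is a placeholder:

```rust
use candle_core::{DType, Device, Tensor};

fn main() -> candle_core::Result<()> {
    let device = Device::new_cuda(0)?;
    let inp = Tensor::zeros((2, 3), DType::F32, &Device::Cpu)?;
    // Move the tensor onto the GPU device built above.
    let inp = inp.to_device(&device)?;
    println!("{:?}", inp.device());
    Ok(())
}
```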
508d34daf2 GGUF support in the quantized model. (#559)
* GGUF support in the quantized model.

* Get the GGUF support to work on llama.
2023-08-23 09:20:57 +01:00
0764741cc4 Handle GGUF files in tensor-tools. (#558) 2023-08-23 06:32:07 +01:00
6a30ecefad Preliminary GGUF support. (#557)
* Preliminary GGUF support.

* Tensor reading.
2023-08-23 00:14:10 +01:00
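Putting the GGUF pieces together, reading a file back might look like this sketch. Assumptions: the `gguf_file::Content` reader in candle-core, its `tensor_infos` field, and an existing `model.gguf` path:

```rust
use candle_core::quantized::gguf_file;

fn main() -> candle_core::Result<()> {
    let mut file = std::fs::File::open("model.gguf")?;
    let content = gguf_file::Content::read(&mut file)?;
    // List the tensors recorded in the file's tensor-info table.
    for (name, info) in content.tensor_infos.iter() {
        println!("{name}: {:?}", info.shape);
    }
    Ok(())
}
```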