06b37ea7ad
Avoid using tmp values. ( #609 )
2023-08-26 12:28:28 +01:00
c72eb3d75b
Add reference implementation for q4k and q5k ( #586 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
* `q4k` vec-dot
* `q5k` vec-dot
* Validate against GGML unit test results.
* Remove some more `transmutes`
2023-08-26 12:07:54 +01:00
864227edbf
[WIP] Improve Yolo WASM UI example ( #591 )
...
* return detections with classes names
* ignore .DS_Store
* example how to load wasm module
* add param to set model size
* add param for model size
* accept iou and confidence threshold on run
* conf and iou thresholds
* clamp only
* remove images from branch
* a couple of renamings, add readme with instructions
* final design
* minor font + border update
2023-08-26 11:40:41 +01:00
b23b347b35
Merge pull request #601 from huggingface/repair_bf16_f16_cast
...
Repairing cast bf16/f16
2023-08-26 12:34:41 +02:00
71518caeee
Align tensor device print more with PyTorch ( #590 )
...
* Improve tensor print
* Use CudaDevice only if enabled with cuda feature
* run rust fmt
* up
* improve
* rustfmt
2023-08-26 11:20:22 +01:00
6559eae72c
Avoid some transmutes. ( #607 )
2023-08-25 18:21:37 +01:00
46eb225ba5
Add some missing entries to the changelog. ( #606 )
2023-08-25 18:01:38 +01:00
aa67e5107d
Merge pull request #600 from huggingface/codellama_gpu_support
...
Adding support for codellama in examples.
2023-08-25 18:25:26 +02:00
c105550405
s/panic/bail/
2023-08-25 18:05:07 +02:00
ca6c050b04
Cleanup the pose reporting code. ( #605 )
2023-08-25 16:49:21 +01:00
9c8d6dbc2a
Neon intrinsics for the q8_0 vecdot. ( #604 )
...
* Neon intrinsics for the q8_0 vecdot.
* Get the tests to run with accelerate (with some numerical error failures).
2023-08-25 14:42:18 +01:00
0afbc435df
Add some configurable legend for yolo detection. ( #603 )
...
* Add some configurable legend for yolo detection.
* Clippyness.
2023-08-25 13:50:31 +01:00
d4e75d5825
Let's keep the dirty code on its own.
2023-08-25 12:01:58 +00:00
be371e827c
An intermediary float cast is necessary for CUDA 11.8
2023-08-25 11:54:30 +00:00
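The cast repairs above route bf16 through an intermediary f32 before reaching f16. A minimal CPU-side sketch of the bf16/f32 half of that path, purely for illustration (the actual code uses the `half` crate and CUDA kernels): bf16 is simply the top 16 bits of an IEEE-754 f32, so widening is a 16-bit left shift.

```rust
/// Reinterpret a bf16 bit pattern as f32: bf16 is the top 16 bits
/// of an IEEE-754 f32, so widening is a plain 16-bit left shift.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

/// Truncate an f32 back down to bf16 bits (round-toward-zero for
/// simplicity; real implementations round to nearest even).
fn f32_to_bf16_bits(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

fn main() {
    // 0x3F80 is bf16 for 1.0, 0xC000 is bf16 for -2.0.
    assert_eq!(bf16_bits_to_f32(0x3F80), 1.0);
    assert_eq!(bf16_bits_to_f32(0xC000), -2.0);
    assert_eq!(f32_to_bf16_bits(1.0), 0x3F80);
    println!("bf16 round-trip ok");
}
```

Note the narrowing direction here truncates mantissa bits; production conversions round to nearest even.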
97909e5068
Move the yolo model bits in a separate file. ( #602 )
...
* Move the yolo model bits in a separate file.
* Improve the drawing.
* Bugfix.
2023-08-25 12:47:55 +01:00
1c1e34735e
static_cast?
2023-08-25 11:40:36 +00:00
db8bab8b7a
Different casting?
2023-08-25 10:49:22 +00:00
bc131b402b
Repairing cast bf16/f16
2023-08-25 10:38:19 +00:00
8bc5fffa45
More support for pose estimation in yolo-v8. ( #599 )
...
* More support for pose estimation in yolo-v8.
* Support both object detection and pose-estimation in the yolo-v8 example.
2023-08-25 11:21:11 +01:00
4826a4212e
Adding support for codellama in examples.
...
Codellama requires bf16 for now (converting from bf16 to f16 raises an error).
The multiprocess demo is not functional for it because flash-attn only
supports f16 for now.
2023-08-25 09:56:11 +00:00
afc10a3232
AVX version for the q8-0 multiplications. ( #598 )
2023-08-25 10:14:49 +01:00
d728e646c2
Use resolver 2 explicitly. ( #597 )
2023-08-25 09:35:40 +01:00
c093b03d51
Generic implementation of vecdot for q80. ( #596 )
...
* Generic implementation of vecdot for q80.
* Add support for code-llama 7b.
* Support more code-llama.
2023-08-25 09:04:05 +01:00
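For context on the q80 vec-dot entries above (generic here, Neon and AVX in later commits): in GGML's q8_0 format a row is stored as blocks of 32 signed 8-bit quants plus one per-block scale, and the dot product of two rows is the sum over blocks of d_a * d_b * <qa, qb>. A generic (scalar) sketch, using f32 scales instead of the on-disk f16 to stay dependency-free; names are illustrative, not candle's API:

```rust
const QK8_0: usize = 32;

/// One q8_0 block: a scale and 32 signed 8-bit quants.
/// (The on-disk GGML format stores the scale as f16; f32 is used
/// here to keep the sketch dependency-free.)
struct BlockQ8_0 {
    d: f32,
    qs: [i8; QK8_0],
}

/// Generic (scalar) vec-dot over two rows of q8_0 blocks:
/// sum over blocks of d_a * d_b * <qa, qb>, with the inner
/// products accumulated in i32 before scaling.
fn vec_dot_q8_0(a: &[BlockQ8_0], b: &[BlockQ8_0]) -> f32 {
    a.iter()
        .zip(b.iter())
        .map(|(xa, xb)| {
            let isum: i32 = xa
                .qs
                .iter()
                .zip(xb.qs.iter())
                .map(|(&qa, &qb)| qa as i32 * qb as i32)
                .sum();
            xa.d * xb.d * isum as f32
        })
        .sum()
}

fn main() {
    let a = BlockQ8_0 { d: 0.5, qs: [2; QK8_0] };
    let b = BlockQ8_0 { d: 0.25, qs: [3; QK8_0] };
    // 0.5 * 0.25 * (32 * 2 * 3) = 24.0
    assert_eq!(vec_dot_q8_0(&[a], &[b]), 24.0);
    println!("q8_0 vec-dot ok");
}
```

The SIMD versions (Neon, AVX) vectorize the inner i8 multiply-accumulate; the structure of the outer per-block loop is the same.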
d8ba0452dc
Fail on bf16. ( #594 )
2023-08-25 06:10:38 +01:00
189442a0fa
Add the pose estimation head for yolo. ( #589 )
...
* Add the pose estimation head for yolo.
* Properly handle the added position dimensions.
* Integrate the pose estimation head in the forward pass.
* Renaming.
* Fix for pose estimation.
2023-08-24 22:12:34 +01:00
2cde0cb74b
More pickle support. ( #588 )
...
* More pickle support.
* Be more verbose.
2023-08-24 18:45:10 +01:00
e21c686cdc
Fixes for clippy 1.72. ( #587 )
2023-08-24 17:46:17 +01:00
c265ac50fa
Add a function to write gguf files. ( #585 )
...
* Add a function to write gguf files.
* More GGUF file writing.
* Write the tensor data in GGUF files.
2023-08-24 17:03:06 +01:00
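As a companion to the GGUF-writing entry above, a sketch of just the fixed-size file header, assuming the GGUF v1 layout (magic bytes "GGUF", u32 version, u64 tensor count, u64 metadata-kv count, all little-endian). The real writer also serializes the metadata key-values and tensor descriptors before the tensor data:

```rust
/// Minimal sketch of a GGUF header writer: magic, version,
/// tensor count, and metadata-kv count, all little-endian.
fn write_gguf_header(tensor_count: u64, metadata_kv_count: u64, version: u32) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(b"GGUF"); // magic
    out.extend_from_slice(&version.to_le_bytes());
    out.extend_from_slice(&tensor_count.to_le_bytes());
    out.extend_from_slice(&metadata_kv_count.to_le_bytes());
    out
}

fn main() {
    let header = write_gguf_header(2, 5, 1);
    assert_eq!(header[..4], *b"GGUF");
    assert_eq!(header.len(), 4 + 4 + 8 + 8);
    println!("gguf header ok");
}
```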
a87c6f7652
Merge pull request #561 from patrickvonplaten/add_installation
...
Improve installation section and "get started"
2023-08-24 16:25:52 +02:00
afd965f77c
More non square testing ( #582 )
...
* Add more non square testing.
* More testing.
2023-08-24 13:01:04 +01:00
d2f42ab086
Reference implementations of q2k and q3k vec-dot functions ( #580 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
2023-08-24 12:35:54 +01:00
ca318a6ec7
Add to the cuda example a reproduction of the issue. ( #579 )
...
* Add to the cuda example a reproduction of the issue.
* Tweak.
* Add a test using non-square matrices.
* Fix the conv2d kernel.
* Display the error.
* And tweak the comment.
2023-08-24 12:07:31 +01:00
dd64465899
Add a test for conv2d with padding + bugfix the random number generation on cuda. ( #578 )
...
* Add a test for conv2d with padding.
* Cosmetic changes.
* Bugfix the rand function on the cuda backend.
2023-08-24 10:16:37 +01:00
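The conv2d padding tests above exercise the standard output-size rule: along one spatial axis, with padding p, stride s and dilation d, the output size is floor((n + 2*p - d*(k - 1) - 1) / s) + 1. A small helper to make that concrete (a sketch, not candle's shape code):

```rust
/// Output spatial size of a 2-D convolution along one axis:
/// floor((n + 2*pad - dilation*(k - 1) - 1) / stride) + 1.
fn conv2d_out_dim(input: usize, kernel: usize, pad: usize, stride: usize, dilation: usize) -> usize {
    (input + 2 * pad - dilation * (kernel - 1) - 1) / stride + 1
}

fn main() {
    // 3x3 kernel, pad 1, stride 1 keeps the size ("same" padding).
    assert_eq!(conv2d_out_dim(8, 3, 1, 1, 1), 8);
    // Stride 2 halves it.
    assert_eq!(conv2d_out_dim(8, 3, 1, 2, 1), 4);
    println!("conv2d dims ok");
}
```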
79916c2edb
Use the hub weights for efficientnet. ( #573 )
2023-08-23 18:20:21 +01:00
431051cc32
Add Efficientnet ( #572 )
...
* EfficientNet.
* Complete the efficientnet implementation.
* Improve group handling.
* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
eedd85ffa7
Move the imagenet specific bits to a separate file. ( #571 )
2023-08-23 16:42:09 +01:00
7478dda255
Cosmetic tweaks. ( #570 )
2023-08-23 15:45:40 +01:00
329f661d9b
Trace softmax ( #568 )
...
* Trace the softmax op.
* Inline the sum.
* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
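The traced softmax above relies on the usual max-shifted formulation, which is why max/min vec operations were needed. As a standalone sketch of that formulation (not candle's actual op):

```rust
/// Numerically stable softmax over a slice: subtract the max
/// before exponentiating so large logits do not overflow.
fn softmax(xs: &[f32]) -> Vec<f32> {
    let max = xs.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = xs.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    let p = softmax(&[1.0, 2.0, 3.0]);
    // Probabilities sum to 1 and preserve the input ordering.
    assert!((p.iter().sum::<f32>() - 1.0).abs() < 1e-6);
    assert!(p[2] > p[1] && p[1] > p[0]);
    println!("softmax ok");
}
```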
075b505480
Mirror GGML's unit tests ( #569 )
...
* Add ggml unit tests
* simplify random matmul test for other test cases
2023-08-23 15:25:17 +01:00
aba1e90797
Add some group parameter to convolutions. ( #566 )
...
* Add some group parameter to convolutions.
* Avoid some unnecessary groups checks.
* Move the tensor convolution bits.
* Proper handling of groups.
* Bump the crate version.
* And add a changelog.
2023-08-23 12:58:55 +01:00
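Grouped convolutions, added above, split the channels into `groups` independent convolutions: both channel counts must divide evenly by the group count, and each filter then only sees c_in / groups input channels. A sketch of that consistency check (the names here are illustrative, not candle's API):

```rust
/// Shape rules for grouped convolution: both channel counts must be
/// divisible by `groups`, and each group sees c_in / groups input
/// channels. Returns the weight shape [c_out, c_in / groups, kh, kw].
fn grouped_conv_weight_shape(
    c_in: usize,
    c_out: usize,
    groups: usize,
    kh: usize,
    kw: usize,
) -> Result<[usize; 4], String> {
    if groups == 0 || c_in % groups != 0 || c_out % groups != 0 {
        return Err(format!(
            "in/out channels ({c_in}, {c_out}) must be divisible by groups ({groups})"
        ));
    }
    Ok([c_out, c_in / groups, kh, kw])
}

fn main() {
    // Depthwise case (used by EfficientNet): groups == c_in,
    // so each filter slice covers a single input channel.
    assert_eq!(grouped_conv_weight_shape(32, 32, 32, 3, 3), Ok([32, 1, 3, 3]));
    assert!(grouped_conv_weight_shape(32, 48, 5, 3, 3).is_err());
    println!("group checks ok");
}
```

This is the check behind the "avoid some unnecessary groups checks" and "proper handling of groups" bullets; the EfficientNet port later in the log depends on the depthwise case.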
1f58bdbb1d
Apply suggestions from code review
2023-08-23 13:33:45 +02:00
c98d3cfd8b
Update candle-book/src/guide/installation.md
2023-08-23 13:31:54 +02:00
c5e43ad0ab
Apply suggestions from code review
2023-08-23 13:27:29 +02:00
2c280007e8
Apply suggestions from code review
2023-08-23 13:26:21 +02:00
4ee1cf038a
Get the rms epsilon from GGUF. ( #565 )
2023-08-23 11:40:20 +01:00
0f4ff8a739
Fix the quantized example. ( #564 )
2023-08-23 11:09:55 +01:00
89a00b56cc
add chat models in quantized example ( #551 )
...
* add chat models in quantized example
* cargo fmt
2023-08-23 11:05:33 +01:00
9a5c7db91a
Add support for i64 ( #563 )
...
* Add the i64 dtype.
* Adapt the cuda kernels.
2023-08-23 10:42:19 +01:00
649202024c
fix code snippets
2023-08-23 09:05:07 +00:00
283f6c048d
fix code snippets
2023-08-23 09:04:36 +00:00