baca3cf69d
Fix deps.
2023-08-28 15:15:27 +02:00
d726484a6d
Re-enable local dir for mnist.
2023-08-28 15:15:27 +02:00
dd06d93d0b
Cleanup:
...
- Moved the book from `examples` to `candle-book` proper (overlapping
the book and the lib structures)
2023-08-28 15:15:26 +02:00
c109c93db7
Update candle-book/src/SUMMARY.md
2023-08-28 15:15:02 +02:00
d7a273be51
Training:
...
- Removed a lot of surface (SerializedFileReader ownership is really
painful).
- Moved example + vision to hf.co version.
- Removed feature gate.
2023-08-28 15:15:01 +02:00
dd02f589c0
Better training+hub
2023-08-28 15:14:43 +02:00
7602323667
[Book] Add small error management + start training (with generic dataset inclusion).
2023-08-28 15:14:17 +02:00
9137c63175
Update README.md ( #640 )
2023-08-28 11:34:54 +01:00
3cca89cc70
Add conv-transpose. ( #635 )
...
* Add conv-transpose.
* Return zeros for now.
* Naive CPU implementation.
* Add a conv-transpose test + fix the cpu implementation.
* Add a second test.
2023-08-28 10:10:12 +01:00
26e1b40992
Repeat-penalty in the falcon example. ( #634 )
2023-08-28 08:13:40 +01:00
1da71a5da1
Neon optimized version of the q4k vecdot product. ( #632 )
2023-08-27 21:30:47 +01:00
24dda44c27
Add wasm support for yolo-v8 pose detection. ( #630 )
...
* Add wasm support for yolo-v8 pose detection.
* Better bbox handling.
* Add the pose model in the wasm example lib.
2023-08-27 19:49:24 +01:00
72ebb12bca
Remove some dead-code annotations. ( #629 )
...
* Remove some dead-code annotations.
* More dead code removal.
* One more.
* CI fix.
2023-08-27 18:52:33 +01:00
a3f97c143d
Bump the crate version + update CHANGELOG. ( #628 )
2023-08-27 18:17:11 +01:00
4c338b0cd9
VarBuilder cleanup ( #627 )
...
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
be471d50ab
Llama quantization. ( #625 )
2023-08-27 14:08:15 +01:00
7151f2cf63
Add the quantize command. ( #624 )
...
* Add the quantize command.
* Bugfix for writing gguf files.
* And add a comment.
2023-08-27 11:35:19 +01:00
6e485f2deb
Add some optional repeat penalty. ( #623 )
...
* Add some optional repeat penalty.
* Add the missing files.
2023-08-27 10:48:45 +01:00
5320aa6b7d
Move the test-utils bits to a shared place. ( #619 )
2023-08-27 09:42:22 +01:00
a8b39dd7b7
Fix for q5_1 quantization. ( #617 )
...
* Fix for q5_1 quantization.
* Fix some typos.
2023-08-27 08:31:18 +01:00
fa0d75b18d
Quantization tests + fix some issues. ( #616 )
2023-08-27 08:17:38 +01:00
28658054ff
More missing quantized bits. ( #615 )
...
* Q4_1 support.
* Add Q5_1 quantization.
* Tweak.
2023-08-27 07:52:26 +01:00
ab36a7f3e3
Fix for when f16c is not available. ( #614 )
2023-08-27 07:19:52 +01:00
f704e39761
Missing quants ops ( #611 )
...
* Another transmute tweak.
* Changelog tweak.
* Add some missing quantized ops.
2023-08-26 20:09:04 +01:00
fdf15f0e05
Another transmute tweak. ( #610 )
...
* Another transmute tweak.
* Changelog tweak.
2023-08-26 13:00:24 +01:00
06b37ea7ad
Avoid using tmp values. ( #609 )
2023-08-26 12:28:28 +01:00
c72eb3d75b
Add reference implementation for q4k
and q5k
( #586 )
...
* add `q2k` vec-dot
* `q3k` vec-dot + quantization bugfix
* `q4k` vec-dot
* `q5k` vec-dot
* Validate against GGML unit test results.
* Remove some more `transmutes`
2023-08-26 12:07:54 +01:00
864227edbf
[WIP] Improve Yolo WASM UI example ( #591 )
...
* return detections with classes names
* ignore .DS_Store
* example how to load wasm module
* add param to set model size
* add param for model size
* accept iou and confidence threshold on run
* conf and iou thresholds
* clamp only
* remove images from branch
* a couple of renamings, add readme with instructions
* final design
* minor font + border update
2023-08-26 11:40:41 +01:00
b23b347b35
Merge pull request #601 from huggingface/repair_bf16_f16_cast
...
Repairing cast bf16/f16
2023-08-26 12:34:41 +02:00
71518caeee
Align tensor device print more with PyTorch ( #590 )
...
* Improve tensor print
* Use CudaDevice only if enabled with cuda feature
* run rust fmt
* up
* improve
* rustfmt
2023-08-26 11:20:22 +01:00
6559eae72c
Avoid some transmutes. ( #607 )
2023-08-25 18:21:37 +01:00
46eb225ba5
Add some missing entries to the changelog. ( #606 )
2023-08-25 18:01:38 +01:00
aa67e5107d
Merge pull request #600 from huggingface/codellama_gpu_support
...
Adding support for codellama in examples.
2023-08-25 18:25:26 +02:00
c105550405
s/panic/bail/
2023-08-25 18:05:07 +02:00
ca6c050b04
Cleanup the pose reporting code. ( #605 )
2023-08-25 16:49:21 +01:00
9c8d6dbc2a
Neon intrinsics for the q8_0 vecdot. ( #604 )
...
* Neon intrinsics for the q8_0 vecdot.
* Get the tests to run with accelerate (with some numerical error failures).
2023-08-25 14:42:18 +01:00
0afbc435df
Add some configurable legend for yolo detection. ( #603 )
...
* Add some configurable legend for yolo detection.
* Clippyness.
2023-08-25 13:50:31 +01:00
d4e75d5825
Let's keep the dirty code on its own.
2023-08-25 12:01:58 +00:00
be371e827c
Intermediary float cast is necessary for cuda 11.8
2023-08-25 11:54:30 +00:00
97909e5068
Move the yolo model bits in a separate file. ( #602 )
...
* Move the yolo model bits in a separate file.
* Improve the drawing.
* Bugfix.
2023-08-25 12:47:55 +01:00
1c1e34735e
static_cast ?
2023-08-25 11:40:36 +00:00
db8bab8b7a
Different casting ?
2023-08-25 10:49:22 +00:00
bc131b402b
Repairing cast bf16/f16
2023-08-25 10:38:19 +00:00
8bc5fffa45
More support for pose estimation in yolo-v8. ( #599 )
...
* More support for pose estimation in yolo-v8.
* Support both object detection and pose-estimation in the yolo-v8 example.
2023-08-25 11:21:11 +01:00
4826a4212e
Adding support for codellama in examples.
...
Codellama requires bf16 for now (converting from bf16 to f16 errors out).
The multiprocess demo does not work with it because flash-attn only supports
f16 for now.
2023-08-25 09:56:11 +00:00
afc10a3232
AVX version for the q8-0 multiplications. ( #598 )
2023-08-25 10:14:49 +01:00
d728e646c2
Use resolver 2 explicitly. ( #597 )
2023-08-25 09:35:40 +01:00
c093b03d51
Generic implementation of vecdot for q80. ( #596 )
...
* Generic implementation of vecdot for q80.
* Add support for code-llama 7b.
* Support more code-llama.
2023-08-25 09:04:05 +01:00
d8ba0452dc
Fail on bf16. ( #594 )
2023-08-25 06:10:38 +01:00
189442a0fa
Add the pose estimation head for yolo. ( #589 )
...
* Add the pose estimation head for yolo.
* Properly handle the added position dimensions.
* Integrate the pose estimation head in the forward pass.
* Renaming.
* Fix for pose estimation.
2023-08-24 22:12:34 +01:00