3cd7e7b51d
Fuse the rel-pos additions via a custom-op. ( #786 )
* Fuse the rel-pos additions via a custom-op.
* Run with rayon.
* Add more tracing.
2023-09-09 10:46:09 +01:00
acf8f10ae1
Get the comparison operation to work on scalar values. ( #780 )
* Get the comparison operation to work on scalar values.
* Add some time measurement.
2023-09-08 20:13:29 +01:00
0906acab91
Automatic mask generation ( #779 )
* A few more contiguous fixes for cuda.
* Mask generation.
* Generic bbox.
* Generate all the masks.
2023-09-08 19:11:34 +01:00
158ff3c609
Add tracing to segment-anything ( #777 )
* Tracing support for segment-anything.
* More tracing.
* Handle the empty slice case.
2023-09-08 15:31:29 +01:00
e5703d2f56
Draw the mask on a merged image. ( #775 )
* Draw the mask on a merged image.
* Clippy fix.
* Enable the target point by default.
* Add to the readme.
2023-09-08 14:04:34 +01:00
28c87f6a34
Automatic mask generator + point base mask ( #773 )
* Add more to the automatic mask generator.
* Add the target point.
* Fix.
* Remove the allow-unused.
* Mask post-processing.
2023-09-08 12:26:56 +01:00
c1453f00b1
Improve the safetensor loading in the segment-anything example. ( #772 )
* Improve the safetensor loading in the segment-anything example.
* Properly handle the labels when embedding the point prompts.
2023-09-08 09:39:10 +01:00
989a4807b1
Use shape with holes. ( #771 )
2023-09-08 08:50:27 +01:00
3898e500de
Generate a mask image + the scaled input image. ( #769 )
* Also round-trip the original image.
* Make it possible to use a safetensors input.
2023-09-08 05:53:08 +01:00
79c27fc489
Segment-anything fixes: avoid normalizing twice. ( #767 )
* Segment-anything fixes: avoid normalizing twice.
* More fixes for the image aspect ratio.
2023-09-07 21:45:16 +01:00
7396b8ed1a
Segment Anything - process images ( #766 )
* Start processing images.
* Add LayerNorm2d.
* Properly use LayerNorm2d.
* Tweak eps.
* Use LayerNorm on inputs with a rank different from 3.
* Window partitioning.
* Fix a couple todos.
* More todos.
* Hard-code the einsums.
* More padding support.
* Some sizes tweaks.
* Use the hub to get the weights.
* Use a batch matmul.
* Tweaks.
* More fixes.
* Get some predictions to be generated.
2023-09-07 19:22:45 +01:00
7b50f3e106
More segment-anything again. ( #764 )
* More segment-anything again.
* Transformer block forward.
* Two-ways transformer.
* Position embeddings.
* Sketch the prompt encoder.
* More prompt-encoder.
* More prompt-encoder.
* Add the main sam module.
* Embed the transformer.
* And hook the transformer forward step.
* Build the model.
* Handle the global attn indexes.
* Get the model to load.
2023-09-07 12:06:55 +01:00
8c991df394
More segment-anything. ( #763 )
* More segment-anything.
* Split the model in multiple files.
* Start adding the transformer.
* Add the attention block.
* Move the MLP Block.
2023-09-07 07:28:30 +01:00
6527ab81a3
Sketch the segment anything model. ( #759 )
* Sketch the segment anything model.
* Fix some clippy lint.
* Add the mask decoder.
2023-09-07 05:34:05 +01:00
dcf708559d
Fix for cudnn to work with img2img. ( #753 )
2023-09-06 07:49:28 +01:00
7299a68353
img2img pipeline for stable diffusion. ( #752 )
* img2img pipeline for stable diffusion.
* Rename the arguments + fix.
* Fix for zero strength.
* Another fix.
* Another fix.
* Revert.
* Include the backtrace.
* Noise scaling.
* Fix the height/width.
2023-09-06 07:06:49 +01:00
1c9e5394a5
Add a custom softmax implementation. ( #744 )
* Add a custom softmax implementation.
* Add softmaxlastdim to the benchmarks.
* And add a test.
* Support more dtypes.
* Polish the code.
* Use the slow implementation on cuda.
* Add a todo for the cuda kernel.
2023-09-05 14:20:23 +01:00
9c61b0fc9b
Proper log buckets for t5. ( #727 )
* Proper log buckets for t5.
* Properly pass the position bias.
2023-09-03 20:33:50 +01:00
26cd266e65
Musicgen text embeddings. ( #726 )
* Musicgen text embeddings.
* Bugfix for layer norm.
* Proper position bias.
* Expose the weights.
2023-09-03 18:27:48 +01:00
bbec527bb9
Fix the musicgen example. ( #724 )
* Fix the musicgen example.
* Retrieve the weights from the hub.
2023-09-03 14:50:39 +01:00
2c1df6bba1
Add a repeat penality to the llama2-c command line example. ( #713 )
* Add a repeat penality to the llama2-c command line example.
* Another fix attempt.
2023-09-01 20:38:58 +01:00
19042962d5
Whisper fix ( #711 )
* Remove unnecessary file.
* Whisper fix.
2023-09-01 20:04:07 +01:00
7529531056
Add the optimizer trait. ( #702 )
2023-09-01 12:55:39 +01:00
7cef35c84d
Tweak some quantized args ( #692 )
* Print the args + change the default temp/repeat penalty.
* Minor formatting tweak.
2023-08-31 17:25:21 +01:00
7509c98970
Interactive mode for the quantized model. ( #690 )
2023-08-31 10:52:42 +01:00
9874d843f1
Fix the accelerate build ( #678 )
* Cosmetic changes.
* Fix the accelerate build for tanh.
2023-08-30 18:31:14 +02:00
7d753d3acd
Mnist training dropout ( #677 )
* Use dropout in the mnist training.
* Fix.
2023-08-30 16:41:01 +01:00
618f4e4c78
Add some documentation. ( #673 )
* Add some documentation.
* Bump the crate version.
2023-08-30 11:54:00 +01:00
a1a5ab8b0a
Neon optimized vecdot ( #666 )
* Q5k vecdot.
* Add the q3k vecdot.
* Q2k vecdot.
* Move the quantized model to its own file.
2023-08-29 22:28:46 +01:00
2d3fcad267
Simplify usage of the pool functions. ( #662 )
* Simplify usage of the pool functions.
* Small tweak.
* Attempt at using apply to simplify the convnet definition.
2023-08-29 19:12:16 +01:00
b31d41e26a
Add a convnet training example. ( #661 )
* Add a convnet example.
* Dataset fix.
* Randomize batches.
2023-08-29 18:23:01 +01:00
a044907ffc
Dilated convolutions ( #657 )
* Add the dilation parameter.
* Restore the basic optimizer example.
* Dilation support in cudnn.
* Use the dilation parameter in the cpu backend.
* More dilation support.
* No support for dilation in transposed convolutions.
* Add dilation to a test.
* Remove a print.
* Helper function.
2023-08-29 16:12:11 +01:00
1aca6fa291
Upgrading hf-hub.
2023-08-29 14:18:54 +02:00
14b4d456e8
Merge pull request #439 from huggingface/training_hub_dataset
[Book] Add small error management + start training (with generic dataset inclusion).
2023-08-29 13:10:05 +02:00
62ef494dc1
Use multiple transformer layer in the same cross-attn blocks. ( #653 )
* Use multiple transformer layer in the same cross-attn blocks.
* Make the context contiguous if required.
2023-08-29 11:13:43 +01:00
33c23c19b6
Preliminary support for SDXL. ( #647 )
* Preliminary support for SDXL.
* More SDXL support.
* More SDXL.
* Use the proper clip config.
* Querying for existing tensors.
* More robust test.
2023-08-29 09:00:04 +01:00
d726484a6d
Re-enable local dir for mnist.
2023-08-28 15:15:27 +02:00
d7a273be51
Training:
- Removed a lot of surface (SerializedFileReader ownership is really
painful).
- Moved example + vision to hf.co version.
- Removed feature gate.
2023-08-28 15:15:01 +02:00
26e1b40992
Repeat-penalty in the falcon example. ( #634 )
2023-08-28 08:13:40 +01:00
72ebb12bca
Remove some dead-code annotations. ( #629 )
* Remove some dead-code annotations.
* More dead code removal.
* One more.
* CI fix.
2023-08-27 18:52:33 +01:00
4c338b0cd9
VarBuilder cleanup ( #627 )
* VarBuilder cleanup.
* Implement the basic varbuilders.
* Add the sharded code.
* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
6e485f2deb
Add some optional repeat penalty. ( #623 )
* Add some optional repeat penalty.
* Add the missing files.
2023-08-27 10:48:45 +01:00
aa67e5107d
Merge pull request #600 from huggingface/codellama_gpu_support
Adding support for codellama in examples.
2023-08-25 18:25:26 +02:00
c105550405
s/panic/bail/
2023-08-25 18:05:07 +02:00
ca6c050b04
Cleanup the pose reporting code. ( #605 )
2023-08-25 16:49:21 +01:00
0afbc435df
Add some configurable legend for yolo detection. ( #603 )
* Add some configurable legend for yolo detection.
* Clippyness.
2023-08-25 13:50:31 +01:00
97909e5068
Move the yolo model bits in a separate file. ( #602 )
* Move the yolo model bits in a separate file.
* Improve the drawing.
* Bugfix.
2023-08-25 12:47:55 +01:00
8bc5fffa45
More support for pose estimation in yolo-v8. ( #599 )
* More support for pose estimation in yolo-v8.
* Support both object detection and pose-estimation in the yolo-v8 example.
2023-08-25 11:21:11 +01:00
4826a4212e
Adding support for codellama in examples.
Codellama requires bf16 for now (error to convert from bf16 to f16).
Multiprocess demo not functional for it because flash-attn only supports
f16 for now.
2023-08-25 09:56:11 +00:00
c093b03d51
Generic implementation of vecdot for q80. ( #596 )
* Generic implementation of vecdot for q80.
* Add support for code-llama 7b.
* Support more code-llama.
2023-08-25 09:04:05 +01:00