Commit Graph

308 Commits

SHA1 Message Date
79c27fc489 Segment-anything fixes: avoid normalizing twice. (#767)
* Segment-anything fixes: avoid normalizing twice.

* More fixes for the image aspect ratio.
2023-09-07 21:45:16 +01:00
7396b8ed1a Segment Anything - process images (#766)
* Start processing images.

* Add LayerNorm2d.

* Properly use LayerNorm2d.

* Tweak eps.

* Use LayerNorm on inputs with a rank different from 3.

* Window partitioning.

* Fix a couple todos.

* More todos.

* Hard-code the einsums.

* More padding support.

* Some size tweaks.

* Use the hub to get the weights.

* Use a batch matmul.

* Tweaks.

* More fixes.

* Get some predictions to be generated.
2023-09-07 19:22:45 +01:00
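
The LayerNorm2d used in the image-processing commit above normalizes each spatial position of an NCHW activation across its channel dimension, then applies a per-channel scale and bias. A minimal, dependency-free Rust sketch of that computation (the function name, shapes, and eps value are illustrative, not the crate's actual code):

```rust
// Illustrative sketch only: LayerNorm2d normalizes over the channel dimension
// for every spatial position of a single NCHW feature map (batch omitted).
fn layer_norm_2d(x: &[f32], c: usize, h: usize, w: usize, weight: &[f32], bias: &[f32], eps: f32) -> Vec<f32> {
    let hw = h * w;
    let mut out = vec![0f32; x.len()];
    for p in 0..hw {
        // Mean and variance over the C values that share this spatial position.
        let mean = (0..c).map(|ch| x[ch * hw + p]).sum::<f32>() / c as f32;
        let var = (0..c).map(|ch| (x[ch * hw + p] - mean).powi(2)).sum::<f32>() / c as f32;
        let inv_std = 1.0 / (var + eps).sqrt();
        for ch in 0..c {
            out[ch * hw + p] = (x[ch * hw + p] - mean) * inv_std * weight[ch] + bias[ch];
        }
    }
    out
}

fn main() {
    // One 2-channel 2x2 feature map.
    let x: Vec<f32> = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    println!("{:?}", layer_norm_2d(&x, 2, 2, 2, &[1.0, 1.0], &[0.0, 0.0], 1e-6));
}
```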
7b50f3e106 More segment-anything again. (#764)
* More segment-anything again.

* Transformer block forward.

* Two-way transformer.

* Position embeddings.

* Sketch the prompt encoder.

* More prompt-encoder.

* More prompt-encoder.

* Add the main sam module.

* Embed the transformer.

* And hook the transformer forward step.

* Build the model.

* Handle the global attn indexes.

* Get the model to load.
2023-09-07 12:06:55 +01:00
8c991df394 More segment-anything. (#763)
* More segment-anything.

* Split the model in multiple files.

* Start adding the transformer.

* Add the attention block.

* Move the MLP Block.
2023-09-07 07:28:30 +01:00
6527ab81a3 Sketch the segment anything model. (#759)
* Sketch the segment anything model.

* Fix some clippy lint.

* Add the mask decoder.
2023-09-07 05:34:05 +01:00
dcf708559d Fix for cudnn to work with img2img. (#753) 2023-09-06 07:49:28 +01:00
7299a68353 img2img pipeline for stable diffusion. (#752)
* img2img pipeline for stable diffusion.

* Rename the arguments + fix.

* Fix for zero strength.

* Another fix.

* Another fix.

* Revert.

* Include the backtrace.

* Noise scaling.

* Fix the height/width.
2023-09-06 07:06:49 +01:00
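
As a rough illustration of what the img2img pipeline and the noise-scaling fix above are about: the init image is encoded to latents, and denoising starts part-way through the schedule from a noised copy of those latents, with the amount of noise controlled by the strength argument. The mixing step typically looks like the sketch below (names and the alpha-bar value are assumptions, not the example's actual code):

```rust
// Hypothetical sketch of starting img2img from a noised init latent.
// `alpha_bar_t` is the scheduler's cumulative alpha product at the start step,
// which is picked from the strength parameter (higher strength = more noise).
fn add_noise(init_latent: &[f32], noise: &[f32], alpha_bar_t: f32) -> Vec<f32> {
    let signal = alpha_bar_t.sqrt();
    let sigma = (1.0 - alpha_bar_t).sqrt();
    init_latent.iter().zip(noise).map(|(x, n)| signal * x + sigma * n).collect()
}

fn main() {
    let init: Vec<f32> = vec![0.5, -0.2, 0.1, 0.8];
    let noise: Vec<f32> = vec![1.0, -1.0, 0.3, 0.2];
    println!("{:?}", add_noise(&init, &noise, 0.3));
}
```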
1c9e5394a5 Add a custom softmax implementation. (#744)
* Add a custom softmax implementation.

* Add softmaxlastdim to the benchmarks.

* And add a test.

* Support more dtypes.

* Polish the code.

* Use the slow implementation on cuda.

* Add a todo for the cuda kernel.
2023-09-05 14:20:23 +01:00
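
The custom op above fuses the softmax over the last dimension; the math it computes is the usual numerically stable softmax, sketched below in plain Rust (this loop only illustrates the computation, not the optimized kernel):

```rust
// Numerically stable softmax over the last dimension of a contiguous buffer
// of shape (.., dim): subtract the row max, exponentiate, normalize.
fn softmax_last_dim(xs: &mut [f32], dim: usize) {
    for row in xs.chunks_mut(dim) {
        let max = row.iter().copied().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0f32;
        for v in row.iter_mut() {
            *v = (*v - max).exp();
            sum += *v;
        }
        for v in row.iter_mut() {
            *v /= sum;
        }
    }
}

fn main() {
    let mut xs = vec![1.0f32, 2.0, 3.0, 1.0, 2.0, 3.0];
    softmax_last_dim(&mut xs, 3);
    println!("{xs:?}");
}
```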
9c61b0fc9b Proper log buckets for t5. (#727)
* Proper log buckets for t5.

* Properly pass the position bias.
2023-09-03 20:33:50 +01:00
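
The T5 relative-attention buckets mentioned above map each query/key offset to a bucket index: small offsets get their own exact bucket, while larger offsets are binned logarithmically up to a maximum distance. A sketch of the bidirectional case (constants and names are illustrative):

```rust
// Sketch of T5-style relative position bucketing (bidirectional case).
// Small |offset| values get their own bucket, larger ones share log-spaced buckets.
fn relative_position_bucket(relative_position: i64, num_buckets: usize, max_distance: usize) -> usize {
    let num_buckets = num_buckets / 2; // half for positive offsets, half for negative
    let offset_bucket = if relative_position > 0 { num_buckets } else { 0 };
    let n = relative_position.unsigned_abs() as usize;
    let max_exact = num_buckets / 2;
    if n < max_exact {
        offset_bucket + n
    } else {
        // Log-spaced buckets for offsets in [max_exact, max_distance).
        let log_ratio =
            (n as f64 / max_exact as f64).ln() / (max_distance as f64 / max_exact as f64).ln();
        let large = max_exact + (log_ratio * (num_buckets - max_exact) as f64) as usize;
        offset_bucket + large.min(num_buckets - 1)
    }
}

fn main() {
    for pos in [-200i64, -20, -3, 0, 3, 20, 200] {
        println!("offset {pos} -> bucket {}", relative_position_bucket(pos, 32, 128));
    }
}
```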
26cd266e65 Musicgen text embeddings. (#726)
* Musicgen text embeddings.

* Bugfix for layer norm.

* Proper position bias.

* Expose the weights.
2023-09-03 18:27:48 +01:00
bbec527bb9 Fix the musicgen example. (#724)
* Fix the musicgen example.

* Retrieve the weights from the hub.
2023-09-03 14:50:39 +01:00
2c1df6bba1 Add a repeat penalty to the llama2-c command line example. (#713)
* Add a repeat penalty to the llama2-c command line example.

* Another fix attempt.
2023-09-01 20:38:58 +01:00
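
The repeat penalty added above follows the usual scheme: logits of tokens already present in the recent context are damped by a factor greater than one before sampling. A minimal sketch (the window, factor, and function name are illustrative assumptions, not the example's exact code):

```rust
// Illustrative repeat penalty: push the logits of recently generated tokens
// towards zero so that sampling is less likely to pick them again.
fn apply_repeat_penalty(logits: &mut [f32], penalty: f32, recent_tokens: &[u32]) {
    for &token in recent_tokens {
        if let Some(logit) = logits.get_mut(token as usize) {
            *logit = if *logit >= 0.0 { *logit / penalty } else { *logit * penalty };
        }
    }
}

fn main() {
    let mut logits = vec![2.0f32, -1.0, 0.5, 3.0];
    // Tokens 0 and 3 were generated recently, so their logits are penalized.
    apply_repeat_penalty(&mut logits, 1.3, &[0, 3]);
    println!("{logits:?}");
}
```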
19042962d5 Whisper fix (#711)
* Remove unnecessary file.

* Whisper fix.
2023-09-01 20:04:07 +01:00
7529531056 Add the optimizer trait. (#702) 2023-09-01 12:55:39 +01:00
7cef35c84d Tweak some quantized args (#692)
* Print the args + change the default temp/repeat penalty.

* Minor formatting tweak.
2023-08-31 17:25:21 +01:00
7509c98970 Interactive mode for the quantized model. (#690) 2023-08-31 10:52:42 +01:00
9874d843f1 Fix the accelerate build (#678)
* Cosmetic changes.

* Fix the accelerate build for tanh.
2023-08-30 18:31:14 +02:00
7d753d3acd Mnist training dropout (#677)
* Use dropout in the mnist training.

* Fix.
2023-08-30 16:41:01 +01:00
618f4e4c78 Add some documentation. (#673)
* Add some documentation.

* Bump the crate version.
2023-08-30 11:54:00 +01:00
a1a5ab8b0a Neon optimized vecdot (#666)
* Q5k vecdot.

* Add the q3k vecdot.

* Q2k vecdot.

* Move the quantized model to its own file.
2023-08-29 22:28:46 +01:00
2d3fcad267 Simplify usage of the pool functions. (#662)
* Simplify usage of the pool functions.

* Small tweak.

* Attempt at using apply to simplify the convnet definition.
2023-08-29 19:12:16 +01:00
b31d41e26a Add a convnet training example. (#661)
* Add a convnet example.

* Dataset fix.

* Randomize batches.
2023-08-29 18:23:01 +01:00
a044907ffc Dilated convolutions (#657)
* Add the dilation parameter.

* Restore the basic optimizer example.

* Dilation support in cudnn.

* Use the dilation parameter in the cpu backend.

* More dilation support.

* No support for dilation in transposed convolutions.

* Add dilation to a test.

* Remove a print.

* Helper function.
2023-08-29 16:12:11 +01:00
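
The dilation parameter added above spaces the kernel taps apart, enlarging the receptive field without adding weights. A toy 1-D sketch of the indexing (stride and padding left out; names are illustrative):

```rust
// Toy 1-D convolution showing the effect of dilation: kernel taps are spaced
// `dilation` elements apart in the input (stride = 1, no padding).
fn conv1d_dilated(input: &[f32], kernel: &[f32], dilation: usize) -> Vec<f32> {
    let span = (kernel.len() - 1) * dilation + 1; // effective kernel width
    if input.len() < span {
        return Vec::new();
    }
    (0..=input.len() - span)
        .map(|i| {
            kernel
                .iter()
                .enumerate()
                .map(|(k, w)| w * input[i + k * dilation])
                .sum::<f32>()
        })
        .collect()
}

fn main() {
    let input: Vec<f32> = (0..10).map(|v| v as f32).collect();
    let kernel = [1.0f32, 0.0, -1.0];
    // dilation = 1 is a standard convolution; dilation = 2 skips every other input element.
    println!("{:?}", conv1d_dilated(&input, &kernel, 1));
    println!("{:?}", conv1d_dilated(&input, &kernel, 2));
}
```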
1aca6fa291 Upgrading hf-hub. 2023-08-29 14:18:54 +02:00
14b4d456e8 Merge pull request #439 from huggingface/training_hub_dataset
[Book] Add small error management + start training (with generic dataset inclusion).
2023-08-29 13:10:05 +02:00
62ef494dc1 Use multiple transformer layers in the same cross-attn blocks. (#653)
* Use multiple transformer layers in the same cross-attn blocks.

* Make the context contiguous if required.
2023-08-29 11:13:43 +01:00
33c23c19b6 Preliminary support for SDXL. (#647)
* Preliminary support for SDXL.

* More SDXL support.

* More SDXL.

* Use the proper clip config.

* Querying for existing tensors.

* More robust test.
2023-08-29 09:00:04 +01:00
d726484a6d Re-enable local dir for mnist. 2023-08-28 15:15:27 +02:00
d7a273be51 Training:
- Removed a lot of surface (SerializedFileReader ownership is really painful).
- Moved example + vision to hf.co version.
- Removed feature gate.
2023-08-28 15:15:01 +02:00
26e1b40992 Repeat-penalty in the falcon example. (#634) 2023-08-28 08:13:40 +01:00
72ebb12bca Remove some dead-code annotations. (#629)
* Remove some dead-code annotations.

* More dead code removal.

* One more.

* CI fix.
2023-08-27 18:52:33 +01:00
4c338b0cd9 VarBuilder cleanup (#627)
* VarBuilder cleanup.

* Implement the basic varbuilders.

* Add the sharded code.

* Proper support for tensor sharding.
2023-08-27 18:03:26 +01:00
6e485f2deb Add some optional repeat penalty. (#623)
* Add some optional repeat penalty.

* Add the missing files.
2023-08-27 10:48:45 +01:00
aa67e5107d Merge pull request #600 from huggingface/codellama_gpu_support
Adding support for codellama in examples.
2023-08-25 18:25:26 +02:00
c105550405 s/panic/bail/ 2023-08-25 18:05:07 +02:00
ca6c050b04 Cleanup the pose reporting code. (#605) 2023-08-25 16:49:21 +01:00
0afbc435df Add some configurable legend for yolo detection. (#603)
* Add some configurable legend for yolo detection.

* Clippyness.
2023-08-25 13:50:31 +01:00
97909e5068 Move the yolo model bits in a separate file. (#602)
* Move the yolo model bits in a separate file.

* Improve the drawing.

* Bugfix.
2023-08-25 12:47:55 +01:00
8bc5fffa45 More support for pose estimation in yolo-v8. (#599)
* More support for pose estimation in yolo-v8.

* Support both object detection and pose-estimation in the yolo-v8 example.
2023-08-25 11:21:11 +01:00
4826a4212e Adding support for codellama in examples.
Codellama requires bf16 for now (converting from bf16 to f16 errors out).
The multiprocess demo is not functional for it because flash-attn only supports f16 for now.
2023-08-25 09:56:11 +00:00
c093b03d51 Generic implementation of vecdot for q80. (#596)
* Generic implementation of vecdot for q80.

* Add support for code-llama 7b.

* Support more code-llama.
2023-08-25 09:04:05 +01:00
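
For context on the q8_0 vecdot above: in this quantization format each block stores 32 signed 8-bit values plus a single scale, and the dot product of two quantized vectors accumulates the integer dot product of each block pair scaled by both block scales. A simplified sketch (real blocks use f16 scales and SIMD; the names here are illustrative):

```rust
// Simplified q8_0-style block: 32 signed bytes plus one scale (real blocks use f16 scales).
struct BlockQ80 {
    scale: f32,
    qs: [i8; 32],
}

// Generic (non-SIMD) vecdot: per block pair, integer dot product times both scales.
fn vec_dot_q8_0(xs: &[BlockQ80], ys: &[BlockQ80]) -> f32 {
    xs.iter()
        .zip(ys)
        .map(|(x, y)| {
            let int_dot: i32 = x.qs.iter().zip(y.qs.iter()).map(|(&a, &b)| a as i32 * b as i32).sum();
            x.scale * y.scale * int_dot as f32
        })
        .sum()
}

fn main() {
    let block = |scale: f32, v: i8| BlockQ80 { scale, qs: [v; 32] };
    let xs = vec![block(0.1, 2), block(0.2, -1)];
    let ys = vec![block(0.05, 3), block(0.1, 4)];
    println!("{}", vec_dot_q8_0(&xs, &ys));
}
```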
189442a0fa Add the pose estimation head for yolo. (#589)
* Add the pose estimation head for yolo.

* Properly handle the added position dimensions.

* Integrate the pose estimation head in the forward pass.

* Renaming.

* Fix for pose estimation.
2023-08-24 22:12:34 +01:00
79916c2edb Use the hub weights for efficientnet. (#573) 2023-08-23 18:20:21 +01:00
431051cc32 Add Efficientnet (#572)
* EfficientNet.

* Complete the efficientnet implementation.

* Improve group handling.

* Get the efficientnet to work.
2023-08-23 18:02:58 +01:00
eedd85ffa7 Move the imagenet specific bits to a separate file. (#571) 2023-08-23 16:42:09 +01:00
329f661d9b Trace softmax (#568)
* Trace the softmax op.

* Inline the sum.

* Add min/max vec operations.
2023-08-23 15:25:50 +01:00
aba1e90797 Add some group parameter to convolutions. (#566)
* Add some group parameter to convolutions.

* Avoid some unnecessary groups checks.

* Move the tensor convolution bits.

* Proper handling of groups.

* Bump the crate version.

* And add a changelog.
2023-08-23 12:58:55 +01:00
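
The group parameter added above splits the input and output channels into independent groups, so each output channel only convolves over the input channels of its own group (groups = 1 is a regular convolution, groups = c_in is a depthwise one). A small sketch of the channel wiring (shapes are illustrative):

```rust
// Print which input channels each output channel of a grouped convolution reads.
fn group_of(channel: usize, channels: usize, groups: usize) -> usize {
    channel / (channels / groups)
}

fn main() {
    let (c_in, c_out, groups) = (8, 4, 2);
    for o in 0..c_out {
        let g = group_of(o, c_out, groups);
        let inputs: Vec<usize> = (0..c_in).filter(|&i| group_of(i, c_in, groups) == g).collect();
        println!("output channel {o} (group {g}) reads input channels {inputs:?}");
    }
}
```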
4ee1cf038a Get the rms epsilon from GGUF. (#565) 2023-08-23 11:40:20 +01:00
0f4ff8a739 Fix the quantized example. (#564) 2023-08-23 11:09:55 +01:00
89a00b56cc add chat models in quantized example (#551)
* add chat models in quantized example

* cargo fmt
2023-08-23 11:05:33 +01:00