f3fe730a30
Npy tweaks & error with path (#384)
* Simplify the npy writing.
* Wrap the file path so as to provide better errors.
2023-08-10 06:21:58 +01:00
c7f92f985e
Further randn tweaks: use the appropriate rng rather than the f64 one, some cleanup. (#383)
2023-08-10 05:48:19 +01:00
3bbc08a8df
Fix randn cpu (#382)
* Change distributions: `Standard` generates in [0, 1), `Normal` is correct.
* Add a test (not sure if this is the best place to put it).
* Remove an unnecessary `use`.
2023-08-10 05:33:44 +01:00
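The distribution mix-up fixed above is easy to reproduce outside candle: rand's `Standard` distribution samples uniformly in [0, 1), while `randn` is meant to draw from a standard normal. A minimal Python sketch of the distinction (stdlib `random`, not candle's rng):

```python
import random

random.seed(0)

# Uniform sampling in [0, 1) -- what rand's `Standard` distribution produces.
uniform = [random.random() for _ in range(10_000)]
assert all(0.0 <= x < 1.0 for x in uniform)

# Gaussian sampling -- what randn is supposed to produce.
normal = [random.gauss(0.0, 1.0) for _ in range(10_000)]
# A standard normal routinely falls outside [0, 1), so the bug is
# easy to catch with a simple range check plus a mean check.
assert any(x < 0.0 for x in normal)
assert abs(sum(normal) / len(normal)) < 0.1
```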
25ec2d9f6b
fix: remove incorrect unwrap (#379)
2023-08-09 21:45:24 +01:00
fcfdcbd337
Add a conv1d benchmark based on the whisper sizes. (#377)
* Add a conv1d benchmark based on the whisper sizes.
* Enforce the batch-dim in conv1d.
2023-08-09 20:27:03 +01:00
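For context on what such a benchmark exercises, conv1d's output length follows the usual formula `l_out = (l_in + 2*padding - kernel) / stride + 1`. A naive single-channel Python sketch (the whisper-like sizes below, kernel 3 / stride 2 / padding 1 over 3000 frames, are an assumption for illustration):

```python
def conv1d_out_len(l_in, kernel, stride=1, padding=0):
    # Standard convolution output-length formula (dilation 1).
    return (l_in + 2 * padding - kernel) // stride + 1

def conv1d(x, w, stride=1, padding=0):
    """Naive single-channel, batchless 1d convolution (cross-correlation)."""
    x = [0] * padding + list(x) + [0] * padding
    k = len(w)
    out_len = (len(x) - k) // stride + 1
    return [sum(w[j] * x[i * stride + j] for j in range(k)) for i in range(out_len)]

# Whisper-like sizes (assumed here): 3000 mel frames with kernel 3,
# stride 2, padding 1 halve the sequence length.
assert conv1d_out_len(3000, kernel=3, stride=2, padding=1) == 1500
assert conv1d([1, 2, 3, 4], [1, 0, -1], padding=1) == [-2, -2, -2, 3]
```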
a5c5a893aa
Add max_pool2d (#371)
Co-authored-by: 赵理山 <ls@zhaolishandeMacBook-Air.local>
2023-08-09 18:05:26 +01:00
1892bd139c
Extract the strides in the conv ops. (#370)
2023-08-09 17:57:05 +01:00
cd225bd3b1
More testing for avg-pool2d. (#366)
* More testing for avg-pool2d.
* Another fix.
* Add a max-pool test with non-divisible kernel sizes.
2023-08-09 16:12:23 +01:00
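A non-divisible kernel size is exactly the edge case where pooling implementations tend to break: trailing rows and columns that do not fill a full window must be dropped. A toy Python sketch of both pools with stride equal to the kernel size (an assumption for simplicity; candle's ops are more general):

```python
def pool2d(x, k, op):
    """Naive 2d pooling with stride == kernel size; trailing rows/columns
    that don't fill a full window are dropped (floor behaviour)."""
    h, w = len(x), len(x[0])
    return [
        [op(x[i * k + di][j * k + dj] for di in range(k) for dj in range(k))
         for j in range(w // k)]
        for i in range(h // k)
    ]

def avg(vals):
    vals = list(vals)
    return sum(vals) / len(vals)

x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Kernel 2 does not divide 3: the last row and column are ignored.
assert pool2d(x, 2, max) == [[5]]
assert pool2d(x, 2, avg) == [[3.0]]
```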
dece0b8a76
Merge pull request #263 from huggingface/book_3
Book 3 (advanced loading + hub)
2023-08-09 16:50:11 +02:00
b80348d22f
Bugfix for avg-pool + add some tests. (#365)
2023-08-09 15:44:16 +01:00
dbc6f281c9
Conv1d test with padding. (#356)
2023-08-09 05:45:38 +01:00
cf965ecaa8
Simplify the conv1d and conv2d code. (#352)
2023-08-08 22:10:59 +01:00
b9864e1357
Fix size-in-bytes for u8. (#351)
2023-08-08 21:15:18 +01:00
608b2358c6
Add some conv1d tests + a bugfix when using padding. (#349)
2023-08-08 20:50:20 +01:00
1e6dbeac01
Add some conv2d tests. (#347)
* Add some conv2d tests.
* Add a simpler conv2d test.
* More conv2d testing + bugfix.
* Add a todo.
2023-08-08 19:02:42 +01:00
13ce68ff9b
Bugfix for conv2d. (#343)
2023-08-08 15:20:00 +01:00
ab35684326
Naive implementation for conv2d. (#341)
2023-08-08 06:34:36 +01:00
b5bb5e056d
Add more conv2d support. (#340)
* Add more conv2d support.
* Conv2d cpu work.
* Conv2d output shape.
2023-08-08 06:04:32 +01:00
d0d7010682
CPU implementation for upsample-nearest2d. (#339)
2023-08-07 20:07:10 +01:00
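Nearest-neighbour upsampling just maps each output pixel back to the input pixel its scaled coordinates land on. A minimal Python sketch of the idea (not candle's implementation):

```python
def upsample_nearest2d(x, out_h, out_w):
    """Nearest-neighbour upsampling: each output pixel copies the input
    pixel its scaled coordinates map back onto."""
    in_h, in_w = len(x), len(x[0])
    return [
        [x[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]

x = [[1, 2],
     [3, 4]]
assert upsample_nearest2d(x, 4, 4) == [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]
```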
fc265d9dcf
Some CLIP fixes for stable diffusion. (#338)
* Some CLIP fixes for stable diffusion.
* Add the avg-pool2d operation on cpu.
2023-08-07 18:31:45 +01:00
2345b8ce3f
Skeleton for the avg-pool2d and upsample-nearest2d ops. (#337)
* Skeleton for the avg-pool2d and upsample-nearest2d ops.
* Preliminary conv2d support.
2023-08-07 16:15:38 +01:00
f53a333ea9
Simple pad support. (#336)
* Simple pad support.
* Fix the tensor indexing when padding.
2023-08-07 15:24:56 +01:00
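Constant padding is conceptually simple (surround the data with a fill value), but the later conv commits rely on indexing into the padded region being correct. A toy 2d sketch in Python (zero fill assumed):

```python
def pad2d(x, pad, value=0):
    """Constant-pad a 2d grid by `pad` elements on every side."""
    w = len(x[0]) + 2 * pad
    top = [[value] * w for _ in range(pad)]
    bottom = [[value] * w for _ in range(pad)]
    mid = [[value] * pad + list(row) + [value] * pad for row in x]
    return top + mid + bottom

assert pad2d([[1, 2], [3, 4]], 1) == [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
```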
2c9f605976
Add rand-like/randn-like. (#333)
2023-08-06 21:51:08 +01:00
166bfd5847
Add the recip op + use it in stable-diffusion. (#331)
* Add the recip unary op.
* Fix the cuda kernel.
* Use the recip op in sigmoid.
2023-08-06 21:14:52 +01:00
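With a reciprocal op available, the sigmoid can be written without a dedicated division kernel: sigmoid(x) = recip(1 + exp(-x)). A scalar Python sketch of that identity:

```python
import math

def recip(x):
    # Elementwise reciprocal, shown here on a scalar.
    return 1.0 / x

def sigmoid(x):
    # Sigmoid expressed through recip: 1 / (1 + exp(-x)).
    return recip(1.0 + math.exp(-x))

assert sigmoid(0.0) == 0.5
assert abs(sigmoid(4.0) - 0.982014) < 1e-5
```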
d34039e352
Add a stable diffusion example (#328)
* Start adding a stable-diffusion example.
* Proper computation of the causal mask.
* Add the chunk operation.
* Work in progress: port the attention module.
* Add some dummy modules for conv2d and group-norm, get the attention module to compile.
* Re-enable the 2d convolution.
* Add the embeddings module.
* Add the resnet module.
* Add the unet blocks.
* Add the unet.
* And add the variational auto-encoder.
* Use the pad function from utils.
2023-08-06 17:49:43 +01:00
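Two of the ops listed above are easy to illustrate in isolation: a causal mask only lets position i attend to positions j <= i, and chunk splits a tensor into equal parts along a dimension. A toy list-based Python sketch (not candle's tensor API):

```python
def causal_mask(n):
    """Lower-triangular attention mask: position i may attend to j <= i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def chunk(xs, n):
    """Split a sequence into n equal parts (length assumed divisible by n)."""
    size = len(xs) // n
    return [xs[i * size:(i + 1) * size] for i in range(n)]

assert causal_mask(3) == [[1, 0, 0],
                          [1, 1, 0],
                          [1, 1, 1]]
assert chunk([1, 2, 3, 4, 5, 6], 3) == [[1, 2], [3, 4], [5, 6]]
```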
b278834267
Support the Accelerate BLAS on macOS. (#325)
* Add the accelerate feature.
* Ffi tweaks.
2023-08-05 17:25:24 +01:00
f7b2a0391d
Transpose the weight matrices for llama2.c. (#321)
2023-08-04 13:32:20 +01:00
8b6f5be1cc
Support q5k quantized data. (#320)
2023-08-04 09:51:30 +01:00
74845a4dcd
Use the assert! macro as it turns out to work in const contexts. (#316)
2023-08-03 10:03:43 +01:00
aa76b783eb
Q6K dequantization. (#315)
2023-08-03 09:31:20 +01:00
25564357f7
Support some ggml quantized types (#314)
* Add the quantized types for GGML loading.
* Support quantization for Q2K.
* More quantization support.
* Fix some clippy lints.
2023-08-03 09:16:26 +01:00
634700d84a
Use some consts for ggml values. (#312)
2023-08-02 22:03:05 +01:00
e635f18eda
Initial support for reading ggml files. (#311)
* Start adding support for reading ggml files.
* Compute the proper tensor size.
* Print the read tensors.
* Fix file reading.
2023-08-02 21:59:02 +01:00
a44471a305
Adding more details on how to load things.
- Loading with memmap
- Loading a sharded tensor
- Moved some snippets to `candle-examples/src/lib.rs` because managing book-specific dependencies is a pain: https://github.com/rust-lang/mdBook/issues/706
- This causes a non-aligned inclusion (https://github.com/rust-lang/mdBook/pull/1856), which we have to exempt from fmt.
mdbook might need some more love :)
2023-08-02 18:40:24 +02:00
0902846f25
Add the AdamW optimizer. (#307)
* Add the AdamW optimizer.
* Add some AdamW tests validated against PyTorch.
2023-08-02 14:03:49 +01:00
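AdamW differs from Adam by decoupling the weight decay from the gradient-based update: the decay is applied directly to the parameter. A scalar Python sketch of one step (default hyperparameters assumed; candle's optimizer operates on tensors):

```python
def adamw_step(p, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a scalar parameter: Adam moments plus
    weight decay applied to the parameter, not folded into the gradient."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias correction for the first moment
    v_hat = v / (1 - beta2 ** t)  # bias correction for the second moment
    p = p - lr * (m_hat / (v_hat ** 0.5 + eps) + weight_decay * p)
    return p, m, v

p, m, v = 1.0, 0.0, 0.0
p, m, v = adamw_step(p, grad=0.5, m=m, v=v, t=1)
# One step moves by roughly lr * (1 + weight_decay) here.
assert abs(p - 0.99899) < 1e-6
```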
4fe8a02f88
Update the repo location. (#305)
2023-08-02 11:12:18 +01:00
d38943aadc
Add version numbers for all the candle crates (#303)
* Switch to candle-gemm for the time being.
* Add the missing versions.
2023-08-02 10:52:13 +01:00
51e51da896
Rename the candle crate to candle-core (#301)
* Rename to candle-core.
* More candle-core renaming.
2023-08-02 08:20:22 +01:00
4b3bd79fbd
Remove the embedding ops in favor of index-select. (#299)
* Remove the embedding ops in favor of index-select.
* Also remove the cuda kernels.
2023-08-02 05:42:11 +01:00
cc76c63202
Use index-select for the embeddings as it supports backprop. (#298)
2023-08-01 20:44:43 +01:00
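An embedding lookup is just a row-wise index-select into the weight table, which is why backprop comes for free: gradients scatter-add back into the selected rows. A toy Python sketch of the forward lookup:

```python
def index_select(table, ids):
    """Embedding forward pass as a row-wise index-select into the weights."""
    return [table[i] for i in ids]

weights = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]  # vocab of 3, dim 2
assert index_select(weights, [2, 0, 2]) == [[2.0, 2.1], [0.0, 0.1], [2.0, 2.1]]
```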
a27239f3d9
Add training for the llama2.c example (#296)
* Rework the commands and run inference by default.
* Add the training module and load the training dataset.
* Random dataset iterator.
* Proper valid-loss computation.
* Compute the evaluation loss.
* Add more substance to the training loop.
2023-08-01 17:23:07 +01:00
afb5e24a63
Remove map ownership from save.
2023-08-01 17:19:22 +02:00
89d1fd03e5
Adding new surface for safetensors (global load, global save).
2023-08-01 15:00:38 +02:00
310094310b
Modifying safetensors export to get simple load and save.
2023-08-01 15:00:38 +02:00
ad9d8fe400
Complexifying our hello world
2023-08-01 14:26:02 +02:00
6b98b66eb3
Remove the end of text tokens. (#289)
2023-07-31 20:43:57 +01:00
38ff693af0
Add a flag to save the trained weights. (#279)
2023-07-30 15:41:42 +01:00
c950a5c6b1
Cuda support for the mnist training. (#277)
* Cuda support for the mnist training.
* min/max fix + testing.
* Add the argmin/argmax tests.
* More cuda support for argmin/argmax.
* Cuda kernels for argmin and argmax.
2023-07-29 19:48:04 +01:00
16c33383eb
Improve the mnist training example. (#276)
* Improve the mnist training example.
* Add some initialization routine that can be used for nn.
* Proper initialization in the mnist example.
2023-07-29 16:28:22 +01:00
c0a8ed19eb
Support for where-cond on cuda for u8 and u32. (#274)
2023-07-29 11:48:58 +01:00