* Start adding vision-transformers. * Add self-attn. * More vision transformers. * vit-vit. * Add the actual vit model. * Add the example code for the vision transformers.