* Complete the mixformers implementation. * Tweak the attention. * Add the phi-1.5 example. * Improve the phi example. * Bugfix. * Get the phi example to work.
top_p