* optimize KV cache to reduce GPU memory usage
* revert to using candle_nn::kv_cache::KvCache with initial capacity of 512
* fixed quantized_phi3 implementation
* quantized_qwen3 implementation
* Update quantized_phi3.rs
* Update quantized_phi3.rs
* add quantized_qwen3 example
* Clippy fixes.
* Cleanup.

---------

Co-authored-by: Laurent <laurent.mazare@gmail.com>