ZipCache: Accurate and Efficient KV Cache Quantization with Salient Token Identification

Publication
arXiv preprint arXiv:2405.14256