Guanlong Wu (Southern University of Science and Technology), Taojie Wang (Southern University of Science and Technology), Yao Zhang (ByteDance Inc.), Zheng Zhang (Southern University of Science and Technology), Jianyu Niu (Southern University of Science and Technology), Ye Wu (ByteDance Inc.), Yinqian Zhang (SUSTech)

The emergence of large language models (LLMs) has enabled a wide range of applications, including code generation, chatbots, and AI agents. However, deploying these applications faces substantial challenges in terms of cost and efficiency. One notable optimization to address these challenges is semantic caching, which reuses query-response pairs across users based on semantic similarity. This mechanism has gained significant traction in both academia and industry and has been integrated into the LLM serving infrastructure of cloud providers such as Azure, AWS, and Alibaba.
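The caching mechanism described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the implementation used by any of the providers mentioned: the bag-of-words "embedding", the cosine-similarity lookup, and the 0.8 threshold are all stand-ins for a real sentence-embedding model and a tuned similarity cutoff.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a real sentence-embedding model:
    # a bag-of-words vector keyed by lowercased tokens.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SemanticCache:
    """Reuses stored LLM responses for semantically similar queries."""

    def __init__(self, threshold=0.8):  # threshold is illustrative
        self.threshold = threshold
        self.entries = []  # list of (query_embedding, response)

    def get(self, query):
        qv = embed(query)
        best = max(self.entries, key=lambda e: cosine(qv, e[0]), default=None)
        if best and cosine(qv, best[0]) >= self.threshold:
            # Cache hit: the response may have been stored by a *different*
            # user -- the cross-user reuse that poisoning attacks exploit.
            return best[1]
        return None  # cache miss: caller forwards the query to the LLM

    def put(self, query, response):
        self.entries.append((embed(query), response))

cache = SemanticCache()
cache.put("How do I reset my password?", "Visit Settings > Security > Reset.")
# A semantically similar query from another user hits the cache:
hit = cache.get("how do i reset my password")
```

Because `put` accepts any query–response pair and `get` serves it to whoever asks a similar question later, an entry crafted by one user can shape the responses seen by others, which is the attack surface studied in this paper.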

This paper is the first to show that semantic caching is vulnerable to cache poisoning attacks, in which an attacker injects crafted cache entries so that other users receive attacker-defined responses. We demonstrate the semantic cache poisoning attack in diverse scenarios and confirm its practicality on all three major public clouds. Building on the attack, we evaluate existing defenses against adversarial prompting and find that they are ineffective against semantic cache poisoning. We therefore propose a new defense mechanism that provides stronger protection than existing approaches, though complete mitigation remains challenging. Our study reveals that cache poisoning, a long-standing security concern, has re-emerged in LLM systems. While our analysis focuses on semantic caching, the underlying risks may extend to other caching mechanisms used in LLM systems.
