XiangFan Wu (Ocean University of China; QI-ANXIN Technology Research Institute), Lingyun Ying (QI-ANXIN Technology Research Institute), Guoqiang Chen (QI-ANXIN Technology Research Institute), Yacong Gu (Tsinghua University; Tsinghua University-QI-ANXIN Group JCNS), Haipeng Qu (Department of Computer Science and Technology, Ocean University of China)

Large Language Models (LLMs) are rapidly reshaping digital interactions. Their performance and efficiency are critically dependent on advanced caching mechanisms, such as prefix caching and semantic caching.
However, these mechanisms introduce a new attack surface. Unlike prior work focused on LLM poisoning attacks during the training phase, this paper presents the first comprehensive investigation into cache-related security risks that arise at LLM inference time.

We conducted a systematic study of the cache implementations in mainstream LLM serving frameworks and then identified six novel attack vectors categorized as: (1) User-oriented Fraud Attacks, which manipulate cache entries to deliver malicious content to users via prefix cache collisions and semantic fuzzy poisoning; and (2) System Integrity Attacks, which exploit cache vulnerabilities to bypass security checks, such as using block-wise or multimodal collisions to evade content moderation.
Our experiments on leading open-source frameworks validated these attack vectors and evaluated their impact and cost.
Furthermore, we proposed five multilayer defense strategies and assessed their effectiveness.
We responsibly disclosed our findings to affected vendors, including vLLM, SGLang, GPTCache, AIBrix, rtp-llm and LMDeploy. All of them have acknowledged the vulnerabilities, and notably, vLLM, GPTCache, and AIBrix have adopted our proposed mitigation methods and fixed their vulnerabilities.
Our findings underscore the importance of securing the caching infrastructure in the rapidly expanding LLM ecosystem.
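
To make the idea of a prefix cache collision concrete, the following is a minimal toy sketch, not taken from the paper and not the hashing used by any of the frameworks named above. It assumes a hypothetical block cache whose key is a deliberately weak hash (`weak_key`, one byte of SHA-256) and contrasts it with a full-digest key (`strong_key`). All names (`ToyBlockCache`, `find_colliding_block`, the token values) are illustrative assumptions.

```python
import hashlib
from typing import Callable, Dict, Hashable, Optional, Tuple

Block = Tuple[int, ...]  # one block of token IDs


def weak_key(block: Block) -> int:
    # Hypothetical weak key: only the first byte of SHA-256 (256 possible values).
    return hashlib.sha256(repr(block).encode()).digest()[0]


def strong_key(block: Block) -> bytes:
    # Full 256-bit digest; finding a collision by brute force is infeasible.
    return hashlib.sha256(repr(block).encode()).digest()


class ToyBlockCache:
    """Toy cache mapping a block key to a payload standing in for cached state."""

    def __init__(self, key_fn: Callable[[Block], Hashable]) -> None:
        self.key_fn = key_fn
        self.store: Dict[Hashable, str] = {}

    def put(self, block: Block, payload: str) -> None:
        self.store[self.key_fn(block)] = payload

    def get(self, block: Block) -> Optional[str]:
        return self.store.get(self.key_fn(block))


def find_colliding_block(target: Block, key_fn: Callable[[Block], Hashable]) -> Block:
    # Brute-force search for a different block that maps to the same key.
    wanted = key_fn(target)
    for i in range(1_000_000):
        candidate = (i, i + 1, i + 2, i + 3)
        if candidate != target and key_fn(candidate) == wanted:
            return candidate
    raise RuntimeError("no collision found in search range")


if __name__ == "__main__":
    victim_block = (101, 102, 103, 104)
    attacker_block = find_colliding_block(victim_block, weak_key)

    cache = ToyBlockCache(weak_key)
    cache.put(attacker_block, "attacker-controlled state")
    # The victim's lookup hits the attacker's entry because the keys collide.
    print(cache.get(victim_block))
```

In this sketch the cache compares only keys, never the underlying tokens, so a lookup for the victim's block returns whatever the attacker stored under the colliding key. Swapping in `strong_key` makes the same lookup miss.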
