Mengyuan Sun (Wuhan University), Yu Li (Wuhan University), Yunjie Ge (Wuhan University), Yuchen Liu (Wuhan University), Bo Du (Wuhan University), Qian Wang (Wuhan University)

Multimodal contrastive learning models like CLIP have demonstrated remarkable vision-language alignment capabilities and now serve as foundational components in many large-scale multimodal systems. However, their vulnerability to backdoor attacks poses critical security risks. Attackers can implant latent triggers that persist through downstream tasks, enabling malicious control of model behavior upon trigger presentation. Although recent defense mechanisms have achieved great success, they remain impractical due to strong assumptions about attacker knowledge or excessive clean-data requirements.

In this paper, we introduce InverTune, the first backdoor defense framework for multimodal models under minimal attacker assumptions, requiring neither prior knowledge of attack targets nor access to the poisoned dataset. Unlike existing defense methods that rely on the same dataset used in the poisoning stage, InverTune effectively identifies and removes backdoor artifacts through three key components, achieving robust protection against backdoor attacks. Specifically, (1) InverTune first exposes attack signatures through adversarial simulation, probabilistically identifying the target label by analyzing model response patterns. (2) Building on this, we develop a gradient inversion technique to reconstruct latent triggers through activation pattern analysis. (3) Finally, a clustering-guided fine-tuning strategy erases the backdoor function with only a small amount of arbitrary clean data, while preserving the original model capabilities. Experimental results show that InverTune reduces the average attack success rate (ASR) by 97.87% against state-of-the-art (SOTA) attacks while limiting clean accuracy (CA) degradation to just 3.07%. This work establishes a new paradigm for securing multimodal systems, advancing security in foundation model deployment without compromising performance.
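To give an intuition for step (2), trigger inversion can be framed as gradient ascent on the alignment between an encoded patched image and the embedding of the inferred target label. The toy sketch below illustrates this idea with a frozen linear "image encoder" standing in for CLIP's vision tower; all names (`W`, `target_emb`, `invert_trigger`) and dimensions are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
D_IMG, D_EMB = 64, 16  # toy image and embedding dimensions

# Frozen toy "image encoder" and the (unit-norm) embedding of the target
# label inferred in step (1). Both are stand-ins for the real CLIP model.
W = rng.normal(size=(D_EMB, D_IMG)) / np.sqrt(D_IMG)
target_emb = rng.normal(size=D_EMB)
target_emb /= np.linalg.norm(target_emb)

def cos_sim(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def invert_trigger(steps=300, lr=0.5):
    """Recover a trigger estimate delta by gradient ascent on
    cos(W @ (x + delta), target_emb) with respect to delta."""
    x = rng.normal(size=D_IMG)   # an arbitrary clean image
    delta = np.zeros(D_IMG)      # trigger estimate, initialized to zero
    for _ in range(steps):
        z = W @ (x + delta)
        nz = np.linalg.norm(z)
        # Gradient of cosine similarity w.r.t. z (target_emb is unit-norm),
        # then chained back through the linear encoder via W.T.
        g_z = target_emb / nz - (z @ target_emb) * z / nz**3
        delta += lr * (W.T @ g_z)
    return delta, cos_sim(W @ (x + delta), target_emb)

delta, sim = invert_trigger()
```

After a few hundred steps the patched image's embedding aligns closely with the target label's embedding, which is the property a reconstructed trigger must exhibit; the real method operates on a deep nonlinear encoder and constrained image-space patches, but the optimization loop has the same shape.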
