Zhexi Lu (Rensselaer Polytechnic Institute), Hongliang Chi (Rensselaer Polytechnic Institute), Nathalie Baracaldo (IBM Research - Almaden), Swanand Ravindra Kadhe (IBM Research - Almaden), Yuseok Jeon (Korea University), Lei Yu (Rensselaer Polytechnic Institute)

Membership inference attacks (MIAs) pose a critical privacy threat to fine-tuned large language models (LLMs), especially when models are adapted to domain-specific tasks using sensitive data. While prior black-box MIA techniques rely on confidence scores or token likelihoods, these signals are often entangled with a sample’s intrinsic properties—such as content difficulty or rarity—leading to poor generalization and low signal-to-noise ratios. In this paper, we propose ICP-MIA, a novel MIA framework grounded in the theory of training dynamics, particularly the phenomenon of diminishing returns during optimization. We introduce the Optimization Gap as a fundamental signal of membership: at convergence, member samples exhibit minimal remaining loss-reduction potential, while non-members retain significant potential for further optimization. To estimate this gap in a black-box setting, we propose In-Context Probing (ICP)—a training-free method that simulates fine-tuning-like behavior via strategically constructed input contexts. We propose two probing strategies: reference-data-based (using semantically similar public samples) and self-perturbation (via masking or generation). Experiments on three tasks and multiple LLMs show that ICP-MIA significantly outperforms prior black-box MIAs, particularly at low false positive rates. We further analyze how reference data alignment, model type, PEFT configurations, and training schedules affect attack effectiveness. Our findings establish ICP-MIA as a practical and theoretically grounded framework for auditing privacy risks in deployed LLMs.
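To make the core idea concrete, the following is a minimal, hypothetical sketch of the Optimization Gap signal as the abstract describes it. The `loglik` callable, its `prefix` parameter, the `optimization_gap` and `predict_member` names, and the threshold value are all illustrative assumptions, not the paper's actual API or method details.

```python
# Hypothetical sketch of the Optimization Gap membership signal.
# `loglik` stands in for a black-box LLM's log-likelihood query;
# all names and the threshold are illustrative assumptions.

def optimization_gap(loglik, target, context):
    """Improvement in the target's likelihood when a semantically
    similar reference context is prepended (In-Context Probing)."""
    base = loglik(target)                    # likelihood of the sample alone
    probed = loglik(target, prefix=context)  # likelihood with probing context
    # Members are near-converged on this sample: the context adds little.
    # Non-members retain optimization headroom: the context helps more.
    return probed - base

def predict_member(loglik, target, context, threshold=0.5):
    # A small gap suggests the sample was in the fine-tuning set.
    return optimization_gap(loglik, target, context) < threshold
```

A deployed attack would replace `loglik` with queries to the target model and draw `context` from semantically similar public data or from self-perturbations of the target, per the two probing strategies the abstract proposes.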
