Shanshan Han (University of California, Irvine), Wenxuan Wu (Texas A&M University), Baturalp Buyukates (University of Birmingham), Weizhao Jin (University of Southern California), Qifan Zhang (Palo Alto Networks), Yuhang Yao (Carnegie Mellon University), Salman Avestimehr (University of Southern California)
Federated Learning (FL) systems are susceptible to adversarial attacks, such as model poisoning and backdoor attacks. Existing defense mechanisms face critical limitations in real-world deployments: they rely on impractical assumptions (e.g., knowing in advance that attacks will occur) or degrade model accuracy even in benign, attack-free training. To address these challenges, we propose CustodianFL, a two-stage anomaly detection method designed specifically for FL deployments. The first stage flags suspicious client activities; the second stage, activated only when needed, further examines these candidates using the 3σ rule to identify and exclude truly malicious local models from FL training. To ensure integrity and transparency within the FL system, CustodianFL integrates zero-knowledge proofs, enabling clients to cryptographically verify the server's detection process without relying on the server's goodwill. CustodianFL operates without unrealistic assumptions and does not interfere with FL training in attack-free scenarios, bridging the gap between theoretical advances in FL security and the practical demands of real FL systems. Experimental results demonstrate that CustodianFL consistently delivers performance comparable to benign cases, identifying and eliminating malicious models with high accuracy.
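To illustrate the second-stage filtering, the sketch below applies the 3σ rule to hypothetical per-client anomaly scores (e.g., distances of local model updates from their mean). The function name and scores are illustrative assumptions, not CustodianFL's actual implementation:

```python
import statistics

def three_sigma_filter(scores):
    """Return indices of scores outside mean ± 3*stdev (the 3-sigma rule).

    In an FL setting, `scores` could be per-client anomaly scores computed
    in the first detection stage; flagged indices correspond to local
    models excluded from aggregation.
    """
    mu = statistics.mean(scores)
    sigma = statistics.stdev(scores)
    return [i for i, s in enumerate(scores) if abs(s - mu) > 3 * sigma]

# Hypothetical scores: 19 benign clients near 0.1, one outlier at 5.0.
flagged = three_sigma_filter([0.1] * 19 + [5.0])
print(flagged)  # → [19]
```

Note that with very few clients the 3σ threshold is loose (a single outlier inflates the standard deviation), which is one reason a cheap first-stage screen before the statistical test is useful.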