Ruijie Meng (National University of Singapore, Singapore), Martin Mirchev (National University of Singapore), Marcel Böhme (MPI-SP, Germany and Monash University, Australia), Abhik Roychoudhury (National University of Singapore)

How to find security flaws in a protocol implementation without a machine-readable specification of the protocol? Facing the internet, protocol implementations are particularly security-critical software systems where inputs must adhere to a specific structure and order that is often informally specified in hundreds of pages in natural language (RFC). Without some machine-readable version of that protocol, it is difficult to automatically generate valid test inputs for its implementation that follow the required structure and order. It is possible to partially alleviate this challenge using mutational fuzzing on a set of recorded message sequences as seed inputs. However, the set of available seeds is often quite limited and will hardly cover the great diversity of protocol states and input structures.

In this paper, we explore the opportunities of systematic interaction with a pre-trained large language models (LLM) which has ingested millions of pages of human-readable protocol specifications, to draw out machine-readable information about the protocol that can be used during protocol fuzzing. We use the knowledge of the LLMs about protocol message types for well-known protocols. We also checked the LLM's capability in detecting ``states" for stateful protocol implementations by generating sequences of messages and predicting response codes. Based on these observations, we have developed an LLM-guided protocol implementation fuzzing engine. Our protocol fuzzer ChatAFL constructs grammars for each message type in a protocol, and then mutates messages or predicts the next messages in a message sequence via interactions with LLMs. Experiments on a wide range of real-world protocols from ProFuzzbench show significant efficacy in state and code coverage. Our LLM-guided stateful fuzzer was compared with state-of-the-art fuzzers AFLNet and NSFuzz. ChatAFL covers 47.6% and 42.7% more state transitions, 29.6% and 25.8% more states, and 5.8% and 6.7% more code, respectively. Apart from enhanced coverage, ChatAFL discovered nine distinct and previously unknown vulnerabilities in widely-used and extensively-tested protocol implementations while AFLNet and NSFuzz only discover three and four of them, respectively.

View More Papers

An Experimental Study on Attacking Homogeneous Averaging Processes via...

Olsan Ozbay (Dept. ECE, University of Maryland), Yuntao Liu (ISR, University of Maryland), Ankur Srivastava (Dept. ECE, ISR, University of Maryland)

Read More

ReqsMiner: Automated Discovery of CDN Forwarding Request Inconsistencies and...

Linkai Zheng (Tsinghua University), Xiang Li (Tsinghua University), Chuhan Wang (Tsinghua University), Run Guo (Tsinghua University), Haixin Duan (Tsinghua University; Quancheng Laboratory), Jianjun Chen (Tsinghua University; Zhongguancun Laboratory), Chao Zhang (Tsinghua University; Zhongguancun Laboratory), Kaiwen Shen (Tsinghua University)

Read More

LiDAR Spoofing Meets the New-Gen: Capability Improvements, Broken Assumptions,...

Takami Sato (University of California, Irvine), Yuki Hayakawa (Keio University), Ryo Suzuki (Keio University), Yohsuke Shiiki (Keio University), Kentaro Yoshioka (Keio University), Qi Alfred Chen (University of California, Irvine)

Read More

Why People Still Fall for Phishing Emails: An Empirical...

Asangi Jayatilaka (Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide, School of Computing Technologies, RMIT University), Nalin Asanka Gamagedara Arachchilage (School of Computer Science, The University of Auckland), M. Ali Babar (Centre for Research on Engineering Software Technologies (CREST), The University of Adelaide)

Read More