Marius Vangeli (KTH Royal Institute of Technology, Sweden), Joel Brynielsson (KTH Royal Institute of Technology, Sweden and FOI Swedish Defence Research Agency, Sweden), Mika Cohen (KTH Royal Institute of Technology, Sweden and FOI Swedish Defence Research Agency, Sweden), Farzad Kamrani (FOI Swedish Defence Research Agency, Sweden)

While large language model (LLM)-driven penetration testing is rapidly improving, autonomous agents still struggle with longer-duration multi-stage exploits. As agents perform reconnaissance, attempt exploits, and pivot through systems, the token context window fills up with exploration and failed attempts, degrading decision quality. We introduce context handoff for autonomous penetration testing (CHAP), a context-relay system for LLM-driven agents. CHAP enables agents to sustain long-running penetration tests by transferring accumulated knowledge as compact protocols to fresh agent instances.

We evaluate CHAP on an extended version of the AutoPen- Bench benchmark, targeting 11 real-world vulnerabilities. CHAP improved per-run success from 27.3% to 36.4% while reducing token expenditure by 32.4% compared to a baseline agent. We release our full implementation, benchmark enhancements, and a dataset of command logs with LLM reasoning traces.

View More Papers

QNBAD: Quantum Noise-induced Backdoor Attacks against Zero Noise Extrapolation

Cheng Chu (Indiana University Bloomington), Qian Lou (University of Central Florida), Fan Chen (Indiana University Bloomington), Lei Jiang (Indiana University Bloomington)

Read More

Token Time Bomb: Evaluating JWT Implementations for Vulnerability Discovery

Jingcheng Yang (Tsinghua University), Enze Wang (National University of Defense Technology & Tsinghua University), Jianjun Chen (Tsinghua University), Qi Wang (Tsinghua University), Yuheng Zhang (Tsinghua University), Haixin Duan (Quancheng Lab,Tsinghua University), Wei Xie (College of Computer, National University of Defense Technology), Baosheng Wang (National University of Defense Technology)

Read More

From Matrix to Metrics: Introducing and Applying a Configuration...

Tobias Länge (SECUSO, Karlsruhe Institute of Technology, Karlsruhe, Germany), Fabian Lucas Ballreich (SECUSO, Karlsruhe Institute of Technology, Karlsruhe, Germany), Anne Hennig (SECUSO, Karlsruhe Institute of Technology, Karlsruhe, Germany), Peter Mayer (SECUSO, Karlsruhe Institute of Technology, Karlsruhe, Germany), Melanie Volkamer (SECUSO, Karlsruhe Institute of Technology, Karlsruhe, Germany)

Read More