Zhiping Zhou (Tianjin University), Xiaohong Li (Tianjin University), Ruitao Feng (Southern Cross University), Yao Zhang (Tianjin University), Yuekang Li (University of New South Wales), Wenbu Feng (Tianjin University), Yunqian Wang (Tianjin University), Yuqing Li (Tianjin University)

Decompilation converts machine code into a human-readable form, enabling analysis and debugging when source code is unavailable. However, the process suffers from fidelity issues that can significantly impair the readability and accuracy of the decompiled output. Existing approaches, such as variable renaming and structural simplification, address these issues only partially and typically fail to provide adequate detection and correction, especially in complex but practical closed-source binary scenarios.

To address this, we introduce FidelityGPT, a novel framework that improves the accuracy and readability of decompiled code by systematically detecting and correcting discrepancies between decompiled code and its original source. FidelityGPT defines distortion prompt templates tailored to closed-source environments and incorporates Retrieval-Augmented Generation (RAG) with a dynamic semantic intensity algorithm. The algorithm identifies distorted lines based on their semantic intensity and retrieves similar code from a database. Additionally, a variable dependency algorithm overcomes the limitations of long-context inputs by analyzing redundant variables through their dependencies and integrating the redundant variable names into the prompt context. Together, these techniques establish FidelityGPT as the first framework capable of effectively addressing decompilation distortion issues in LLM-based decompilation optimization.
We evaluated FidelityGPT on 620 function pairs from a binary similarity benchmark, achieving an average detection accuracy of 89% and a precision of 83%. FidelityGPT also outperformed the current state-of-the-art model, DeGPT, which achieved an average Fix Rate (FR) of 83% and an average Corrected Fix Rate (CFR) of 37%. With an average FR of 94% and an average CFR of 64%, FidelityGPT significantly improves both accuracy and readability, underscoring its effectiveness in enhancing decompilation and its potential to drive advancements in reverse engineering.
