Derrick McKee (Purdue University), Nathan Burow (MIT Lincoln Laboratory), Mathias Payer (EPFL)

Reverse engineering unknown binaries is a difficult, resource-intensive process because compilation loses information and compiler optimizations introduce significant binary diversity. Existing binary similarity approaches either do not scale or are inaccurate. In this paper, we introduce IOVec Function Identification (IOVFI), which assesses similarity based on program state transformations, a property that compilers largely guarantee even across compilation environments and architectures. IOVFI executes functions with predetermined initial program states, measures the resulting program state changes, and uses the sets of input and output state vectors as unique semantic fingerprints. Because IOVFI relies on state vectors rather than code measurements, it withstands broad changes in the compilers and optimizations used to generate a binary.
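The fingerprinting idea can be illustrated with a toy sketch. This is not the IOVFI implementation: it treats ordinary Python callables as stand-ins for binary functions, models a "program state" as just the argument tuple and return value, and compares fingerprints with a Jaccard score. The names `INPUT_STATES`, `fingerprint`, and `similarity` are hypothetical and chosen only for this example.

```python
from typing import Callable, FrozenSet, Hashable, Tuple

# Assumption: a fixed, predetermined set of input states shared by every
# function under test. Here a "state" is only a tuple of integer arguments.
INPUT_STATES = [(0, 0), (1, 2), (7, 3), (-4, 5), (10, 10)]

Fingerprint = FrozenSet[Tuple[Tuple[int, ...], Hashable]]


def fingerprint(func: Callable[..., Hashable]) -> Fingerprint:
    """Run func on each fixed input state and record (input, output) pairs."""
    pairs = set()
    for state in INPUT_STATES:
        try:
            out = func(*state)
        except Exception as exc:
            # A fault is recorded as an observable outcome too.
            out = ("fault", type(exc).__name__)
        pairs.add((state, out))
    return frozenset(pairs)


def similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Jaccard similarity of two fingerprints (an assumed metric)."""
    return len(a & b) / len(a | b)


# Two syntactically different implementations of the same semantics,
# standing in for one function built by two different compilers.
def max_branch(x: int, y: int) -> int:
    return x if x > y else y


def max_arith(x: int, y: int) -> int:
    return (x + y + abs(x - y)) // 2


# A semantically different function.
def sub(x: int, y: int) -> int:
    return x - y


same = similarity(fingerprint(max_branch), fingerprint(max_arith))
different = similarity(fingerprint(max_branch), fingerprint(sub))
print(same, different)
```

The two `max` variants share no code structure yet produce identical fingerprints, while `sub` agrees with them on only one input state. A real system would also have to capture memory and register effects rather than return values alone.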

Evaluating our IOVFI implementation as a semantic function identifier for coreutils-8.32, we achieve an average F-Score of 0.773, indicating high precision and recall. When identifying functions generated from differing compilation environments, IOVFI achieves a 100% accuracy improvement over BinDiff 6 and outperforms asm2vec in cross-compilation environment accuracy. Compared to the dynamic frameworks BLEX and IMF-SIM, IOVFI is 25%–53% more accurate.
