Yue Duan (Cornell University), Xuezixiang Li (UC Riverside), Jinghan Wang (UC Riverside), Heng Yin (UC Riverside)

Binary diffing analysis quantitatively measures the differences between two given binaries and produces fine-grained basic block matching. It has been widely used to enable different kinds of critical security analysis. However, all existing program analysis and machine learning based techniques suffer from low accuracy, poor scalability, coarse granularity, or require extensive labeled training data to function. In this paper, we propose an unsupervised program-wide code representation learning technique to solve the problem. We rely on both the code semantic information and the program-wide control flow information to generate block embeddings. Furthermore, we propose a k-hop greedy matching algorithm to find the optimal diffing results using the generated block embeddings. We implement a prototype called DeepBinDiff and evaluate its effectiveness and efficiency with large number of binaries. The results show that our tool could outperform the state-of-the-art binary diffing tools by a large margin for both cross-version and cross-optimization level diffing. A case study for OpenSSL using real-world vulnerabilities further demonstrates the usefulness of our system.

View More Papers

NoJITsu: Locking Down JavaScript Engines

Taemin Park (University of California, Irvine), Karel Dhondt (imec-DistriNet, KU Leuven), David Gens (University of California, Irvine), Yeoul Na (University of California, Irvine), Stijn Volckaert (imec-DistriNet, KU Leuven), Michael Franz (University of California, Irvine, USA)

Read More

Learning-based Practical Smartphone Eavesdropping with Built-in Accelerometer

Zhongjie Ba (Zhejiang University and McGill University), Tianhang Zheng (University of Toronto), Xinyu Zhang (Zhejiang University), Zhan Qin (Zhejiang University), Baochun Li (University of Toronto), Xue Liu (McGill University), Kui Ren (Zhejiang University)

Read More

SymTCP: Eluding Stateful Deep Packet Inspection with Automated Discrepancy...

Zhongjie Wang (University of California, Riverside), Shitong Zhu (University of California, Riverside), Yue Cao (University of California, Riverside), Zhiyun Qian (University of California, Riverside), Chengyu Song (University of California, Riverside), Srikanth V. Krishnamurthy (University of California, Riverside), Kevin S. Chan (U.S. Army Research Lab), Tracy D. Braun (U.S. Army Research Lab)

Read More

When Malware is Packin' Heat; Limits of Machine Learning...

Hojjat Aghakhani (University of California, Santa Barbara), Fabio Gritti (University of California, Santa Barbara), Francesco Mecca (Università degli Studi di Torino), Martina Lindorfer (TU Wien), Stefano Ortolani (Lastline Inc.), Davide Balzarotti (Eurecom), Giovanni Vigna (University of California, Santa Barbara), Christopher Kruegel (University of California, Santa Barbara)

Read More