
N.E. Rosenblum, X. Zhu, B.P. Miller, and K. Hunt (2008)

Learning to Analyze Binary Computer Code

In: 23rd AAAI Conference on Artificial Intelligence (AAAI 2008).

We present a novel application of structured classification: identifying function entry points (FEPs, the starting byte of each function) in program binaries. Such identification is the crucial first step in analyzing much malicious, commercial, and legacy software, which lacks the full symbol information that specifies FEPs. Existing pattern-matching FEP detection techniques are insufficient because compiler and link-time optimizations introduce variable instruction sequences. We formulate the FEP identification problem as structured classification using Conditional Random Fields (CRFs). Our CRFs incorporate both idiom features, which represent the sequence of instructions surrounding FEPs, and control flow structure features, which represent the interaction among FEPs. These features allow us to jointly label all FEPs in the binary. We perform feature selection and present an approximate inference method for massive program binaries. We evaluate our models on a large set of real-world test binaries, showing that our models dramatically outperform two existing, standard disassemblers.
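As a rough illustration of what an idiom feature might look like, the sketch below matches short instruction patterns (with a wildcard) at and immediately before a candidate offset and emits one binary feature per pattern. The abstract does not specify the paper's idiom definitions or instruction normalization, so the instruction strings, the `*` wildcard, and the helper names `matches_idiom` and `idiom_features` are all assumptions made for this example. It covers only the per-candidate features; the control flow structure features and the CRF inference are not shown.

```python
from typing import List, Sequence

WILDCARD = "*"  # hypothetical marker: matches any single instruction


def matches_idiom(instrs: Sequence[str], start: int, idiom: Sequence[str]) -> bool:
    """Return True if `idiom` matches `instrs` beginning at index `start`."""
    if start < 0 or start + len(idiom) > len(instrs):
        return False
    return all(p == WILDCARD or p == instrs[start + k] for k, p in enumerate(idiom))


def idiom_features(
    instrs: Sequence[str],
    pos: int,
    entry_idioms: Sequence[Sequence[str]],
    prefix_idioms: Sequence[Sequence[str]],
) -> List[int]:
    """Binary features for the candidate entry point at instruction index `pos`.

    Entry idioms are matched starting at `pos`; prefix idioms are matched
    against the instructions that end just before `pos`.
    """
    feats = [int(matches_idiom(instrs, pos, idiom)) for idiom in entry_idioms]
    feats += [int(matches_idiom(instrs, pos - len(idiom), idiom)) for idiom in prefix_idioms]
    return feats


# Toy, already-normalized instruction stream (an assumption, not real disassembly).
instrs = ["ret", "nop", "push ebp", "mov ebp,esp", "sub esp", "call", "leave", "ret"]
entry_idioms = [("push ebp", "mov ebp,esp"), ("push ebp", WILDCARD, "sub esp")]
prefix_idioms = [("ret",), ("ret", "nop")]

features = {pos: idiom_features(instrs, pos, entry_idioms, prefix_idioms)
            for pos in range(len(instrs))}
```

In a structured model, vectors like these would serve as per-candidate evidence, while the control flow features would tie the labels of different candidates together so that they can be labeled jointly.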
by Jennifer Harris last modified 2009-04-21 10:37
