Advanced Reverse Engineering

PhD Qualifying Examination


Title: "Advanced Reverse Engineering"

by

Mr. Zhibo LIU


Abstract:

Software reverse engineering is the practice of understanding the intentions of
software developers by analyzing and extracting design and implementation
information from compiled software. It works as an automatic process that
converts otherwise incomprehensible binary files into a human-readable
high-level language. As the cornerstone of various cybersecurity tasks, software
reverse engineering enables malware analysis, security hardening of
off-the-shelf software, cross-architecture code reuse, and vulnerability
detection, all cases in which the source code is usually unavailable.

Because of its importance, software reverse engineering has evolved over the
last few decades into a complete and systematic process. Typically, software
reverse engineering on the x86 platform consists of three stages: disassembly,
lifting, and decompilation. At the disassembly stage, the disassembler
translates the hex values in an executable into assembly instructions. At the
lifting stage, the lifter lifts the assembly instructions into one of a variety
of Intermediate Representations (IRs). Finally, at the decompilation stage, the
decompiler converts the IR into a comprehensible high-level language.

Accurate software reverse engineering is notoriously hard because the compiler
discards unnecessary information when converting source code to machine code.
Many studies now focus on recovering more information from binary executables.
However, a perfect solution still does not exist. Therefore, to help understand
the challenges and the state-of-the-art (SOTA) methodology of software reverse
engineering techniques, we conduct a thorough survey of the existing research
and a detailed comparison of the different methods at each stage.

Following the procedure of software reverse engineering, we first introduce
the difficulties in its three stages: disassembly, lifting, and decompilation.
Afterward, we discuss recent works on disassembly and reassembleable
disassembly techniques, along with their pros and cons. Then we compare
existing lifting techniques and divide them into two categories:
emulation-style and succinct-style. We also present advanced decompilation
techniques for converting IR to pseudo-C code. Finally, we discuss the emerging
fields and application scenarios of decompilation technology. Based on this
survey of the existing literature, we suggest several potential future research
directions that could drive the development of software reverse engineering
techniques. We believe our survey will benefit the entire reverse engineering
community.


Date:			Monday, 6 December 2021

Time:			1:00pm - 3:00pm

Venue:			Room 3494
 			(lifts 25/26)

Committee Members:	Dr. Shuai Wang (Supervisor)
 			Dr. Charles Zhang (Chairperson)
 			Dr. Dimitris Papadopoulos
 			Dr. Lionel Parreaux


**** ALL are Welcome ****