Practical Symbolic Static Analysis for Effective Bug Detection

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Practical Symbolic Static Analysis for Effective Bug Detection"

by

Mr. Yiyuan GUO


Abstract:

Static analysis is a powerful approach that automatically detects software 
bugs by analyzing the program behavior without requiring execution. Many 
static analysis techniques leverage symbolic methods to achieve greater 
precision, where they represent unknown values using symbolic variables and 
describe program states with symbolic formulas. When applying the symbolic 
method to detect bugs in realistic, million-line codebases, we encounter two 
challenges. First, the inherent limitations of static analysis often lead to 
the introduction of many under-constrained symbolic variables, causing 
significant imprecision. Second, the scalability of symbolic methods is 
influenced by their memory model, especially when intricate interactions 
between the memory and numeric values exist or when the analysis is 
path-sensitive. This thesis presents contributions to improve the precision 
and scalability of the symbolic method for enabling practical static bug 
detection at an industrial scale.

The first work focuses on the static detection of divide-by-zero bugs, a 
client where the presence of under-constrained variables severely degrades 
the precision of symbolic methods. Based on an extensive empirical study, we 
propose to gather extra evidence about the under-constrained variables by 
inferring the programmers' beliefs, which helps the static analyzer to 
significantly improve its precision by reporting bugs that are more likely 
to be true.

The second work investigates the problem of static buffer overflow 
detection, which involves mutually dependent program properties of heap and 
numeric value. Existing methods either sacrifice precision or efficiency in 
addressing the mutual dependency. To address this dilemma, we propose a 
novel memory model based on heap disjointness and design a summary-based 
analysis algorithm, thereby enabling the precise and scalable detection of 
buffer overflow bugs.

The third work concentrates on the scalability problem faced by the 
path-sensitive memory model in data dependence analysis. We observe that a 
key performance bottleneck of the analysis is to infer the condition under 
which storing to a memory location may overwrite its old containing value. 
Our solution improves the efficiency by decomposing the analysis efforts 
into stages: We handle most of the memory updates efficiently based on a 
must-kill relation among the heap stores, while reserving the expensive 
path-sensitive analysis for the rest.


Date:                   Monday, 26 May 2025

Time:                   3:00pm - 5:00pm

Venue:                  Room 3494
                        Lifts 25/26

Chairman:               Prof. James Yeong Liang THONG (ISOM)

Committee Members:      Prof. Charles ZHANG (Supervisor)
                        Prof. Qiong LUO
                        Dr. Shuai WANG
                        Dr. Shenghui SONG (ECE)
                        Dr. Dongmei ZHANG (MSRA)