Solidifying and Scaling SMT-based Program Analysis

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Solidifying and Scaling SMT-based Program Analysis"

By

Mr. Peisen YAO


Abstract

Software bugs such as program crashes and security vulnerabilities are 
still a widespread plague. Program analysis is a technique for inferring a 
program's possible behavior and detecting common software bugs. Over the 
past two decades, Satisfiability Modulo Theories (SMT) solvers have served 
as the core engine of many program analysis techniques, such as symbolic 
execution. Although there have been many success stories of SMT-based 
program analysis, we still observe the difficulty in applying them at an 
industrial scale. Reliability and scalability, among other reasons, are 
two key obstacles to adoption. To improve the reliability and scalability 
of SMT-based program analysis, this thesis makes two major contributions 
by exploring the use of abstraction.

The first work advances the automated testing of SMT solvers, which are 
crucial for the reliability of SMT-based program analysis. Due to the test 
oracle problem, existing approaches to testing SMT solvers are either too 
costly or find difficulties generalizing to different solvers and 
theories. To complement existing approaches and overcome their weaknesses, 
we introduce two new general techniques: a general, logical 
approximation-based metamorphic relation and skeletal approximation 
enumeration, a lightweight methodology for approximating formulas. Our 
approach had found 72 confirmed bugs in Z3 and CVC4, two widely-used and 
comprehensively tested SMT solvers.

The second contribution is to make symbolic abstraction, a specific class 
of SMT-based program analysis, more scalable. Symbolic abstraction is an 
important point in the space of abstract interpretation, as it allows for 
automatically synthesizing the most precise abstract transformers. 
However, current techniques have difficulty delivering on their practical 
strengths due to performance issues. We introduce two algorithms that 
apply to the bit-vector interval domain and a certain kind of polyhedral 
domain, respectively. Our algorithms substantially speed up existing 
techniques and bring significant precision advantages to two program 
analysis clients. Our work presents strong evidence that symbolic 
abstraction of numeric domains can be efficient and practical for large 
and realistic programs.


Date:			Friday, 1 April 2022

Time:			9:00am - 11:00am

Zoom Meeting: 		https://hkust.zoom.us/j/9933249158

Chairperson:		Prof. Shaojie SHEN (ECE)

Committee Members:	Prof. Charles ZHANG (Supervisor)
 			Prof. Shing Chi CHEUNG
 			Prof. Fangzhen LIN
 			Prof. Hongce ZHANG (Microelectronics)
 			Prof. Marinov DARKO (Univ of Illinois)


**** ALL are Welcome ****