Scalable Value-Flow Analysis for Industrial Software

PhD Thesis Proposal Defence


Title: "Scalable Value-Flow Analysis for Industrial Software"

by

Mr. Yongchao WANG


Abstract:

Industrial software systems, such as TikTok and Alipay, serve over a billion
users and are characterized by immense complexity. Both their client-side
applications and backend services form vast ecosystems of numerous
interconnected components. Concurrently, the growing emphasis on privacy,
security, and reliability has elevated static analysis to a critical role in
the application vetting process.

Value-flow analysis is a static analysis technique that models program
behavior without execution. Over several decades, the academic community
has achieved great success applying it to domains such as program
optimization, bug detection, and program synthesis. However, applying
value-flow analysis directly to industrial-scale software is challenging
for two primary reasons.

First, a fundamental limitation of value-flow analysis is the computational
cost of instantiating function summaries. This process can become
excessively time- and memory-intensive, rendering the analysis inefficient
for real-world software. Second, Objective-C, the primary language for iOS
development, complicates static analysis with its dynamic features,
particularly message dispatch and generic types. These features obscure
call sites and field accesses, whose resolution is a prerequisite for
static analysis. This thesis presents contributions that improve the
scalability and precision of value-flow analysis, enabling its practical
application to industrial systems.

Our first contribution focuses on boosting the performance of value-flow
analysis. We observed that the primary bottleneck is redundant computation,
where summaries are blindly generated for functions regardless of their
relevance to the specific properties being checked. To address this, we
present the first approach to effectively identify and eliminate these
redundant summaries. This technique reduces the volume of collected
information from callee functions without compromising the soundness or
precision of the analysis. Our resulting tool has been deployed to analyze
thousands of applications at Ant Group, a major FinTech company whose
systems handle billions of requests per day.

Our second contribution equips value-flow analysis with the ability to
handle the dynamic features of Objective-C. We introduce OCGBOOT, the first
call graph construction algorithm for Objective-C that systematically
leverages two key language design aspects: accessor methods and the
prohibition of method overloading. This enables OCGBOOT to effectively
infer type information for generic objects and establish initial field
accesses and calling relationships. OCGBOOT has been deployed in production
at both ByteDance and Ant Group, where it has enabled the discovery of
thousands of privacy violations from misused private APIs.


Date:                   Thursday, 26 September 2025

Time:                   3:00pm - 5:00pm

Venue:                  Room 2611
                        Lifts 31/32

Committee Members:      Prof. Charles Zhang (Supervisor)
                        Dr. Jiasi Shen (Chairperson)
                        Dr. Lionel Parreaux