More about HKUST
Precise and Scalable Static Bug Finding for Industrial-Sized Code
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Precise and Scalable Static Bug Finding for Industrial-Sized Code" By Mr. Qingkai SHI Abstract: Software bugs cost developers and software companies a great deal of time and money. Although previous work has reported many success stories for using static bug-finding tools, it is still difficult to find bugs hidden behind sophisticated pointer operations, deep calling contexts, and complex path conditions with a low false-positive rate, while sieving through millions of lines of code in just a few hours. In this thesis, we present our novel designs of sparse value-flow analysis to tackle a wide range of software bugs caused by improper value flows. The proposed approach has been commercialized and deployed in many of the global 500 companies. It also has reported hundreds of real bugs for open-source software systems. The first problem addressed in the thesis is referred to as the pointer trap: a precise points-to analysis limits the scalability of sparse value-flow analysis and an imprecise one seriously undermines its precision. To solve the problem, we present Pinpoint, a holistic approach that decomposes the cost of high-precision points-to analysis by precisely discovering local data dependence and delaying the expensive inter-procedural analysis through memorization. Such memorization enables the on-demand slicing and, thus, improves the scalability with high precision. Experiments show that Pinpoint can check millions of lines of code in less than five hours with a false positive rate lower than 30%. The second problem addressed in the thesis is known as the extensional scalability problem, which happens when we simultaneously check many value-flow properties with high precision. A major factor to this deficiency is that the core static analysis engine is oblivious of the mutual synergy among the properties being checked, thus inevitably losing many optimization opportunities. Our work is to leverage the inter-property awareness and to capture redundancies and inconsistencies when many properties are considered together. The evaluation results demonstrate that the approach, Catapult, is more than 8x faster than Pinpoint but consumes only 1/7 of the memory when checking twenty value-flow properties together. The third problem addressed in the thesis is how to improve the parallelism of static bug finding over the conventional parallel designs. Conventionally, bottom-up program analysis has been easy to parallelize because functions without caller-callee relations can be analyzed independently. However, functions with caller-callee relations have to be analyzed sequentially because the analysis of a function depends on the analysis results of its callees. We present Cheetah, a framework of bottom-up analysis, in which the analysis task of a function is partitioned into multiple sub-tasks. These sub-tasks are pipelined and run in parallel, even though the caller-callee relations exist. The evaluation results of Cheetah demonstrate significant speedup over a conventional parallel design. Date: Thursday, 23 April 2020 Time: 10:00am - 12:00noon Zoom Meeting: https://hkust.zoom.us/j/894501379 Chairman: Prof. Qinglu ZENG (OCES) Committee Members: Prof. Charles ZHANG (Supervisor) Prof. Shing-Chi CHEUNG Prof. Shuai WANG Prof. Weichuan YU (ECE) Prof. Xiangyu ZHANG (Purdue University) **** ALL are Welcome ****