More about HKUST
Efficient Algorithms for Context Sensitive Points-to Analysis
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Efficient Algorithms for Context Sensitive Points-to Analysis" By Mr. Xiao XIAO Abstract Context sensitive points-to analysis, while significantly benefiting many static analysis techniques, is known to be difficult to scale to large programs. To make context sensitive points-to analysis practical to analyze contemporary software, which is commonly going up to millions lines of code, this dissertation explores the use of encoding techniques to speedup the computation and querying of points-to information. The first part of this dissertation introduces a novel technique, geometric encoding, to effectively capture the redundancy in representing a large number of calling contexts. Geometric encoding is capable of evaluating contexts of points-to constraints in the compressed form, but incurring much less space and time consumption compared to other compressing techniques such as BDD. Based on geometric encoding, an efficient context sensitive points-to engine GeomPTA is designed and implemented. GeomPTA can analyze large Java programs with JDK 1.6 library 7.1x – 81.9x faster than the worklist based 1-object-sensitive analysis in Paddle, and meanwhile its precision is comparable or better than 1-object-sensitive analysis. Our implementation of GeomPTA is now part of the official distribution of Soot, which is a widely used framework for analyzing Java programs. Since GeomPTA is fully context sensitive, GeomPTA can answer k-CFA points-to queries for any constant k. This dissertation also describes a novel contexts-go-by query, where the specified call edge is not restricted to the last k call edges ascending from the enclosing function of the querying pointer. This new query can effectively identify 92.4% redundant instrumentations for a call trace profiling tool and reduce its runtime overhead by 7.2%. Both the k-CFA and contexts-go-by queries can be efficiently answered with the help of geometric encoding. Although GeomPTA is much faster than other points-to algorithms, analyzing large programs still needs tens of minutes and sometimes the memory usage exceeds the capacity of a commodity machine. Moreover, some queries, especially those require aliasing information, cannot be answered efficiently merely with the points-to information. Therefore, the second part of this dissertation describes Pestrie, a persistent technique for compressing and reusing points-to and aliasing information. Also, the persistent information is structured for efficiently answering queries. Empirically, Pestrie produces 10.5x – 17.5x smaller persistent files and is 2.9x – 123.6x faster than points-to results intersection based approach to answer aliasing queries. Date: Monday, 4 January 2016 Time: 2:00p.m. - 4:00p.m. Venue: Room 3584 Lifts 27/28 Chairman: Prof. Matthew McKay (ECE) Committee Members: Prof. Charles Zhang (Supervisor) Prof. Shing-Chi Cheung Prof. Fangzhen Lin Prof. Jiang Xu (ECE) Prof. Thomas Reps (Comp. Sci., U. of Wisconsin) **** ALL are Welcome ****