More about HKUST
REPRODUCING SHARED-MEMORY CONCURRENCY BUGS FOR BUG DIAGNOSTICS
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "REPRODUCING SHARED-MEMORY CONCURRENCY BUGS FOR BUG DIAGNOSTICS" By Mr. Jinguo ZHOU Abstract Concurrency bug is a major stumbling block to writing multi-threaded programs. The capability of reproducing a concurrency bug is key to comprehending and finally fixing the bug. Unlike sequential programs, concurrent programs may behave differently even given the same input, making concurrency bugs much harder to be reproduced than bugs caused within a single thread. This thesis targets on the reproduction of shared-memory concurrency bugs, together with new techniques for the bug diagnostics given the capability of reproducing the bug. First, we contribute Stride, a deterministic replay technique for concurrent programs, that improves the efficiency of the recording phase, which is key to the practicability of replay techniques. Stride records the bounded read-write linkage instead of the exact linkage, which eliminates synchronization requirements on read operations during the recording phase and is still able to synthesize a legal execution in polynomial time. Comparing to the previous state-of-the-art approaches of deterministic replay, Stride is on average 2.5 times faster. Second, we handle concurrency bug reproduction under two special scenarios with special requirements. One is the bug reproduction for relaxed memory model(RMM) specific behaviors, the challenge of which is the Heisenberg effects caused by the code instrumentation incurred by bug reproduction techniques (Some RMM specific behaviors cannot occur in the instrumented program), which leads to a poor coverage. We contribute SC-Transformation, the first technique that enables safe instrumentation under RMMs with the provable guarantee of full coverage, i.e., instrumentations on the program do not prevent RMM-specific behaviors. Therefore, we can apply all kinds of instrumentations to deterministically reproduce RMM-specific behaviors. Experiments show that, supported by SC-Transformation, we successfully recorded all of the seventy-one RMM-specific bug traces we tested, none of which could be captured atop naive program instrumentation. The second special scenario is the bug reproduction on the production runs on I/O intensive programs, the challenge of which is the long running time and the large input size. We contribute POINT, which departs from the conventional philosophy of reproducing a similar execution by going a new direction of reproducing only the interleaving windows occurring among a few lines of code. Being more focused on the source of concurrency bugs, POINT requires no duplication of the user inputs and supports programs running for months. Finally, we have developed a method that deterministically pinpoints the root cause of concurrency bugs given the capability of reproducing the bug, which is achieved by deterministically isolating the effects of the context-switches within the buggy execution. Date: Thursday, 17 March 2016 Time: 2:00pm – 4:00pm Venue: Room 4504 Lifts 25/26 Chairman: Prof. Zongjin Li (CIVL) Committee Members: Prof. Charles Zhang (Supervisor) Prof. Shing-Chi Cheung Prof. Fangzhen Lin Prof. Maosheng Xiong (MATH) Prof. Grigore Rosu (U of Illinois At Urbana-Champaign) **** ALL are Welcome ****