Automatic Localization of Code Omission Faults: a Context Pattern based Approach
PhD Thesis Proposal Defence

Title: "Automatic Localization of Code Omission Faults: a Context Pattern based Approach"

by

Mr. Xinming Wang

Abstract:

Debugging is a tedious and expensive activity in software maintenance. To ease debugging, researchers have proposed various dynamic analysis techniques that help developers locate program defects based on passed and failed test runs. In this proposal, we study a prevalent type of defect called code omission faults, which involve missing code and cannot be ascribed to any entity that exists in the program. Code omission faults pose two major challenges to existing dynamic analysis techniques. First, missing code offers no execution behavior to observe directly. Second, even when the place of omission can be accurately pinpointed, it is still difficult for developers to conjecture the missing code.

To address these challenges, we conducted an empirical study to characterize real-world omission faults. Among 4966 non-trivial bug fixes extracted from a large open-source project, GNU/GCC, we found that 57% correspond to omission faults, which can be further categorized into 11 sub-types. We discovered that some of these sub-types, such as missing-assignment and missing-return, often induce certain dynamic control-flow and data-flow patterns when they trigger program failures. We express these patterns in an event flow language and show that, by matching them against program executions, it is possible to accurately locate these sub-types of omission faults and infer part of their missing code.

The remaining sub-types of omission faults, such as missing-condition and missing-branch, are more complex and induce patterns that vary with the missing code, which is mostly unknown at the time of debugging. For these complex omission faults, we empirically observed that the missing code is likely to appear elsewhere in the program in a similar or even identical form.
Inspired by this observation, we propose an approach that systematically removes different parts of the program and infers the control-flow and data-flow patterns associated with these pieces of artificially removed code. We show that these patterns are useful for localizing complex omission faults. We evaluate our approach using real-world omission faults and mutation analysis on five real-world programs: gcc, space, grep, and bc.

Date: Wednesday, 12 May 2010

Time: 10:00am - 12:00 noon

Venue: Room 4480, lifts 25/26

Committee Members:
Dr. Shing-Chi Cheung (Supervisor)
Dr. Sunghun Kim (Chairperson)
Dr. Lin Gu
Dr. Charles Zhang

**** ALL are Welcome ****