The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Semantic-Aware Android Crash Reproduction"

By

Miss Maryam Alsadat MASOUDIAN TARGHI


Abstract:

Reproducing reported crashes is a fundamental yet challenging task in Android
application (app) maintenance, due to the semantic gap between low-quality
crash reports and the complex, event-driven behaviour of modern apps. This
thesis presents a three-layered framework for identifying the inputs
necessary to reproduce a crash, recovering the missing semantic context
across the user-interaction, device-environment, and library layers.

First, we focus on the User Interface (UI) layer to improve crash
reproduction time and effectiveness. We design Mole, a directed fuzzer that
uses Attribute-Sensitive Reachability Analysis (ASRA) to statically model the
dependencies between UI widgets and their visual states. Guided by
instrumentation of the app code, Mole identifies the UI elements involved in
a crash and prunes irrelevant event sequences, significantly reducing
reproduction time compared to state-of-the-art fuzzers (Ape, Stoat, and
FastBot2). Our evaluation shows that this approach reproduces over 70% of
the crashes in the Themis benchmark. Second, we detect the effect of Android
device configuration settings on crashes. We
propose Shark, a neuro-symbolic approach that identifies crash-triggering
device settings. By combining static program slicing with LLM-based
reasoning, Shark infers the necessary Steps to Configure (S2C) with high
precision, then leverages symbolic synthesis to validate these configurations
on the Android device. An evaluation of 135 reported crashes from GitHub
open-sourced Android app repositories shows that Shark achieves 70% recall
and 90% precision, requiring only 11.4 minutes per crash on average.

Finally, we introduce DAInfer+ to resolve data-flow opacity at API boundaries
during static analysis. Since directed-fuzzing solutions rely on static
analysis to infer reachable paths, they depend on the soundness of that
analysis. DAInfer+ automates the retrieval of API dataflow and aliasing
specifications from natural-language documentation using sentence-embedding
models, outperforming generative approaches. With embedding models, the
recall and precision for retrieving dataflow API specifications exceed 82%
and 85%, respectively, whereas programming-trained LLMs achieve only 75%
recall, albeit with a higher precision of 94%.

Furthermore, DAInfer+ achieves 88% recall and 79% precision in inferring
alias specifications using embedding models, while maintaining high
efficiency and returning results in only a few seconds.

Together, these contributions lay the foundation for an autonomous framework
that, given only a simple stack trace, recovers the semantic context of a
crash, including the necessary inputs from the UI and the device settings
that affect it.


Date:                   Tuesday, 21 April 2026

Time:                   4:00pm - 6:00pm

Venue:                  Room 2132C
                        Lift 22

Chairman:               Prof. Dennis Zhiming TAY (HUMA)

Committee Members:      Prof. Charles ZHANG (Supervisor)
                        Prof. Shing-Chi CHEUNG
                        Dr. Dimitris PAPADOPOULOS
                        Prof. Allen HUANG (ACCT)
                        Dr. Yuqun ZHANG (SUSTech)