On End-User Error Handling in Interactive Queries and Tables
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "On End-User Error Handling in Interactive Queries and Tables"
By
Mr. Qixu CHEN
Abstract:
Many applications designed for end-users rely on human-generated inputs, but
their performance can degrade significantly due to the inherent unreliability
of human operations, leading to undesirable outcomes. For instance, in an
end-user decision-making process where the user needs to find the most
interesting tuple in a large dataset, an interactive query elicits the
user’s preference through a series of questions, each asking the user to
compare two tuples and choose the preferred one. The system then
recommends tuples based on the learned
preference. However, even a single erroneous response from the user can
mislead the learning process, resulting in sub-optimal recommendations.
Similarly, common end-user data processing tools such as Microsoft Excel
take human-generated relational tables as input for analysis, and errors in
these tables can degrade model performance and lead to flawed analysis.
Unlike enterprise-level settings, where domain experts and data governance
help detect and correct errors, end-user environments lack such resources,
making error handling a significant challenge.
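The effect of a single wrong answer in the interactive-query example above
can be seen in a small simulation. This is a minimal sketch under stated
assumptions, not any of the algorithms proposed in the thesis: it assumes
two-attribute tuples, a hidden linear utility with weight `w`, and a naive
"keep the winner" questioning strategy. The names `utility`, `make_user`,
`find_favourite` and the sample data are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

Tuple2 = Tuple[float, float]


def utility(t: Tuple2, w: float) -> float:
    """Assumed linear utility: weight w on attribute 0, (1 - w) on attribute 1."""
    return w * t[0] + (1 - w) * t[1]


def make_user(w: float, wrong_at: Optional[int] = None) -> Callable[[Tuple2, Tuple2], bool]:
    """Simulated user who answers question number `wrong_at` (0-based) incorrectly."""
    asked = 0

    def prefers_first(a: Tuple2, b: Tuple2) -> bool:
        nonlocal asked
        truth = utility(a, w) >= utility(b, w)
        answer = (not truth) if asked == wrong_at else truth
        asked += 1
        return answer

    return prefers_first


def find_favourite(tuples: List[Tuple2], prefers_first: Callable[[Tuple2, Tuple2], bool]) -> Tuple2:
    """Naive strategy: keep the current winner and compare it with each remaining tuple."""
    best = tuples[0]
    for candidate in tuples[1:]:
        if not prefers_first(best, candidate):
            best = candidate
    return best


tuples = [(0.9, 0.2), (0.6, 0.7), (0.3, 0.8), (0.1, 0.95)]

reliable = find_favourite(tuples, make_user(0.5))               # (0.6, 0.7)
one_slip = find_favourite(tuples, make_user(0.5, wrong_at=1))   # (0.3, 0.8)
```

With a reliable user the strategy returns the tuple with the highest
utility. With one flipped answer the true favourite is discarded and never
reconsidered, so the final recommendation is sub-optimal.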
This thesis addresses the problem of error handling in interactive queries
and end-user tables through three studies. In the first study, we tackle
random user errors in interactive queries and propose an algorithm that
minimizes the number of questions needed when each tuple has two attributes.
Then, for scenarios where each tuple is described by multiple attributes, we
propose two algorithms, both with provable performance guarantees. In the
second study, we develop techniques to handle both random and persistent user
errors, proposing two algorithms: one with asymptotically optimal round
efficiency, and the other shown empirically to need few rounds. In the
third study, we focus on table data cleaning in end-user scenarios, and
develop a framework to systematically catch errors using a novel class of
data-quality constraints that we call Semantic-Domain Constraints, which can
be automatically applied to any table, without requiring domain experts to
specify them manually on a per-table basis. Collectively, these contributions
advance error handling techniques for end-user applications, enhancing the
robustness of interactive queries and user tables in practical applications.
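As a rough illustration of a table check that needs no per-table
specification, the sketch below flags values that fall outside a domain
which most of their column matches. The abstract does not define
Semantic-Domain Constraints, so this is only one possible reading and not
the thesis's framework. The domain test `is_iso_date`, the function
`flag_outliers` and the 0.8 threshold are all hypothetical.

```python
import re
from typing import Callable, List


def is_iso_date(value: str) -> bool:
    """Hypothetical domain test: does the value look like YYYY-MM-DD?"""
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is not None


def flag_outliers(column: List[str],
                  in_domain: Callable[[str], bool],
                  min_match: float = 0.8) -> List[str]:
    """If at least `min_match` of the column fits the domain, return the values that do not."""
    if not column:
        return []
    matches = [in_domain(v) for v in column]
    if sum(matches) / len(column) < min_match:
        return []  # the column does not appear to belong to this domain
    return [v for v, ok in zip(column, matches) if not ok]


dates = ["2025-05-27", "2025-05-28", "2025-05-29",
         "2025-06-01", "2025-06-02", "27 May"]
names = ["Alice", "Bob", "2025-05-27"]

suspect_dates = flag_outliers(dates, is_iso_date)   # ["27 May"]
suspect_names = flag_outliers(names, is_iso_date)   # []
```

The same check runs unchanged on both columns: it only fires on the column
that mostly fits the domain, which is what lets it be applied without a
per-table rule.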
Date: Tuesday, 27 May 2025
Time: 3:00pm - 5:00pm
Venue: Room 2128A (Lift 19)
Chairman: Dr. Eun Soon IM (CIVL)
Committee Members: Prof. Raymond WONG (Supervisor)
Prof. Dimitris PAPADIAS
Prof. Xiaofang ZHOU
Prof. Xueqing ZHANG (CIVL)
Prof. Kyriakos MOURATIDIS (SMU)