On End-User Error Handling in Interactive Queries and Tables
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "On End-User Error Handling in Interactive Queries and Tables"

By

Mr. Qixu CHEN

Abstract:

Many applications designed for end-users rely on human-generated inputs, but their performance can degrade significantly because human operations are inherently unreliable, leading to undesirable outcomes. For instance, in an end-user decision-making process where the user needs to find the most interesting tuple in a large dataset, interactive queries elicit the user’s preference through a series of questions, each asking the user to compare two tuples and choose the preferred one. The system then recommends tuples based on the learned preference. However, even a single erroneous response from the user can mislead the learning process, resulting in sub-optimal recommendations. Similarly, common end-user data processing tools such as Microsoft Excel require human-generated relational tables as input for analysis, and errors in those tables can cause degraded model performance and flawed analysis. Unlike enterprise-level settings, where domain experts and data governance help detect and correct errors, end-user environments lack such resources, making error handling a significant challenge.

This thesis addresses the problem of error handling in interactive queries and end-user tables through three studies. In the first study, we tackle random user errors in interactive queries and propose an algorithm that minimizes the number of questions needed when each tuple has two attributes. Then, for scenarios where each tuple is described by multiple attributes, we propose two algorithms, both with provable performance guarantees.
In the second study, we develop techniques to handle both random and persistent user errors, proposing two algorithms: one that is asymptotically optimal in the number of rounds, and another that needs only a small number of rounds in practice. In the third study, we focus on table data cleaning in end-user scenarios and develop a framework that systematically catches errors using a novel class of data-quality constraints that we call Semantic-Domain Constraints. These constraints can be applied automatically to any table, without requiring domain experts to specify them manually on a per-table basis.

Collectively, these contributions advance error handling techniques for end-user applications, enhancing the robustness of interactive queries and user tables in practical applications.

Date: Tuesday, 27 May 2025
Time: 3:00pm - 5:00pm
Venue: Room 2128A, Lift 19

Chairman: Dr. Eun Soon IM (CIVL)

Committee Members:
Prof. Raymond WONG (Supervisor)
Prof. Dimitris PAPADIAS
Prof. Xiaofang ZHOU
Prof. Xueqing ZHANG (CIVL)
Prof. Kyriakos MOURATIDIS (SMU)