More about HKUST
Analyzing Data with Regret Minimization Query
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Analyzing Data with Regret Minimization Query" By Miss Min XIE Abstract Extracting interesting tuples from a large database is an important problem in multi-criteria decision making. Two representative queries were proposed in the literature: top-k queries and skyline queries. A top-k query requires users to specify their utility functions beforehand and then returns k tuples to the users. A skyline query does not require any utility function from users but it puts no control on the number of tuples returned to users. Recently, a regret minimization query was proposed and received attention from the community because it does not require any utility function from users and the output size is controllable, and thus, it avoids those deficiencies of top-k queries and skyline queries. Specifically, it returns a small set of tuples such that any user’s favorite tuple in this subset is guaranteed to be not much worse than his/her favorite tuple in the whole database. In a sense, this query finds a small set of tuples that makes the user happy (i.e., not regretful) even if s/he gets the best tuple in the selected set but not the best tuple among all tuples in the database. In this thesis, we first study two versions of the regret minimization query: (1) the min-error version: we want to minimize the regret level of each user while guaranteeing the output size is at most k where k is the maximum output size. This problem is also known as the k-regret query, and (2) the min-size version: we want to determine the minimum number of tuples needed to keep users happy at a given level. We term this problem as the a-happiness query where a represents a happiness threshold. In both versions, we quantify the user’s regret level (and happiness level) by a criterion, called the regret ratio (and the happiness ratio). A small regret ratio (i.e., a large happiness ratio) indicates the user is satisfied with the set returned. In addition, we study how to enhance the regret minimization query with user interactions: when presented with a small number of tuples (which can be artificial tuples or true tuples inside the database), a user is asked to indicate the tuple s/he favors the most among them. In particular, we are also interested in the special case of determining the favorite tuple for a user in the entire database with a small amount of interaction, measured by the number of questions we ask the user. Finding the optimal solution for a regret minimization query is an NP-hard problem. We derive efficient approximate solutions for both versions of regret minimization. We also present a generic framework for interactive regret minimization, under which we propose algorithms that ask an asymptotically optimal number of questions in 2-dimensional spaces and algorithms with provable performance guarantees in d-dimensional spaces (d ≥ 2) where each dimension corresponds to a description of a tuple. We performed extensive experiments using both real and synthetic datasets. Our evaluations show that our algorithms outperform the best-known previous approaches in three ways: (i) they answer the regret minimization queries by guaranteeing a smaller regret ratio and returning fewer tuples to users, (ii) they do so much faster (up to two orders of magnitude times improvement), and (iii) they outperform the existing interactive regret minimization algorithms by locating the favorite tuple and guaranteeing a small regret ratio with much fewer questions. Date: Monday, 5 August 2019 Time: 10:00am - 12:00noon Venue: Room 3494 Lifts 25/26 Chairman: Prof. Shaohui Zheng (ISOM) Committee Members: Prof. Raymond Wong (Supervisor) Prof. Shing-Chi Cheung Prof. Xiaojuan Ma Prof. Henry Lam (CBE) Prof. Reynold Cheng (HKU) **** ALL are Welcome ****