More about HKUST
Analyzing Data with Regret Minimization Query
PhD Thesis Proposal Defence Title: "Analyzing Data with Regret Minimization Query" by Miss Min XIE Abstract: Extracting interesting tuples from a large database is an important problem in multi-criteria decision making. Two representative queries were proposed in the literature: top-k queries and skyline queries. A top-k query requires users to specify their utility functions beforehand and then returns k tuples to the users. A skyline query does not require any utility function from users but it puts no control on the number of tuples returned to users. Recently, a regret minimization query was proposed and received attention from the community because it does not require any utility function from users and the output size is controllable, and thus it avoids those deficiencies of top-k queries and skyline queries. Specifically, it returns a small set of tuples such that any user's favorite tuple in this subset is guaranteed to be not much worse than his/her favorite tuple in the whole database. In a sense, this query finds a small set of tuples that makes the user happy (i.e., not regretful) even if s/he gets the best tuple in the selected set but not the best tuple among all tuples in the database. In this paper, we study two variations of the regret minimization query: (1) the min-error variation: we want to minimize the regret level of each user while guaranteeing the output size is at most k where k is the maximum output size. This problem is also known as the k-regret query, and (2) the min-size variation: we want to determine the least tuples needed to keep users happy at a given level. We term this problem as the alpha-happiness query where alpha represents a happiness threshold. In both variations, we quantify the user's regret level (and happiness level) by a criterion, called the regret ratio (and the happiness ratio). A small regret ratio (i.e., a large happiness ratio) indicates the user is satisfied with the set returned. As this is an NP-hard problem, we derive efficient approximate solutions for both variations. We performed extensive experiments using both real and synthetic datasets. Our evaluations show that our algorithms outperform the best-known previous approaches in two ways: (i) they answer the regret minimization queries by guaranteeing a smaller regret ratio and returning fewer tuples to users and, (ii) they do so much faster (up to two orders of magnitude times improvement). Date: Monday, 25 February 2019 Time: 12:30pm - 2:30pm Venue: Room 4472 (lifts 25/26) Committee Members: Dr. Raymond Wong (Supervisor) Prof. Shing-Chi Cheung (Chairperson) Prof. Lei Chen Dr. Wilfred Ng **** ALL are Welcome ****