More about HKUST
UNDERSTANDING AND UTILIZING USER PREFERENCES
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "UNDERSTANDING AND UTILIZING USER PREFERENCES" By Miss Yu PENG Abstract With the rapid growth of web-based applications, mining personalized preferences for promotion becomes a hot topic. In this thesis, we focus on two problems related to user preferences: understanding user preferences and utilizing user preferences. In understanding user preferences, we propose two sub-problems when we consider temporal user preferences. The first sub-problem is called attribute-based subsequence matching (ASM) : given a query sequence and a set of sequences, considering the attributes of elements, we want to find all the sequences which are matched by this query sequence. We propose an efficient algorithm for problem ASM by applying the Chinese Remainder Theorem. The second sub-problem is to find all the attribute-based frequent subsequences. We adapt an existing efficient algorithm for this second subproblem to show that we can use the algorithm developed for the first sub-problem. Experimental results show that frequent subsequences reflect user preferences, and our algorithms are scalable in large datasets. This work can stimulate a lot of existing data mining problems which are fundamentally based on subsequence matching. In utilizing user preferences, we identify and tackle three sub-problems, finding top-k profitable products, finding top-k popular products, and finding K-dominating competitive price. The former two sub-problems are about designing new products, while the latter one is about pricing new products. In finding top-k profitable products, we consider generalized user preferences, derived from the skyline concept. We propose a dynamic programming approach which can find the optimal solution when there are two attributes to be considered. We show that this problem is NP-hard when there are more than two attributes. Two greedy algorithms are proposed and one of them has theoretical bounds. We also consider this problem on dynamic datasets and propose update algorithms for different update operations. We extend this problem by considering another form of customer preferences, namely tolerant customer preferences in finding top-k popular products. We prove that this problem is NP-hard and propose a 0.63-approximate algorithm for this problem. Extensive experiments show the effectiveness and efficiency of our approaches on both synthetic and real datasets. In finding K-dominating competitive price in which we consider generalized user preferences only, we propose an efficient algorithm. We utilize spatial properties for pruning to speed up our algorithm. An extensive performance study using both synthetic and real datasets is reported to verify its effectiveness and efficiency. We also provide a real case study to show how our algorithm works in reality. Date: Tuesday, 5 June 2012 Time: 2:00pm – 4:00pm Venue: Room 3501 Lifts 25/26 Chairman: Prof. Kwing Lam Chan (MATH) Committee Members: Prof. Raymond Wong (Supervisor) Prof. Lei Chen Prof. Nevin Zhang Prof. Weichuan Yu (ECE) Prof. Hua Lu (Comp. Sci., Aalborg Univ.) **** ALL are Welcome ****