Transferable Bandit
PhD Thesis Proposal Defence

Title: "Transferable Bandit"

by

Mr. Bo LIU

Abstract:

The rapid development of Artificial Intelligence has driven a large number of online interactive services, including recommender systems (RecSys) and dialogue systems. These services require intelligent algorithms to decide actions sequentially and to maximize the cumulative user feedback. To accomplish this goal, the algorithms are expected to simultaneously exploit and explore user interests from partial and noisy feedback. Bandit formulations are widely used to model the exploration-exploitation tradeoff in interactive services. When observations are insufficient, bandit policies explore more than needed, which can lead to worse short-term rewards.

In this proposal, we study a novel problem: transferable bandit. Transferable bandit adopts transfer learning to leverage prior knowledge from source domains with sufficient observations, in order to further maximize the cumulative rewards in the target domain of interest. Transferable bandit harnesses the collective and mutually reinforcing power of the bandit formulation and transfer learning. First, transfer learning improves the exploitation of a bandit policy and accelerates its exploration in the target domain. Second, the bandit policy explores and speeds up the knowledge transfer.

We propose to address two critical challenges of the transferable bandit. First, we propose the Transfer Contextual Bandit (TCB) policy to bridge the action and context heterogeneity. Second, we present the Lifelong Contextual Bandit (LCB) policy, which sequentially transfers knowledge and maximizes the overall cumulative rewards.

In this proposal, all algorithms are based on a general framework:
1) How the rewards are generated, given how the domains are related;
2) How to estimate and exploit the reward parameters and knowledge transfer;
3) How to measure and then explore the uncertainty of the reward parameters and knowledge transfer.

Both empirical studies on real-world datasets and theoretical analysis validate this proposal.

Date: Monday, 11 December 2017

Time: 4:30pm - 6:30pm

Venue: Room 5501 (lifts 25/26)

Committee Members:
Prof. Qiang Yang (Supervisor)
Prof. Lei Chen (Chairperson)
Dr. Qiong Luo
Prof. Nevin Zhang

**** ALL are Welcome ****