Deep Reinforcement Learning in Urban Computing
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Deep Reinforcement Learning in Urban Computing"

By

Miss Yexin LI

Abstract

Nowadays, urban systems are widely deployed in many major cities, e.g. ride-sharing systems, express systems, take-out food delivery systems, and emergency medical service systems. Having significantly modernized and facilitated the daily life of citizens, these systems face severe operational challenges, for example how to match passengers to drivers in a ride-sharing system, or how to dispatch couriers in real time in an express system.

Previously, operation problems in urban systems were often tackled by operations research methods, e.g. optimization, or by heuristic algorithms based on practical system settings. For an urban system, because we often want to generate a sequence of real-time actions that maximizes the total reward over a long horizon, reinforcement learning is a proper choice. Moreover, because the system is often large and complex, deep learning methods are necessary to better capture its representative and enriched features.

In this thesis, we investigate how Deep Reinforcement Learning, i.e. DRL, can effectively learn operation policies for urban systems. Depending on how an urban system operates, either Central-Agent Reinforcement Learning, i.e. CARL, or Multi-Agent Reinforcement Learning, i.e. MARL, can be chosen to describe its operation process.

For a system whose operation is described by CARL, we focus on how to properly formulate the problem and design each component of the model, i.e. the state, action, and immediate reward, so as to optimize the final target of the system. We adopt the take-out food system as an example and propose a Deep Reinforcement Order Packing model, i.e. DROP, to solve its operation problem.

For a system whose operation can be described by MARL, besides designing each component of the model, we also try to guarantee that the agents in the system cooperate with each other properly. We adopt the express system as an example, where many couriers work in it, and propose a Deep Reinforcement Courier Dispatching model, i.e. DRCD, to solve its operation problem. DRCD can guarantee cooperation among couriers to some extent, but not globally. Therefore, we further propose a Cooperative Multi-Agent Reinforcement Learning model, i.e. CMARL, which guarantees cooperation among couriers globally by incorporating another Markov Decision Process along the agent sequence.

Experiments based on real-world data are conducted to confirm the superiority of DROP, DRCD, and CMARL over the baselines.

In MARL, besides cooperation among agents, competition also exists, although it is not common in modern urban systems. We briefly discuss this scenario at the end to make the thesis complete.

Date: Wednesday, 11 November 2020

Time: 2:00pm - 4:00pm

Zoom Meeting: https://hkust.zoom.com.cn/j/98029257612?pwd=RTkvOVJSbGU2dnpseWtnQ2NndVJJZz09

Chairperson: Prof. Kani CHEN (MATH)

Committee Members:
Prof. Qiang YANG (Supervisor)
Prof. Kai CHEN
Prof. Qiong LUO
Prof. Hai YANG (CIVL)
Prof. Jiannong CAO (PolyU)

**** ALL are Welcome ****