Multidimensional Partitioning for Histogram Using Reinforcement Learning
MPhil Thesis Defence

Title: "Multidimensional Partitioning for Histogram Using Reinforcement Learning"

By Mr. Weizhen DING

Abstract

Histograms are a crucial tool in database management and big data analytics, where they are used for selectivity estimation to obtain approximate query results efficiently. Minskew is a state-of-the-art multi-dimensional histogram algorithm that offers decent accuracy in selectivity estimation, but it has difficulty achieving good performance across queries of different sizes.

To solve this problem, we propose Grid Adaptive Minskew and Reinforcement Learning (RL) Histogram. While Minskew generates a set of global grid cells that are used for splitting at all levels, Grid Adaptive Minskew generates grid cells for each bucket independently, resulting in histograms with buckets of more flexible sizes and improved accuracy for various selectivities. RL Histogram shares the same data structure as Grid Adaptive Minskew. It first learns from Grid Adaptive Minskew and then explores histograms with a policy gradient learning algorithm.

The experiments demonstrate that Grid Adaptive Minskew outperforms Minskew in all aspects, and that RL Histogram can further improve the performance.

Date: Monday, 30 January 2023
Time: 3:00pm - 5:00pm
Venue: Room 5501, lifts 25/26

Committee Members:
Prof. Dimitris Papadias (Supervisor)
Prof. Xiaofang Zhou (Chairperson)
Prof. Nevin Zhang

**** ALL are Welcome ****