A Survey on Learned Data Structures for Database Management
PhD Qualifying Examination

Title: "A Survey on Learned Data Structures for Database Management"

by Mr. Siyuan HAN

Abstract:

In the past few years, machine-learning and deep-learning techniques have radically reshaped how we think about computation. At the same time, hardware advances (richer SIMD instruction sets, GPUs, TPUs, and other accelerators) have made it practical to trade extra compute for faster query processing and leaner storage footprints. Together, these trends are especially powerful for one of the oldest problems in data management: indexing. Recent empirical studies show that learned indexes consistently outperform classical structures in space-time trade-offs.

In this survey, we review classical indexing methods in data management, while examining the strengths of learned indexes, including their updatable variants and construction efficiency. Furthermore, in the context of emerging large language models and vector databases, we explore recent advancements in learned indexes for vector compression, such as piecewise linear approximation-based techniques for product quantization and multi-dimensional extensions. Through this overview, we highlight opportunities for future research in scalable, resource-efficient data management systems based on learned methods.

Date: Friday, 19 September 2025
Time: 10:00am - 11:00am
Venue: Room 5501 (Lifts 25/26)

Committee Members:
Prof. Lei Chen (Supervisor, Chairperson)
Dr. Shuai Wang
Prof. Qian Zhang