A Survey on Learned Data Structures for Database Management

PhD Qualifying Examination


Title: "A Survey on Learned Data Structures for Database Management"

by

Mr. Siyuan HAN


Abstract:

In the past few years, machine-learning and deep-learning techniques have 
radically reshaped how we think about computation. At the same time, hardware 
advances—richer SIMD instruction sets, GPUs, TPUs, and other 
accelerators—have made it practical to trade extra compute for faster query 
processing and leaner storage footprints. Together, these trends are 
especially powerful for one of the oldest problems in data management: 
indexing. Recent empirical studies show that learned indexes consistently 
outperform classical structures in space-time trade-offs.

In this survey, we review classical indexing methods in data management and 
examine the strengths of learned indexes, including their updatable 
variants and construction efficiency. Furthermore, in the context of emerging 
large language models and vector databases, we explore recent advancements in 
learned indexes for vector compression, such as piecewise linear 
approximation-based techniques for product quantization and multi-dimensional 
extensions. Through this overview, we highlight opportunities for future 
research in scalable, resource-efficient data management systems based on 
learned methods.


Date:                   Friday, 19 September 2025

Time:                   10:00am - 11:00am

Venue:                  Room 5501
                        Lifts 25/26

Committee Members:      Prof. Lei Chen (Supervisor, Chairperson)
                        Dr. Shuai Wang
                        Prof. Qian Zhang