From Single-Vector to Multi-Vector Retrieval: Toward Efficient and Scalable Indexing Methods

PhD Qualifying Examination


Title: "From Single-Vector to Multi-Vector Retrieval: Toward Efficient and 
Scalable Indexing Methods"

by

Mr. Zhoujin TIAN


Abstract:

Multi-vector retrieval has emerged as a powerful paradigm for semantic search, 
where queries and data objects are represented by sets of embeddings rather 
than single vectors. By preserving fine-grained query-document interactions,
multi-vector models significantly improve retrieval quality across many 
applications. However, this increased expressiveness comes with substantially 
higher storage overhead, computational cost, and system design complexity. In 
particular, set-level similarity functions are often non-metric, require 
expensive pairwise interactions, and complicate efficient indexing and 
pruning. To address these challenges, recent work has explored diverse 
approaches to scalable multi-vector retrieval, including keyword-style 
indexing based on token-level pruning, similarity approximation using 
efficient surrogates, and native graph-based index structures that directly 
operate on vector sets. Each line of work involves distinct trade-offs 
between retrieval accuracy, efficiency, and scalability. This survey provides 
a comprehensive overview of recent advances in multi-vector search indexing. 
We systematically analyze the core design principles behind existing methods, 
discuss their strengths and limitations from both algorithmic and system 
perspectives, and outline promising directions for future research in 
multi-vector indexing and vector search systems.


Date:                   Tuesday, 3 February 2026

Time:                   9:00am - 11:00am

Venue:                  Room 2132C
                        Lift 22

Committee Members:      Prof. Xiaofang Zhou (Supervisor)
                        Prof. Raymond Wong (Chairperson)
                        Prof. Ke Yi