More about HKUST
Audit-on-Demand Correctness for ANN-Based Vector Search: A Survey and Review of Zero-Knowledge Proof Techniques
PhD Qualifying Examination
Title: "Audit-on-Demand Correctness for ANN-Based Vector Search: A Survey and
Review of Zero-Knowledge Proof Techniques"
by
Mr. Zipeng QIU
Abstract:
Approximate nearest-neighbor (ANN) vector search has become a core systems
primitive for semantic search, recommendation, and retrieval-augmented
generation, yet current service interfaces typically return only a top-k list
rather than auditable evidence of how that list was produced. This survey
studies audit-on-demand correctness: after a disputed query result, can a
service prove that the returned ranking is exactly the output of a publicly
specified ANN workflow executed on a committed snapshot, while keeping the
underlying corpus and index state private? We focus on the regime most
relevant to dense retrieval over proprietary data: ANN execution is
outsourced, disputes are occasional rather than continuous, and both public
verifiability and limited disclosure matter.
The survey makes four contributions. First, it formulates audit-on-demand
correctness as a procedural guarantee: the object of certification is exact
compliance with a published ANN semantics, not exact nearest-neighbor
optimality in the raw corpus. Second, it organizes the design space through a
taxonomy spanning authenticated data structures, replicated re-execution,
trusted execution environments (TEEs), generic zero-knowledge (ZK) query
systems, and specialized ZK proofs for vector search. Third, it distills the
recurring proof-engineering patterns visible in current ZK approaches to ANN
auditing, especially the proof-oriented IVF-PQ-style pipeline that presently
offers the clearest end-to-end evidence, including fixed-shape semantics,
snapshot commitments, order-statistic proofs, lookup-friendly scoring, and
cost-aware parameter selection. Fourth, it analyzes V3DB as a case study of
that specialized route and uses it to identify the main open problems for a
thesis agenda: dynamic freshness, graph-based indices, private queries, hybrid
trust architectures, and end-to-end accountable retrieval.
Date: Monday, 27 April 2026
Time: 2:00pm - 4:00pm
Venue: Room 2126A
Lift 19
Committee Members: Dr. Binhang Yuan (Supervisor)
Dr. Shuai Wang (Chairperson)
Prof. Ke Yi