Conjunctive Query Evaluation under Updates

PhD Thesis Proposal Defence


Title: "Conjunctive Query Evaluation under Updates"

by

Mr. Binyang DAI


Abstract:

Conjunctive queries (CQs), which capture the select-project-join-aggregate
(SPJA) workloads, is a fundamental query class in database theory. In the
era of big data, numerous applications now continuously ingest streaming
data from upstream sources and require dynamic updates to query results.
This shift has highlighted the need for efficient conjunctive query
evaluation under updates.

In industry, this has driven the development of open-source stream
processing engines. In academia, the problem is formalized as incremental
view maintenance (IVM), with numerous IVM techniques and prototype systems
developed in recent years. Despite this progress, existing frameworks still
suffer from fundamental limitations. This thesis proposes novel algorithms
to overcome several of these limitations.

First, we observe that the join operator is the source of super-linear
costs in classical change propagation frameworks. Building on this insight,
we introduce a new change propagation framework that systematically
eliminates joins from the query plan. The resulting approach inherently
avoids super-linear costs while supporting constant-delay enumeration of
both full and delta results.

Second, we demonstrate that by allowing a small approximation error,
aggregation queries can be maintained efficiently. We present a new
approximate query processing (AQP) algorithm that can maintain any
free-connex aggregation query in amortized logarithmic time per update over
insertion-only input streams. Although known lower bounds prevent the same
logarithmic guarantee for fully dynamic streams, our algorithm demonstrates
excellent practical performance on both insertion-only and fully dynamic
workloads.

Finally, we investigate the problem of maintaining a uniform random sample
over the result of a conjunctive query under updates. While the problem is
trivial without joins, the potential blowup in join size makes the problem
significantly harder. We propose a new sampling algorithm that achieves
near-linear update time and linear space complexity, offering a practical
and theoretically efficient solution.


Date:                   Tuesday, 16 December 2025

Time:                   2:30pm - 4:30pm

Venue:                  Room 2128A
                        Lift 19

Committee Members:      Prof. Ke Yi (Supervisor)
                        Prof. Dimitris Papadias (Chairperson)
                        Prof. Qiong Luo