Privacy-Preserved Learning on Graph-Structured Data

PhD Thesis Proposal Defence


Title: "Privacy-Preserved Learning on Graph-Structured Data"

by

Mr. Qi HU


Abstract:

Graph-structured data and graph learning techniques, particularly graph 
neural networks (GNNs), have revolutionized numerous applications in recent
years. These advancements enable sophisticated knowledge extraction from 
complex graph data, unlocking unprecedented opportunities for discovery and 
decision-making. However, the growing accessibility of graph data and the 
increasing complexity of graph learning models have raised significant 
privacy concerns that remain inadequately addressed despite ongoing research 
efforts.

This thesis investigates privacy preservation in graph learning models, 
focusing on developing effective and robust approaches to mitigate these 
concerns.

The first section examines privacy preservation in traditional graph learning 
models, specifically addressing graph embedding tasks. While graph 
embeddings, low-dimensional representations of graph-structured data, 
facilitate various downstream applications such as node classification and 
link prediction, they are vulnerable to attribute inference attacks. Existing 
privacy-preserving methods often rely on adversarial learning techniques and 
assume full access to sensitive attributes during training. However, this 
assumption is problematic given users' diverse privacy preferences, and 
adversarial learning often suffers from training instability. To 
address these limitations, we introduce a novel approach that incorporates 
the independent distribution penalty as a regularization term.
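As a rough illustration of the idea behind independence-based regularization (the specific penalty is the thesis's contribution and is not reproduced here), one common way to penalize statistical dependence between learned embeddings and a sensitive attribute is a kernel independence measure such as HSIC. The sketch below, with assumed helper names `rbf_gram` and `hsic_penalty`, is an illustrative stand-in, not the thesis's actual method:

```python
import numpy as np

def rbf_gram(Z):
    """RBF Gram matrix, bandwidth set by the median-distance heuristic."""
    sq = np.sum(Z * Z, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])  # median heuristic for the bandwidth
    return np.exp(-d2 / sigma2)

def hsic_penalty(emb, sens):
    """Biased HSIC estimator: near 0 when emb and sens are independent,
    larger when they are dependent, so minimizing it as a regularizer
    discourages embeddings from encoding the sensitive attribute."""
    n = emb.shape[0]
    K, L = rbf_gram(emb), rbf_gram(sens)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2

rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 8))                                # toy embeddings
dep = hsic_penalty(emb, emb[:, :1] + 0.1 * rng.normal(size=(200, 1)))
ind = hsic_penalty(emb, rng.normal(size=(200, 1)))
# dep (attribute derived from the embedding) exceeds ind (independent attribute)
```

In training, such a term would be added to the task loss with a trade-off weight, avoiding the min-max instability of adversarial formulations.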

The second section explores privacy preservation in distributed learning 
environments. Although federated learning protects raw data by design, we 
show that learned embeddings remain vulnerable to privacy leakage attacks. 
Moreover, traditional privacy preservation approaches often fail to account 
for varying privacy preferences among participants, resulting in suboptimal 
trade-offs between utility and privacy. To address this, we propose a 
flexible framework that adaptively accommodates diverse privacy requirements 
while minimizing utility loss.
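To make the notion of per-participant privacy budgets concrete, the sketch below shows a standard baseline that the proposed framework improves upon: federated averaging where each client's update is clipped and perturbed by the Gaussian mechanism calibrated to that client's own epsilon. The function name and parameters are illustrative assumptions, not the thesis's framework:

```python
import numpy as np

def personalized_dp_aggregate(updates, epsilons, clip=1.0, delta=1e-5, rng=None):
    """Average client updates with per-client Gaussian noise.

    Each client i receives noise calibrated to its own (epsilon_i, delta)
    budget, so stricter budgets (smaller epsilon) get larger noise.
    Illustrative baseline only, not the proposed framework.
    """
    rng = rng or np.random.default_rng(0)
    noisy = []
    for u, eps in zip(updates, epsilons):
        norm = np.linalg.norm(u)
        u = u * min(1.0, clip / max(norm, 1e-12))  # clip to bound sensitivity
        # Standard Gaussian-mechanism noise scale for (eps, delta)-DP
        sigma = clip * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
        noisy.append(u + rng.normal(scale=sigma, size=u.shape))
    return np.mean(noisy, axis=0)

updates = [np.array([2.0, 0.0, 0.0]),
           np.array([0.0, 2.0, 0.0]),
           np.array([0.0, 0.0, 2.0])]
agg = personalized_dp_aggregate(updates, epsilons=[0.5, 1.0, 2.0])
```

A uniform budget forces every client to the strictest noise level; honoring heterogeneous budgets, as above, is what creates the utility-privacy trade-off the thesis aims to optimize.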

The final section addresses privacy challenges in neural graph databases 
(NGDBs), which combine traditional graph databases with neural networks and 
can be integrated with large language models. Although NGDBs offer powerful 
capabilities for handling incomplete data through neural embedding storage 
and complex query answering (CQA), their ability to reveal hidden 
relationships introduces additional privacy vulnerabilities. Specifically, 
malicious actors may exploit carefully crafted queries to extract sensitive 
information. To mitigate these risks, we develop a privacy-preserved NGDB 
framework that increases the difficulty of inferring sensitive information 
through combinations of queries.


Date:                   Friday, 24 January 2025

Time:                   10:00am - 12:00noon

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Dr. Yangqiu Song (Supervisor)
                        Dr. Wei Wang (Chairperson)
                        Dr. Dongdong She