Natural Language Decomposition for Graph-based Retrieval-Augmented Generation

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "Natural Language Decomposition for Graph-based Retrieval-Augmented 
Generation"

by

TANG Chiu Yeung

Abstract:

Graph-based Retrieval-Augmented Generation (RAG) systems often struggle with 
deixis and anaphora in unstructured text, producing incomplete knowledge 
graphs that degrade retrieval performance. This project investigates whether 
natural language decomposition can resolve these ambiguities by rewriting 
context-dependent expressions into explicit, self-contained statements before 
knowledge graph construction. Using AutoSchemaKG as the base framework, we 
integrated two decomposition techniques and evaluated them on a 200-sample 
multi-hop QA dataset and a small high-anaphora subset extracted from 
OntoNotes Release 5.0. Results show that decomposition consistently improves 
end-to-end RAG retrieval quality, especially on the high-anaphora corpus, but 
underperforms the baseline in extracted relation-triple accuracy, suggesting
that decomposition benefits overall graph utility more than local precision. The
study shows that pragmatic language processing techniques can meaningfully 
enhance graph-based RAG under realistic, context-heavy conditions, though 
findings are limited by the small size of the final corpus.

Date            : 30 April 2026 (Thursday)

Time            : 09:40 - 10:30

Venue           : Room 2132C (near Lift 19), HKUST

Advisor         : Dr. SONG Yangqiu

2nd Reader      : Prof. WONG Raymond Chi-Wing