Natural Language Decomposition for Graph-based Retrieval-Augmented Generation
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "Natural Language Decomposition for Graph-based Retrieval-Augmented Generation"

by

TANG Chiu Yeung

Abstract:

Graph-based Retrieval-Augmented Generation (RAG) systems often struggle with deixis and anaphora in unstructured text, producing incomplete knowledge graphs that degrade retrieval performance. This project investigates whether natural language decomposition can resolve these ambiguities by rewriting context-dependent expressions into explicit, self-contained statements before knowledge graph construction. Using AutoSchemaKG as the base framework, we integrated two decomposition techniques and evaluated them on a 200-sample multi-hop QA dataset and a small high-anaphora subset extracted from OntoNotes Release 5.0. Results show that decomposition consistently improves end-to-end RAG retrieval quality, especially on the high-anaphora corpus, but underperforms in extracted relation-triple accuracy, suggesting that decomposition benefits overall graph utility more than local precision. The study shows that pragmatic language processing techniques can meaningfully enhance graph-based RAG under realistic, context-heavy conditions, though findings are limited by the small size of the final corpus.

Date: 30 April 2026 (Thursday)
Time: 09:40 - 10:30
Venue: Room 2132C (near Lift 19), HKUST
Advisor: Dr. SONG Yangqiu
2nd Reader: Prof. WONG Raymond Chi-Wing