Structural Augmented Reasoning for Language Models
PhD Thesis Proposal Defence
Title: "Structural Augmented Reasoning for Language Models"
by
Mr. Yubo WANG
Abstract:
Large language models have achieved strong performance across a wide range of
natural language understanding and generation tasks. However, their reasoning
often relies on shallow pattern matching over unstructured inputs, limiting
their effectiveness in scenarios that demand external domain knowledge,
fine-grained entity-level understanding, or rapid adaptation to evolving task
requirements with minimal supervision. This dissertation investigates this
problem through the lens of structural augmented reasoning: the integration
of explicit structured representations—such as knowledge graphs and
dynamically constructed weighted graphs—into language model pipelines to
enable more precise, context-aware, and adaptable reasoning.
The dissertation develops this perspective through two concrete problem
settings. First, it addresses knowledge-intensive tabular understanding,
where language models lack sufficient semantic context to accurately
interpret table columns. It presents KGLink, a hybrid framework that bridges
knowledge graph information with pre-trained language models, resolving the
granularity mismatch between knowledge graph-derived types and
dataset-specific labels while compensating for missing contextual cues in
table content. Second, it tackles dynamic few-shot text classification in
social media domains, where target labels evolve over time and labeled data
remains scarce. It introduces GORAG, a graph-based online retrieval-augmented
generation framework that constructs and maintains an adaptive weighted graph
to achieve threshold-free, input-specific retrieval, ensuring that the
contextual information provided to the language model remains both
comprehensive and precise.
Taken together, these studies support a unified view: language models benefit
substantially from structured intermediate representations that organize,
filter, and prioritize external knowledge before it reaches the model. The
central contribution of this dissertation is to demonstrate that structural
augmentation, whether grounded in external knowledge graphs or dynamically
constructed from task-specific signals, provides a principled and effective
pathway for improving language model reasoning across diverse,
knowledge-intensive applications.
Date: Thursday, 22 May 2026
Time: 2:00pm - 4:00pm
Venue: Room 2128A
Lift 19
Committee Members: Prof. Lei Chen (Supervisor)
Dr. Dan Xu (Chairperson)
Dr. Chaojian Li