Toward Trustworthy Reasoning in Large Language Models: Data, Exploration, and Feedback
PhD Qualifying Examination

Title: "Toward Trustworthy Reasoning in Large Language Models: Data, Exploration, and Feedback"

by

Mr. Yuzhen HUANG

Abstract:

Reasoning stands at the core of intelligence, enabling agents to solve problems, plan ahead, and interact with tools. Recent advances in large language models (LLMs) have brought this long-standing goal of artificial intelligence within reach. Models such as OpenAI-o1 and DeepSeek-R1 demonstrate strong reasoning capabilities across mathematics, programming, and agentic tasks, approaching expert-level performance on many benchmarks. This survey reviews the methods driving these advances, with a focus on two complementary paradigms: (1) learning from external data, which enriches models through large-scale curated corpora and supervised fine-tuning on reasoning-intensive tasks; and (2) learning from exploration and feedback, which harnesses self-improvement loops, reinforcement learning, and verifiable rewards to refine reasoning strategies through interaction. Together, these approaches have propelled LLMs from passive imitation to active problem solving, long-horizon planning, and tool-using behavior. We further review key benchmarks, highlight open challenges including robust generalization and reliable tool integration, and discuss the emerging shift toward general-purpose agents and frontier scientific discovery. By unifying the data-driven and exploration-driven paradigms, this survey outlines the path toward frontier reasoning models capable of solving challenging problems and contributing to open scientific and engineering challenges.

Date: Friday, 17 October 2025
Time: 10:00am - 12:00noon
Venue: Room 3494 (Lifts 25/26)

Committee Members:
Dr. Junxian He (Supervisor)
Dr. Yangqiu Song (Chairperson)
Dr. May Fung