Toward Trustworthy Reasoning in Large Language Models: Data, Exploration, and Feedback
PhD Qualifying Examination

Title: "Toward Trustworthy Reasoning in Large Language Models: Data, Exploration, and Feedback"

by

Mr. Yuzhen HUANG

Abstract:

Reasoning stands at the core of intelligence, enabling agents to solve problems, plan ahead, and interact with tools. Recent advances in large language models (LLMs) have brought this long-standing goal of artificial intelligence within reach. Models such as OpenAI-o1 and DeepSeek-R1 demonstrate strong reasoning capabilities across mathematics, programming, and agentic tasks, approaching expert-level performance on many benchmarks. This survey reviews the methods driving these advances, with a focus on two complementary paradigms: (1) learning from external data, which enriches models through large-scale curated corpora and supervised fine-tuning on reasoning-intensive tasks; and (2) learning from exploration and feedback, which harnesses self-improvement loops, reinforcement learning, and verifiable rewards to refine reasoning strategies through interaction. Together, these approaches have propelled LLMs from passive imitation to active problem solving, long-horizon planning, and tool-using behavior. We further review key benchmarks, highlight open challenges including robust generalization and reliable tool integration, and discuss the emerging shift toward general-purpose agents and frontier scientific discovery. By unifying the data-driven and exploration-driven paradigms, this survey outlines the path toward frontier reasoning models capable of solving challenging problems and contributing to open scientific and engineering challenges.

Date: Friday, 17 October 2025
Time: 10:00am - 12:00noon
Venue: Room 3494 (Lifts 25/26)

Committee Members:
Dr. Junxian He (Supervisor)
Dr. Yangqiu Song (Chairperson)
Dr. May Fung