Enhancing Language Models for Program Synthesis Using Execution
Speaker: Ansong Ni, Yale University
Title: "Enhancing Language Models for Program Synthesis Using Execution"
Date: 27 March 2023
Time: 9:00am - 10:00am HKT
Join Zoom Meeting: https://hkust.zoom.us/j/96335714626?pwd=azdvRnJTRm1BcFVUd1dEdWpReDhaZz09
Meeting ID: 963 3571 4626
Passcode: 894396

Abstract:
Large language models (LLMs), especially those pretrained on code, have shown a strong ability to generate programs from natural language inputs. While such models are typically trained on the surface form of programs, program semantics are usually evaluated via execution, which LLMs do not explicitly model. This raises the question of how to better incorporate program semantics into LLMs for program synthesis. In this talk, I will introduce two of our recent works that address this question. In the first work [1], we propose letting the model sample programs during training and learn from self-sampled correct or partially correct programs, which are automatically identified by comparing final or intermediate execution states. We show the effectiveness of this method in the domain of math reasoning. In the second work [2], we introduce LEVER, which learns to verify LLM-generated programs using their execution results. Program candidates are then reranked by the joint probability of generation and verification. LEVER + Codex achieves new state-of-the-art results on four popular NL2Code tasks, and ablation studies show that execution information is crucial to the improvements.

References:
[1] "Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions." Ansong Ni, Jeevana Priya Inala, Chenglong Wang, Oleksandr Polozov, Christopher Meek, Dragomir Radev, Jianfeng Gao. ICLR'23.
[2] "LEVER: Learning to Verify Language-to-Code Generation with Execution." Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin. arXiv'23.

*****************

Biography:
Ansong Ni is a third-year CS PhD student at Yale University, working with Prof. Dragomir Radev. His main research interests lie at the intersection of natural language processing (NLP), machine learning (ML), and programming languages (PL). His long-term research goal is to build systems that can understand user intent and parse it into machine-executable form. Motivated by this goal, he conducts research on program synthesis, semantic parsing, and neuro-symbolic methods. He previously obtained his M.S. degree in CS from Carnegie Mellon University and has worked as a research intern at AI2 (2020), MSR (2021), and FAIR (2022).
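The reranking idea behind LEVER described in the abstract can be illustrated with a minimal sketch. The scores and candidate programs below are hypothetical, and the verifier probabilities are given directly rather than produced by a trained model (as they are in the actual work); the sketch only shows the core step of ordering candidates by the joint probability of generation and verification.

```python
import math

def rerank(candidates):
    """Rerank program candidates by the joint probability of
    generation and verification: P_gen(program) * P_verify(program).

    Each candidate is a tuple (program, generation_log_prob, verifier_prob).
    Returns the programs ordered from most to least likely correct.
    """
    scored = []
    for prog, gen_logprob, verify_prob in candidates:
        # Convert the generation log-probability back to a probability
        # and combine it with the verifier's probability of correctness.
        joint = math.exp(gen_logprob) * verify_prob
        scored.append((joint, prog))
    scored.sort(reverse=True)
    return [prog for _, prog in scored]

# Hypothetical candidates: a program likely under the generator but
# judged wrong by the verifier, and a less likely one judged right.
candidates = [
    ("return x + y", -0.5, 0.2),  # joint ≈ 0.607 * 0.2 ≈ 0.121
    ("return x * y", -1.2, 0.9),  # joint ≈ 0.301 * 0.9 ≈ 0.271
]
print(rerank(candidates))  # the verified candidate is ranked first
```

The point of the joint score is that execution-based verification can overturn the generator's own ranking: here the generator prefers the first program, but the verifier's signal promotes the second.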