Towards Bridging Human Requirements and Program Synthesis
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Towards Bridging Human Requirements and Program Synthesis"

By

Mr. Jiarong WU

Abstract:

Program synthesis aims to synthesize programs meeting the user-given specifications, such as input-output examples and natural language descriptions. Recent advancements in both symbolic and neural synthesis methodologies, along with improvements in computational hardware, have transformed this once formidable research challenge into practical techniques that facilitate billions of users. Despite the significant progress made, particularly in speed and scalability, program synthesizers still face challenges in faithfully realizing human requirements due to their intrinsic diversity, ambiguity, and complexity. This thesis aims to enhance the communication of human requirements to program synthesizers by studying the gap between them and proposing user-friendly interaction with efficient synthesis techniques to improve the usability and efficiency of program synthesizers. This thesis includes the following three studies.

The first study introduces a programming-by-example (PBE) framework to ease PBE development in diverse domains. Domain-specific PBE synthesizers, such as Microsoft FlashFill in Excel, boost user efficiency by synthesizing programs to automate repetitive tasks in the application domains. However, the demanding requirement of implementing a PBE synthesizer hinders the adoption of PBE in wider application domains. Therefore, we propose Bee, a PBE framework with an innovative developer interface based on relational tables, to ease domain-specific customization. The evaluation shows Bee is more accessible to novice developers than the classic program synthesis frameworks while maintaining comparable performance.
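For readers new to PBE, the idea can be illustrated with a minimal enumerative sketch in Python. This is not Bee and does not reflect its relational-table interface; the toy string DSL, the primitive names, and the `synthesize` helper are all assumptions made for illustration. The sketch enumerates short pipelines of primitives and keeps those that reproduce every input-output example.

```python
from itertools import product

# Hypothetical toy DSL (an assumption for illustration, not Bee's interface):
# each primitive maps a string to a string.
PRIMITIVES = {
    "upper": str.upper,
    "lower": str.lower,
    "first_word": lambda s: s.split()[0] if s.split() else "",
    "last_word": lambda s: s.split()[-1] if s.split() else "",
    "initial": lambda s: s[:1],
}


def run(program, text):
    """Apply a pipeline of primitive names to the input, left to right."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def synthesize(examples, max_len=2):
    """Return every pipeline of up to max_len primitives that is
    consistent with all (input, output) examples."""
    consistent = []
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                consistent.append(program)
    return consistent


examples = [("ada lovelace", "LOVELACE"), ("alan turing", "TURING")]
programs = synthesize(examples)
for program in programs:
    print(" -> ".join(program))
```

On these two examples the search finds more than one consistent pipeline, and any of them can then be applied to unseen inputs with `run`. Brute-force enumeration like this scales poorly; it is meant only to show what "synthesizing from examples" means.
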

The second study enhances the probabilistic model used in interactive program synthesis for resolving ambiguity in PBE, where multiple programs consistent with the example specification can be found. Question-answer interaction with users can effectively identify the user-intended program, and the question selection problem aims to select the questions that minimize the interaction effort, specifically the expected value of interaction rounds. The measurement of the probabilities of example-consistent programs is key to the question selection performance. Therefore, this study proposes using large language models (LLMs) to capture user intentions in the natural language descriptions alongside examples to yield a more user-aligned probabilistic model. The evaluation investigates the characteristics of different probabilistic models and confirms the effectiveness of our approach.

The third study empirically investigates the decoupling of code generation’s complexity into (a) problem-solving and (b) implementing a sketched solution in a programming language (PL). Using pseudocode to represent sketched solutions, we compare the code generation performance from pseudocode or problem descriptions in different PLs. Analysis of experimental results shows that (1) the complexity of problem-solving is the performance bottleneck, (2) the implementation complexity in different PLs has significant differences, and (3) pseudocode can preserve essential semantics of solutions and serve as an effective, PL-agnostic means of communication with LLMs.

Date: Tuesday, 29 July 2025
Time: 3:30pm - 5:30pm
Venue: Room 5501, Lifts 25/26

Chairman: Prof. David Edward COOK (ECON)

Committee Members:
Prof. Shing-Chi CHEUNG (Supervisor)
Dr. Shuai WANG
Prof. Raymond WONG
Dr. Zhiyao XIE (ECE)
Prof. Minxue PAN (NJU)