Towards Bridging Human Requirements and Program Synthesis
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Towards Bridging Human Requirements and Program Synthesis"
By
Mr. Jiarong WU
Abstract:
Program synthesis aims to synthesize programs meeting the user-given
specifications, such as input-output examples and natural language
descriptions. Recent advancements in both symbolic and neural synthesis
methodologies, along with improvements in computational hardware, have
transformed this once-formidable research challenge into practical techniques
that serve billions of users. Despite the significant progress made,
particularly in speed and scalability, program synthesizers still face
challenges in faithfully realizing human requirements due to their intrinsic
diversity, ambiguity, and complexity.
This thesis aims to enhance the communication of human requirements to
program synthesizers. It studies the gap between the two and proposes
user-friendly interaction, together with efficient synthesis techniques, to
improve the usability and efficiency of program synthesizers. The thesis
comprises the following three studies.
The first study introduces a programming-by-example (PBE) framework to ease
PBE development in diverse domains. Domain-specific PBE synthesizers, such as
Microsoft FlashFill in Excel, boost user efficiency by synthesizing programs
to automate repetitive tasks in their application domains. However, the
substantial effort required to implement a PBE synthesizer hinders the
adoption of PBE in a wider range of domains. Therefore, we propose Bee, a PBE
framework with an innovative developer interface based on relational tables,
to ease domain-specific customization. The evaluation shows that Bee is more
accessible to novice developers than classic program synthesis frameworks
while maintaining comparable performance.
The second study enhances the probabilistic model used in interactive program
synthesis for resolving ambiguity in PBE, where multiple programs consistent
with the example specification can be found. Question-answer interaction with
users can effectively identify the user-intended program, and the question
selection problem aims to select the questions that minimize the interaction
effort, specifically the expected number of interaction rounds. Accurately
estimating the probabilities of example-consistent programs is key to
question selection performance. Therefore, this study proposes using large
language models (LLMs) to capture the user intentions expressed in natural
language descriptions that accompany the examples, yielding a more
user-aligned probabilistic model. The evaluation investigates the
characteristics of different
probabilistic models and confirms the effectiveness of our approach.
The third study empirically investigates decoupling the complexity of code
generation into (a) problem-solving and (b) implementing a sketched solution
in a programming language (PL). Using pseudocode to represent sketched
solutions, we compare code generation performance when starting from
pseudocode versus problem descriptions across different PLs. Analysis of the
experimental results shows that (1) the complexity of problem-solving is the
performance bottleneck, (2) implementation complexity differs significantly
across PLs, and (3) pseudocode can preserve the essential semantics of
solutions and serve as an effective, PL-agnostic means of communication with
LLMs.
Date: Tuesday, 29 July 2025
Time: 3:30pm - 5:30pm
Venue: Room 5501
Lifts 25/26
Chairman: Prof. David Edward COOK (ECON)
Committee Members: Prof. Shing-Chi CHEUNG (Supervisor)
Dr. Shuai WANG
Prof. Raymond WONG
Dr. Zhiyao XIE (ECE)
Prof. Minxue PAN (NJU)