Towards Bridging Human Requirements and Program Synthesis
PhD Thesis Proposal Defence
Title: "Towards Bridging Human Requirements and Program Synthesis"
by
Mr. Jiarong WU
Abstract:
Program synthesis aims to generate programs that meet user-given
specifications, such as input-output examples and natural language (NL)
descriptions. Recent advancements in both symbolic and neural synthesis
methodologies, along with improvements in computational hardware, have
transformed this once formidable research challenge into practical techniques
that benefit billions of users. Despite the significant progress made,
particularly in speed and scalability, program synthesizers still face
challenges in faithfully realizing human requirements due to their intrinsic
diversity, ambiguity, and complexity.
This thesis aims to enhance the mechanisms to communicate human requirements
to program synthesizers and consists of the following three studies.
The first study introduces a programming-by-example (PBE) framework to ease
PBE development in diverse domains. Domain-specific PBE synthesizers, such as
Microsoft FlashFill in Excel, boost user efficiency by synthesizing programs
to automate repetitive tasks in their application domains. Although these
domains are diverse, the effort required to implement a PBE synthesizer
hinders wider adoption. This study therefore proposes Bee, a PBE framework
built on a novel relational-table-based developer interface, to ease
customization for diverse application domains. The evaluation shows that Bee
is more accessible to novice developers than classic program synthesis
frameworks while maintaining comparable performance.
The second study enhances the probabilistic model used in interactive program
synthesis for resolving ambiguity. PBE synthesizers usually find multiple
programs that are consistent with the example specification, and one
effective way to identify the user-intended program is question-answer
interaction. The questions selected for users to answer are crucial to the
interaction effort, measured by the number of rounds, and selecting the
optimal questions depends on a probability distribution over programs. This
study proposes using large language models (LLMs) to capture user intentions
in the NL descriptions alongside examples to yield a more user-aligned
probabilistic model. The evaluation shows that the approach is effective in
reducing user interaction effort.
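To illustrate why question selection depends on the distribution over
programs, the following is a minimal sketch and not the method proposed in
the thesis. The candidate programs, the question inputs, and both priors are
hypothetical; the informed prior merely stands in for probabilities that a
model might derive from an NL description. The selection rule used here
(minimize the expected probability mass of the programs that remain after the
answer) is one common greedy heuristic, assumed for illustration.

```python
from collections import defaultdict

# Hypothetical candidates, all consistent with the single example "ab" -> "A".
CANDIDATES = {
    "first_char_upper": lambda s: s[0].upper(),
    "constant_A": lambda s: "A",
    "second_last_char_upper": lambda s: s[-2].upper(),
    "smallest_char_upper": lambda s: min(s).upper(),
}

# Hypothetical priors over the candidates (each sums to 1).
UNIFORM_PRIOR = {name: 0.25 for name in CANDIDATES}
INFORMED_PRIOR = {
    "first_char_upper": 0.6,
    "constant_A": 0.1,
    "second_last_char_upper": 0.1,
    "smallest_char_upper": 0.2,
}

# Hypothetical inputs the synthesizer could ask the user about.
QUESTIONS = ["ba", "abc", "cab"]


def expected_remaining_mass(question, prior):
    """Expected prior mass of the programs still consistent after the answer."""
    mass_by_answer = defaultdict(float)
    for name, program in CANDIDATES.items():
        mass_by_answer[program(question)] += prior[name]
    # An answer whose programs carry mass m is given with probability m
    # and leaves exactly that mass m un-eliminated.
    return sum(m * m for m in mass_by_answer.values())


def pick_question(questions, prior):
    """Greedily pick the question that is expected to eliminate the most mass."""
    return min(questions, key=lambda q: expected_remaining_mass(q, prior))


if __name__ == "__main__":
    print(pick_question(QUESTIONS, UNIFORM_PRIOR))
    print(pick_question(QUESTIONS, INFORMED_PRIOR))
```

In this toy setting the uniform prior selects "ba", which splits the
candidates evenly, whereas the informed prior selects "cab", which separates
the most probable candidate from all the others. The chosen question thus
changes with the probabilistic model even though the candidate set is the
same.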
The third study empirically investigates the effectiveness of pseudocode as
an intermediate representation (IR) for isolating the complexity of problem
solving and guiding LLMs in synthesizing executable and correct program
implementations in different programming languages. Experimental results
confirm that pseudocode not only preserves essential semantics but also
provides a simple and effective means of communication with LLMs, shedding
light on program synthesis from complex requirements via the pseudocode IR.
Date: Tuesday, 29 April 2025
Time: 8:00am - 10:00am
Venue: Room 2130A
Lift 19
Committee Members: Prof. Shing-Chi Cheung (Supervisor)
Dr. Shuai Wang (Chairperson)
Dr. May Fung
Dr. Dongdong She