Neural Question Generation and Evaluation with Large Language Models

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


MPhil Thesis Defence


Title: "Neural Question Generation and Evaluation with Large Language Models"

By

Mr. Chun Yi Louis BO


Abstract:

This thesis explores the application of Large Language Models (LLMs) in Neural 
Question Generation (NQG) and Evaluation, addressing the challenges of 
generating high-quality, thought-provoking questions for educational purposes. 
Traditional rule-based Automatic Question Generation (AQG) methods, although 
effective for simple texts, often lack flexibility and diversity. Despite 
continuous advancements in the field, a significant gap remains between 
expert-generated and machine-generated questions. Furthermore, traditional 
evaluation metrics are limited by the quality of reference questions and often 
fail to account for creativity and diversity.

This research leverages the capabilities of LLMs, such as GPT-4o and Claude 3, 
to enhance question generation and evaluation. Two novel qualitative evaluation 
methods are proposed: a question-answering approach and a rubric-based 
approach. Both methods show significant improvements over traditional metrics 
by correlating more closely with human annotations. In particular, the proposed 
metrics outperform the best existing metrics by up to 40% absolute Pearson 
correlation on the grammaticality and relevance criteria, and by up to 7% on 
the answerability criterion.
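
As a rough illustration of how a metric's agreement with human annotations can 
be measured, the sketch below computes the Pearson correlation between metric 
scores and human ratings, then the absolute difference between two metrics. 
The function name and all scores are hypothetical placeholders, not data or 
code from the thesis.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratings for five generated questions (illustrative only).
human_scores = [1, 2, 3, 4, 5]
baseline_metric = [3, 1, 4, 2, 5]   # stand-in for a reference-based metric
proposed_metric = [1, 2, 3, 5, 4]   # stand-in for an LLM-based metric

r_baseline = pearson(human_scores, baseline_metric)
r_proposed = pearson(human_scores, proposed_metric)

# "Absolute" improvement is the plain difference between the two correlations.
absolute_gain = r_proposed - r_baseline
print(f"baseline r={r_baseline:.2f}, proposed r={r_proposed:.2f}, "
      f"gain={absolute_gain:+.2f}")
```

An absolute gain of 0.40 in this sense corresponds to the "40% absolute" 
phrasing used for correlation improvements.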

Furthermore, the thesis explores the capabilities of LLMs in generating 
educational questions. The findings indicate that models such as GPT-4o can 
generate questions that are not only grammatically correct but also relevant 
and answerable, thereby surpassing current specialized QG models both 
qualitatively and quantitatively. The study also investigates a self-reflective 
framework, allowing LLMs to improve their own generated questions based on 
feedback. Results show promising improvements in grammaticality, although 
challenges remain in enhancing the answerability of more complex questions.
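
The general shape of a feedback-driven refinement loop can be sketched as 
follows. This is a minimal sketch under stated assumptions, not the framework 
studied in the thesis: `refine_question`, `critique`, and `revise` are 
hypothetical names, and the toy critic and reviser stand in for LLM calls.

```python
from typing import Callable, Optional

def refine_question(
    draft: str,
    critique: Callable[[str], Optional[str]],
    revise: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Repeatedly critique and revise a question until the critic is satisfied.

    `critique` returns feedback text, or None when it has no objections.
    `revise` returns a new question given the current one and the feedback.
    Both are hypothetical stand-ins for calls to a language model.
    """
    question = draft
    for _ in range(max_rounds):
        feedback = critique(question)
        if feedback is None:
            break  # no further issues reported
        question = revise(question, feedback)
    return question

# Toy critic and reviser, used only to make the sketch runnable.
def toy_critique(question: str) -> Optional[str]:
    if not question.endswith("?"):
        return "The question should end with a question mark."
    return None

def toy_revise(question: str, feedback: str) -> str:
    return question.rstrip(".") + "?"

refined = refine_question("What drives photosynthesis.", toy_critique, toy_revise)
print(refined)
```

Capping the number of rounds keeps the loop bounded even when the critic never 
reports a fully satisfactory question.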


Date:                   Friday, 9 August 2024

Time:                   10:00am - 12:00 noon

Venue:                  Room 5501
                        Lifts 25/26

Chairman:               Prof. Raymond WONG

Committee Members:      Dr. Yangqiu SONG (Supervisor)
                        Prof. Gary CHAN