PhD Thesis Proposal Defence
Title: "Advancing Calibration of Language Models for Text Generation"
Mr. Dongkyu LEE
Over the past few years, the natural language processing community has
witnessed rapid development in language models. There is a high level of
public interest in these models, particularly generative ones, and language
models have started to appear in user-facing applications. While improving
model performance has always been a key objective, the widespread use of
language models has also brought attention to the importance of model calibration.
In many scenarios, a prediction by a model is coupled with a corresponding
probability, commonly referred to as a confidence score, and the score adds
meaningful information; the probability indicates how certain a model is in
making the prediction and represents the level of trustworthiness of the
prediction. However, this interpretation rests on the assumption that the model
is calibrated. When a model is miscalibrated, the confidence score is no longer
a meaningful signal.
Model calibration involves refining a model to produce accurate probability
estimates. Specifically, the probability mapped by a model is expected to
accurately reflect the likelihood of a corresponding prediction being correct.
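One common way to quantify how far a model is from this ideal is the expected calibration error (ECE), which compares average confidence with empirical accuracy inside confidence bins. The sketch below is illustrative, not the thesis's own metric; the function and variable names (`expected_calibration_error`, `confs`, `correct`) are assumptions introduced here.

```python
def expected_calibration_error(confs, correct, n_bins=10):
    # Hedged sketch of ECE with equal-width confidence bins (lo, hi].
    # `confs`: per-prediction confidence scores in [0, 1].
    # `correct`: 0/1 indicators of whether each prediction was right.
    n = len(confs)
    ece = 0.0
    for b in range(n_bins):
        lo, hi = b / n_bins, (b + 1) / n_bins
        idx = [i for i, c in enumerate(confs) if lo < c <= hi]
        if not idx:
            continue
        avg_conf = sum(confs[i] for i in idx) / len(idx)
        acc = sum(correct[i] for i in idx) / len(idx)
        # Each bin contributes its |confidence - accuracy| gap,
        # weighted by the fraction of predictions it holds.
        ece += len(idx) / n * abs(avg_conf - acc)
    return ece
```

A perfectly calibrated model (e.g., 95% confidence on predictions that are right 95% of the time) would drive this quantity toward zero.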
The importance of model calibration is arguably greatest in the natural
language generation domain. In other domains, a probability is merely an
indicator of a model's certainty; with a language model that generates text
in an autoregressive manner, however, calibration directly shapes the model
outputs. A language model produces text with a decoding algorithm, and the
decoding scheme operates on the probability distributions mapped by the model.
Due to this distinct nature of language generation models, calibration has a
direct impact on model outputs, and hence the calibration of language models
warrants a thorough and extensive study.
In this thesis, we present three novel calibration methods that are
specifically designed for text generation language models. Firstly, we propose
a student-teacher framework that calibrates a language model. Given a
calibrated teacher model, a student model not only benefits from the knowledge
distilled, but also learns to match the calibrated scores mapped by the teacher
model. The next method is a novel regularization scheme that not only improves
model performance, but also reduces the calibration error of the model. The
regularizer is a variant of label smoothing, a popular regularization method.
The proposed regularization scheme self-regulates the extent of smoothing based
on the confidence score mapped by the model in training, and hence the language
model is less likely to make predictions with overconfidence, a type of
miscalibration. Lastly, this thesis presents a novel decoding scheme that is
rooted in the concept of model calibration. A long-standing problem of
language models is repetition in their outputs. We view this problem as a
consequence of the miscalibration of a language model. In this context, our novel
decoding scheme applies post-hoc calibration in the course of inference.
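As a rough illustration of post-hoc calibration at inference time, one simple instance is temperature scaling of the next-token distribution before the decoding rule selects a token. The sketch below is a minimal, assumed example of that general idea; the function name, the greedy rule, and the temperature value are illustrative and not the thesis's actual decoding scheme.

```python
import math

def calibrated_greedy_step(logits, temperature=1.5):
    # Hedged sketch: apply post-hoc calibration (temperature scaling)
    # to one decoding step. A temperature > 1 flattens an overconfident
    # next-token distribution before a token is chosen greedily.
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]    # calibrated next-token distribution
    next_token = max(range(len(probs)), key=probs.__getitem__)
    return next_token, probs
```

With a temperature above 1, the probability mass on the top token shrinks relative to the raw softmax, which is one way a decoding-time adjustment can counteract overconfidence.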
Date: Wednesday, 27 September 2023
Time: 11:00am - 1:00pm
Venue: Room 5510
Committee Members: Prof. Nevin Zhang (Supervisor)
Prof. Fangzhen Lin (Chairperson)
Dr. Brian Mak
Dr. Yangqiu Song
**** ALL are Welcome ****