PhD Thesis Proposal Defence
Title: "Advancing Calibration of Language Models for Text Generation"
by
Mr. Dongkyu LEE
Abstract:
Over the past few years, the natural language processing community has
witnessed rapid development in language models. Public interest in these
models, particularly generative ones, is high, and language models have begun
to appear in user-facing applications. While improving model performance has
always been a key objective, the widespread use of language models has also
drawn attention to the importance of modeling “reliable” probabilities.
In many scenarios, a model's prediction is coupled with a corresponding
probability, commonly referred to as a confidence score, and this score
carries meaningful information: it indicates how certain the model is in
making the prediction and thus represents the trustworthiness of that
prediction. However, this interpretation rests on the assumption that the
model is calibrated. When a model is miscalibrated, the confidence score is no
longer a meaningful indicator.
Model calibration involves refining a model to produce accurate probability
estimates. Specifically, the probability assigned by a model is expected to
accurately reflect the likelihood of the corresponding prediction being
correct (the sketch below illustrates this criterion). Calibration is arguably
of utmost importance in the natural language generation domain. In other
domains, a probability is merely an indicator of a model's certainty; with a
language model that generates text in an autoregressive manner, however,
calibration directly shapes the model's outputs: a language model produces
text with a decoding algorithm, and the decoding scheme operates on the
probability distributions the model assigns at each step. Due to this distinct
nature of language generation models, the calibration of language models
requires a thorough and extensive study.
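
As background on this criterion, the expected calibration error (ECE) is a
standard diagnostic: predictions are binned by confidence, and each bin's
average confidence is compared with its empirical accuracy. The minimal NumPy
sketch below implements this standard metric; it is illustrative background
rather than a method from the thesis, and the function name and ten-bin scheme
are assumptions made for the example.

    import numpy as np

    def expected_calibration_error(confidences, correct, n_bins=10):
        # Bin predictions by confidence, then average the |accuracy - confidence|
        # gap per bin, weighted by the fraction of samples in each bin.
        confidences = np.asarray(confidences, dtype=float)
        correct = np.asarray(correct, dtype=float)
        edges = np.linspace(0.0, 1.0, n_bins + 1)
        ece = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            in_bin = (confidences > lo) & (confidences <= hi)
            if in_bin.any():
                gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
                ece += in_bin.mean() * gap
        return ece

    # A calibrated model: 80%-confidence predictions are correct 80% of the time.
    print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))  # ~0.0
    # An overconfident model: 90% confidence, but only 60% accuracy.
    print(expected_calibration_error([0.9] * 10, [1] * 6 + [0] * 4))  # ~0.3

A perfectly calibrated model drives the gap in every bin to zero, which is the
sense in which its probabilities "accurately reflect" correctness.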
In this thesis, we present three novel calibration methods specifically
designed for text generation language models. Firstly, we propose a
student-teacher framework that calibrates a language model: given a calibrated
teacher model, a student model not only benefits from the distilled knowledge,
but also learns to match the calibrated scores produced by the teacher. The
second method is a novel regularization scheme that both improves model
performance and reduces the calibration error of the model. The regularizer is
a variant of label smoothing, a popular regularization method: the proposed
scheme self-regulates the extent of smoothing based on the confidence score
the model assigns during training, making the language model less likely to
make overconfident predictions, a common form of miscalibration (a sketch of
this idea follows the abstract). Lastly, this thesis presents a novel decoding
scheme rooted in the concept of model calibration. A long-standing problem
with language models is repetition in their outputs. We view this problem as a
consequence of the miscalibration of a language model, and in this context,
our novel decoding scheme applies post-hoc calibration during inference.
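
To make the self-regulated smoothing idea above more concrete: standard label
smoothing mixes the one-hot target with a uniform distribution using a fixed
coefficient. The PyTorch sketch below shows one hypothetical instantiation in
which the coefficient instead scales with the model's own confidence on the
gold token, so overconfident predictions are smoothed more aggressively. The
function name, the max_eps cap, and the linear scaling are illustrative
assumptions, not the thesis's exact formulation.

    import torch
    import torch.nn.functional as F

    def self_regulated_smoothing_loss(logits, targets, max_eps=0.2):
        # Cross-entropy against a smoothed target whose smoothing coefficient
        # grows with the model's confidence on the gold token (hypothetical
        # sketch, not the thesis's exact formulation).
        #   logits:  (batch, vocab) unnormalized scores
        #   targets: (batch,) gold token ids
        log_probs = F.log_softmax(logits, dim=-1)
        vocab = logits.size(-1)

        # Confidence on the gold token; detached so it only shapes the target.
        conf = log_probs.exp().gather(-1, targets.unsqueeze(-1)).squeeze(-1).detach()

        # Higher confidence -> stronger smoothing, discouraging overconfident peaks.
        eps = max_eps * conf  # shape: (batch,)

        one_hot = F.one_hot(targets, vocab).float()
        smoothed = (1.0 - eps).unsqueeze(-1) * one_hot + (eps / vocab).unsqueeze(-1)

        return -(smoothed * log_probs).sum(dim=-1).mean()

    # Toy usage with a 4-token vocabulary and a batch of 2.
    logits = torch.tensor([[4.0, 0.1, 0.1, 0.1], [0.5, 0.4, 0.3, 0.2]])
    targets = torch.tensor([0, 1])
    print(self_regulated_smoothing_loss(logits, targets))

Because the smoothing strength is derived from the model's own prediction, the
regularization is strongest exactly where the model is most at risk of
overconfidence.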
Date: Wednesday, 27 September 2023
Time: 11:00am - 1:00pm
Venue: Room 5510 (lifts 25/26)
Committee Members: Prof. Nevin Zhang (Supervisor)
Prof. Fangzhen Lin (Chairperson)
Dr. Brian Mak
Dr. Yangqiu Song
**** ALL are Welcome ****