PhD Thesis Proposal Defence

Title: "Advancing Calibration of Language Models for Text Generation"


Mr. Dongkyu LEE


Over the past few years, the natural language processing community has 
witnessed rapid development in language models. Public interest in these 
models, particularly generative ones, is high, and language models have begun 
to appear in user-facing applications. While improving model performance has 
always been a key objective, the widespread use of language models has also 
drawn attention to the importance of modeling “reliable” probabilities.

In many scenarios, a model's prediction is coupled with a corresponding 
probability, commonly referred to as a confidence score, and this score 
carries meaningful information: it indicates how certain the model is in 
making the prediction and thus represents the level of trustworthiness of the 
prediction. However, the underlying assumption is that the model is 
calibrated. When a model is miscalibrated, the confidence score is no longer 
a meaningful indicator.

Model calibration involves refining a model to produce accurate probability 
estimates. Specifically, the probability mapped by a model is expected to 
accurately reflect the likelihood that the corresponding prediction is 
correct. Calibration is arguably of utmost significance in the natural 
language generation domain. In other domains, a probability is merely an 
indicator of a model's certainty; with a language model that generates text 
in an autoregressive manner, however, calibration directly shapes the model's 
outputs, since the decoding algorithm that produces the text operates on the 
probability distributions mapped by the model. Due to this distinct nature of 
language generation models, calibration of language models requires a 
thorough and extensive study.
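To make "accurate probability estimates" concrete, a widely used diagnostic is the expected calibration error (ECE), which bins predictions by confidence and compares each bin's average confidence against its empirical accuracy. The sketch below uses hypothetical toy predictions (not data from the thesis):

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence; ECE is the size-weighted gap
    between average confidence and empirical accuracy per bin."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 -> last bin
        bins[idx].append((c, ok))
    ece = 0.0
    for bucket in bins:
        if bucket:
            avg_conf = sum(c for c, _ in bucket) / len(bucket)
            accuracy = sum(ok for _, ok in bucket) / len(bucket)
            ece += (len(bucket) / n) * abs(avg_conf - accuracy)
    return ece

# Well calibrated: 80% confidence, 80% accuracy -> ECE near 0.
print(expected_calibration_error([0.8] * 10, [1] * 8 + [0] * 2))
# Overconfident: 90% confidence but only 50% accuracy -> large ECE.
print(expected_calibration_error([0.9] * 10, [1] * 5 + [0] * 5))
```

A calibrated model drives this gap toward zero, which is precisely the goal the thesis pursues for generation models.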

In this thesis, we present three novel calibration methods specifically 
designed for text generation language models. First, we propose a 
student-teacher framework that calibrates a language model: given a 
calibrated teacher model, a student model not only benefits from the 
distilled knowledge but also learns to match the calibrated scores mapped by 
the teacher. The second method is a novel regularization scheme that both 
improves model performance and reduces the calibration error of the model. 
The regularizer is a variant of label smoothing, a popular regularization 
method; it self-regulates the extent of smoothing based on the confidence 
score mapped by the model during training, making the language model less 
likely to make overconfident predictions, a common form of miscalibration. 
Lastly, this thesis presents a novel decoding scheme rooted in the concept of 
model calibration. A long-standing problem of language models is repetition 
in their outputs; we view this problem as a consequence of miscalibration. In 
this context, our novel decoding scheme applies post-hoc calibration in the 
course of inference.
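For readers unfamiliar with the term, one classic example of post-hoc calibration (a standard baseline, not the decoding scheme proposed in the thesis) is temperature scaling: rescaling a model's logits before the softmax so the resulting distribution is less peaked. A minimal sketch at a single decoding step, with hypothetical logits:

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into a probability distribution. A temperature
    above 1 flattens the distribution, tempering overconfidence."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits; in practice the temperature is
# fitted on held-out data rather than chosen by hand.
logits = [4.0, 2.0, 1.0, 0.5]
print(max(softmax(logits)))                   # sharp, overconfident peak
print(max(softmax(logits, temperature=2.0)))  # flatter after rescaling
```

Because decoding algorithms consume these distributions directly, adjusting them at inference time changes the generated text itself, which is the intuition behind applying calibration during decoding.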

Date:			Wednesday, 27 September 2023

Time:			11:00am - 1:00pm

Venue:			Room 5510
			lifts 25/26

Committee Members:	Prof. Nevin Zhang (Supervisor)
 			Prof. Fangzhen Lin (Chairperson)
 			Dr. Brian Mak
 			Dr. Yangqiu Song

**** ALL are Welcome ****