Advancing Calibration of Language Models for Text Generation
PhD Thesis Proposal Defence

Title: "Advancing Calibration of Language Models for Text Generation"

by

Mr. Dongkyu LEE

Abstract:

Over the past few years, the natural language processing community has witnessed rapid development in language models. There is a high level of public interest in these models, particularly generative ones, and language models have started to appear in user-facing applications. While improving model performance has always been a key objective, the widespread use of language models has also drawn attention to the importance of modeling "reliable" probabilities. In many scenarios, a prediction by a model is coupled with a corresponding probability, commonly referred to as a confidence score, and this score carries meaningful information: it indicates how certain the model is in making the prediction and represents the level of trustworthiness of the prediction. This interpretation, however, rests on the assumption that the model is calibrated. When a model is miscalibrated, the confidence score is no longer a meaningful indicator.

Model calibration involves refining a model to produce accurate probability estimates. Specifically, the probability produced by a model is expected to accurately reflect the likelihood that the corresponding prediction is correct. Model calibration is arguably most consequential in the natural language generation domain: in other domains a probability is merely an indicator of a model's certainty, yet with a language model that generates text in an autoregressive manner, calibration has a direct impact on the model's outputs. A language model generates text with a decoding algorithm, and the decoding scheme operates on the probability distributions produced by the model. Owing to this distinct nature of language generation models, calibration directly shapes model outputs, and hence the calibration of language models requires a thorough and extensive study.

In this thesis, we present three novel calibration methods that are specifically designed for text generation language models. First, we propose a student-teacher framework that calibrates a language model. Given a calibrated teacher model, a student model not only benefits from the distilled knowledge but also learns to match the calibrated confidence scores produced by the teacher. The second method is a novel regularization scheme that both improves model performance and reduces the calibration error of the model. The regularizer is a variant of label smoothing, a popular regularization method; it self-regulates the extent of smoothing based on the confidence score the model produces during training, so the language model is less likely to make overconfident predictions, a common form of miscalibration. Lastly, this thesis presents a novel decoding scheme rooted in the concept of model calibration. A long-standing problem with language models is repetition in their outputs. We view this problem as a consequence of miscalibration, and in this context our decoding scheme applies post-hoc calibration in the course of inference.

Date: Wednesday, 27 September 2023
Time: 11:00am - 1:00pm
Venue: Room 5510 (lifts 25/26)

Committee Members:
Prof. Nevin Zhang (Supervisor)
Prof. Fangzhen Lin (Chairperson)
Dr. Brian Mak
Dr. Yangqiu Song

**** ALL are Welcome ****
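For readers unfamiliar with the confidence-driven label smoothing mentioned in the abstract, here is a minimal PyTorch sketch of that general idea. The announcement gives no implementation details, so this is only one plausible reading, not the thesis's actual method: the function name self_regulated_smoothing_loss, the max_smoothing parameter, and the rule that smoothing grows linearly with the model's confidence are all assumptions made for illustration.

    import torch
    import torch.nn.functional as F

    def self_regulated_smoothing_loss(logits, targets, max_smoothing=0.1):
        # logits:  (batch, vocab) unnormalized scores from the language model
        # targets: (batch,) gold token indices
        log_probs = F.log_softmax(logits, dim=-1)

        # Model confidence: probability assigned to the argmax token,
        # detached so the smoothing weight itself is not optimized.
        confidence = log_probs.exp().max(dim=-1).values.detach()

        # Hypothetical self-regulation rule: higher confidence -> stronger
        # smoothing, discouraging overconfident probability estimates.
        eps = max_smoothing * confidence                     # (batch,)

        nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
        uniform_xent = -log_probs.mean(dim=-1)               # cross-entropy vs. uniform

        # Standard label-smoothing mixture, but with a per-example epsilon.
        return ((1.0 - eps) * nll + eps * uniform_xent).mean()

With logits of shape (batch, vocab) and integer targets of shape (batch,), the returned scalar can stand in for the usual cross-entropy loss during training; unlike fixed label smoothing, the smoothing strength here varies per example with the model's own confidence.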