IMPROVING DEEP CONVERSATIONAL MODELS VIA INPUT AUGMENTATION AND DATA SOURCE EXPANSION

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "IMPROVING DEEP CONVERSATIONAL MODELS VIA INPUT AUGMENTATION AND DATA 
SOURCE EXPANSION"

By

Mr. Zhiliang TIAN


Abstract

Conversational models aim to generate readable textual responses to input 
queries. In recent years, conversational models have typically been built using 
deep neural networks; such models are called deep conversational models (DCMs). 
DCMs can 
be used for task-oriented dialogues or chit-chat conversations. In this thesis, 
we investigate ways to enhance DCMs for chit-chat conversations via input 
augmentation and data source expansion.

A DCM maps an input query to one or more output responses. Previous work has 
observed that the output is often uninformative and lacks diversity, for two 
reasons: (1) the input does not contain sufficient information to determine 
an appropriate output, and (2) the DCM is trained via likelihood maximization 
and hence captures only the most salient input-output relationships. One common 
method to address the first issue is to include background documents as 
additional inputs to the DCM, and one popular way to deal with the second issue 
is to include retrieved responses to similar input queries as additional inputs 
to the DCM. In this thesis, we advance the state of the art in both lines of 
work. For the first line of work, we propose an output-anticipated 
memory module to enable the DCM to better attend to the relevant information in 
the background documents. For the second line of work, we develop a memory 
module to extract relationships between clusters of similar inputs and clusters 
of outputs (which are more robust than relationships between individual 
inputs and outputs), and use the relationships to improve the performance of 
the DCM.

Nowadays, DCMs are often trained on huge corpora. However, there are still 
low-resource scenarios. One example is an online chatbot that needs to 
quickly adapt to a new user after a few rounds of conversations. Our third 
contribution in this thesis is a meta-learning-based method to help with the 
adaptation by utilizing data from the user's friends, who are expected to have 
similar interests and expectations. There are also applications, such as 
conversations on airline booking, where there are limited public data and 
abundant private data containing sensitive information. Our fourth contribution 
in this thesis is a teacher-student framework to train DCMs on both private and 
public data while ensuring the privacy of sensitive information in the private 
data.


Date:			Friday, 20 May 2022

Time:			2:00pm - 4:00pm

Zoom Meeting: 
https://hkust.zoom.us/j/96706339048?pwd=eEFxeFhBcFVpMnZXQ2ZJWStDeXB5QT09

Chairperson:		Prof. Ling SHI (ECE)

Committee Members:	Prof. Nevin ZHANG (Supervisor)
 			Prof. Brian MAK
 			Prof. Yangqiu SONG
 			Prof. Bing-yi JING (MATH)
 			Prof. Wai LAM (CUHK)


**** ALL are Welcome ****