IMPROVING DEEP CONVERSATIONAL MODELS VIA INPUT AUGMENTATION AND DATA SOURCE EXPANSION
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "IMPROVING DEEP CONVERSATIONAL MODELS VIA INPUT AUGMENTATION AND DATA SOURCE EXPANSION"

By

Mr. Zhiliang TIAN

Abstract

Conversational models aim to generate readable textual responses to input queries. In recent years, conversational models have typically been built using deep neural networks; such models are called deep conversational models (DCMs). DCMs can be used for task-oriented dialogues or chit-chat conversations. In this thesis, we investigate ways to enhance DCMs for chit-chat conversations via input augmentation and data source expansion.

A DCM maps an input query to some output responses. It has been observed in previous work that the output is often uninformative and lacks diversity for two reasons: (1) the input does not contain sufficient information to determine an appropriate output, and (2) the DCM is trained via likelihood maximization and hence captures only the most salient input-output relationships. One common method to address the first issue is to include background documents as additional inputs to the DCM, and one popular way to deal with the second issue is to include retrieved responses to similar input queries as additional inputs to the DCM. In this thesis, we advance the state of the art in both lines of work. For the first line of work, we propose an output-anticipated memory module to enable the DCM to better attend to the relevant information in the background documents. For the second line of work, we develop a memory module to extract relationships between clusters of similar inputs and clusters of outputs (which are more robust than relationships between individual inputs and outputs), and use the relationships to improve the performance of the DCM.

Nowadays, DCMs are often trained on huge corpora. However, low-resource scenarios still exist.
One example is an online chatbot that needs to quickly adapt to a new user after a few rounds of conversation. Our third contribution in this thesis is a meta-learning-based method to help with the adaptation by utilizing data from the user's friends, who are expected to have similar interests and expectations. There are also applications, such as conversations on airline booking, where public data are limited and private data containing sensitive information are abundant. Our fourth contribution in this thesis is a teacher-student framework to train DCMs on both private and public data while ensuring the privacy of sensitive information in the private data.

Date: Friday, 20 May 2022

Time: 2:00pm - 4:00pm

Zoom Meeting: https://hkust.zoom.us/j/96706339048?pwd=eEFxeFhBcFVpMnZXQ2ZJWStDeXB5QT09

Chairperson: Prof. Ling SHI (ECE)

Committee Members:
Prof. Nevin ZHANG (Supervisor)
Prof. Brian MAK
Prof. Yangqiu SONG
Prof. Bing-yi JING (MATH)
Prof. Wai LAM (CUHK)

**** ALL are Welcome ****