Improving Humour Recognition through the use of Word Associations, Humour Anchor Identification, and World Knowledge Features
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Improving Humour Recognition through the use of Word Associations, Humour Anchor Identification, and World Knowledge Features"

By

Mr. Andrew CATTLE

Abstract

As natural language interfaces become more prevalent, the ability of computers to both understand and create humour becomes more important. Humour is a ubiquitous part of human communication. It can be used to make oneself more likeable, to defuse a tense situation, or just for pure entertainment. As modern digital virtual assistants such as Alexa, Cortana, Google Assistant, and Siri become more human-like, the ability to effectively recognise, interpret, and even produce humour becomes more important.

What makes humour such an exciting challenge is that it requires not only linguistic dexterity but also world/domain knowledge. Syntax, phonology, and semantics all play a role in making a joke funny. However, existing work on humour recognition has typically taken a fairly basic view of joke semantics, structure, and world knowledge, treating jokes as unordered bags-of-words and simply computing word embedding similarities between all word pairs. This bears little resemblance to the way humans actually interpret humour.

This thesis will address these shortcomings in three ways. First, we motivate the use of a semantic relatedness measure based on word associations for better capturing joke semantics. Furthermore, we present evidence that word associations outperform Word2Vec similarity on both humour classification and humour ranking tasks across several datasets. Word associations' focus on relatedness over similarity offers increased flexibility and the ability to capture weaker, more tangential relationships between concepts. Word associations also better represent the way humans store their mental lexicons.
We experiment with extracting word association features using both a graph-based method, which is efficient to calculate but suffers from coverage issues, and a more sophisticated word association strength prediction model, which is capable of predicting association strengths between arbitrary word pairs.

Second, we explore the usefulness of humour anchors for incorporating joke structure. Specifically, we utilize automatic humour anchor extraction as a form of setup/punchline annotation and use this information to help target semantic features.

Finally, we experiment with adding world knowledge to our humour recognition system through the inclusion of ConceptNet-derived features. ConceptNet is a commonsense knowledge base capable of representing complex real-world relationships between concepts which are unlikely to be represented by more conventional knowledge representation features like word embeddings.

Date: Monday, 27 August 2018
Time: 1:00pm - 3:00pm
Venue: Room 4472, Lifts 25/26

Chairman: Prof. Irene Lo (CIVL)

Committee Members:
  Prof. Xiaojuan Ma (Supervisor)
  Prof. Fangzhen Lin
  Prof. Dit-Yan Yeung
  Prof. Yi Yang (ISOM)
  Prof. Jordan BOYD-GRABER (U of Maryland)

**** ALL are Welcome ****