Improving Humour Recognition through the use of Word Associations, Humour Anchor Identification, and World Knowledge Features

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Improving Humour Recognition through the use of Word Associations, 
Humour Anchor Identification, and World Knowledge Features"

By

Mr. Andrew CATTLE


Abstract

As natural language interfaces become more prevalent, the ability of 
computers to both understand and create humour becomes more important. 
Humour is a ubiquitous part of human communication. It can be used to make 
oneself more likeable, to defuse a tense situation, or purely for 
entertainment. As digital virtual assistants such as Alexa, Cortana, 
Google Assistant, and Siri become more human-like, they will increasingly 
need to recognise, interpret, and even produce humour effectively.

What makes humour such an exciting challenge is that it requires not only 
linguistic dexterity but also world/domain knowledge. Syntax, phonology, 
and semantics all play a role in making a joke funny. However, existing 
work on humour recognition has typically taken a fairly basic view of joke 
semantics, structure, and world knowledge, treating jokes as unordered 
bags-of-words and simply computing word embedding similarities between all 
word pairs. This bears little resemblance to the way humans actually 
interpret humour.
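
As a rough illustration of the bag-of-words baseline described above, the 
sketch below computes cosine similarity over every unordered word pair in 
a joke and summarises the scores. The function names, the choice of 
min/max/mean as summary statistics, and the toy two-dimensional vectors 
are assumptions made for illustration; they are not taken from the thesis.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_similarity_features(tokens, embeddings):
    """Min, max, and mean cosine similarity over all unordered word pairs.

    Word order is ignored, which is exactly the limitation the abstract
    points out: setup and punchline words are treated identically.
    """
    known = [t for t in tokens if t in embeddings]
    sims = [cosine(embeddings[a], embeddings[b]) for a, b in combinations(known, 2)]
    if not sims:
        return {"min": 0.0, "max": 0.0, "mean": 0.0}
    return {"min": min(sims), "max": max(sims), "mean": sum(sims) / len(sims)}


# Toy 2-d vectors, invented for illustration only.
toy_embeddings = {"bank": [1.0, 0.0], "river": [0.0, 1.0], "money": [1.0, 1.0]}
features = pairwise_similarity_features(["bank", "river", "money"], toy_embeddings)
```

Because the features depend only on the set of word pairs, shuffling the 
words of a joke leaves them unchanged (up to floating-point rounding).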

This thesis will address these shortcomings in three ways. First, we 
motivate the use of a semantic relatedness measure based on word 
associations for better capturing joke semantics. Furthermore, we present 
evidence that word associations outperform Word2Vec similarity on both 
humour classification and humour ranking tasks across several datasets. 
Word associations’ focus on relatedness over similarity offers 
increased flexibility and the ability to capture weaker, more tangential 
relationships between concepts. Word associations also better represent 
the way humans store their mental lexicons. We experiment with extracting 
word association features using both a graph-based method, which is 
efficient to calculate but suffers from coverage issues, and a more 
sophisticated word association strength prediction model, which is capable 
of predicting association strengths between arbitrary word pairs.
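
The coverage issue of a graph-based approach can be seen in a minimal 
sketch. Here an association graph maps each cue word to its responses and 
their strengths, and the score for a word pair is the strongest product 
of edge weights along any path of at most two hops. The graph contents, 
the path-product scoring rule, and the hop limit are all assumptions 
chosen for illustration, not the method used in the thesis.

```python
def association_strength(graph, cue, target, max_hops=2):
    """Strongest path product from cue to target within max_hops edges.

    Returns None when no such path exists, i.e. when the pair falls
    outside the graph's coverage.
    """
    best = None
    frontier = {cue: 1.0}  # best product reaching each node in exactly k hops
    for _ in range(max_hops):
        next_frontier = {}
        for node, strength in frontier.items():
            for neighbour, weight in graph.get(node, {}).items():
                candidate = strength * weight
                if candidate > next_frontier.get(neighbour, 0.0):
                    next_frontier[neighbour] = candidate
        if target in next_frontier:
            if best is None or next_frontier[target] > best:
                best = next_frontier[target]
        frontier = next_frontier
    return best


# Toy directed association graph (cue -> {response: strength}), invented values.
toy_graph = {
    "doctor": {"nurse": 0.4, "hospital": 0.3},
    "nurse": {"hospital": 0.5},
    "hospital": {"bed": 0.2},
}

direct = association_strength(toy_graph, "doctor", "hospital")
indirect = association_strength(toy_graph, "doctor", "bed")
missing = association_strength(toy_graph, "doctor", "giraffe")  # None: not covered
```

A lookup like this is cheap, but any pair without a short connecting path 
yields no score at all, which is the gap a learned strength prediction 
model is meant to fill.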

Second, we explore the usefulness of humour anchors for incorporating joke 
structure. Specifically, we utilise automatic humour anchor extraction as 
a form of setup/punchline annotation and use this information to help 
target semantic features.

Finally, we experiment with adding world knowledge to our humour 
recognition system through the inclusion of ConceptNet-derived features. 
ConceptNet is a commonsense knowledge base capable of representing complex 
real-world relationships between concepts which are unlikely to be 
represented by more conventional knowledge representation features like 
word embeddings.


Date:			Monday, 27 August 2018

Time:			1:00pm - 3:00pm

Venue:			Room 4472
 			Lifts 25/26

Chairman:		Prof. Irene Lo (CIVL)

Committee Members:	Prof. Xiaojuan Ma (Supervisor)
 			Prof. Fangzhen Lin
 			Prof. Dit-Yan Yeung
 			Prof. Yi Yang (ISOM)
 			Prof. Jordan BOYD-GRABER (U of Maryland)


**** ALL are Welcome ****