Multi-Modal Speech Emotion Recognition

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "Multi-Modal Speech Emotion Recognition"

by

HU Chenxi

Abstract:

Speech Emotion Recognition (SER) has been a popular research topic because of
its many promising applications in areas such as healthcare, education, and
customer service. As with sentiment analysis, advances in deep learning have
greatly accelerated research in SER, and the growing availability of speech
data has also paved the way for this research. Integrating multi-modal
information further enriches the meaning conveyed, boosting models'
performance. For instance, sarcasm and lies can be detected from disharmony
between people's facial expressions and their speech. More data has also
become available as social media platforms have introduced posts with
pictures, video, text, etc. In this study, we aim to examine how
multi-modality improves the efficiency of SER.


Date            : 3 May 2024 (Friday)

Time            : 10:30 - 11:10

Venue           : Room 2611 (near lifts 31/32), HKUST

Advisor         : Dr. MAK Brian Kan-Wing

2nd Reader      : Prof. CHAN Gary Shueng-Han