Machine Recognition of Music Emotion and the Correlation with Musical Timbre
PhD Thesis Proposal Defence
Title: "Machine Recognition of Music Emotion and the Correlation with Musical
Timbre"
by
Mr. Bin WU
Abstract:
Music is one of the primary triggers of emotion. Listeners perceive strong
emotions in music, and composers can create emotion-driven music. Researchers
have given more and more attention to this area because of the many interesting
applications such as emotion-based music searching and automatic soundtrack
matching. These applications have motivated research on the correlation between
music features such as timbre and emotion perception. Machine recognition
methods for music emotion have also been developed for automatically
recognizing affective musical content so that it can be indexed and retrieved
on a large scale based on emotion.
In this research, our goal is to enable machines to automatically recognize
music emotion. We therefore focus on two major topics: 1) understanding the
correlation between music emotion and timbre, and 2) designing algorithms for
automatic music emotion recognition.
To understand the correlation between music emotion and timbre, we designed
listening tests to compare sounds from eight wind and bowed string instruments.
We wanted to know if some sounds were consistently perceived as being happier
or sadder in pairwise comparisons, and which spectral features were most
important aside from spectral centroid. Therefore, we conducted listening tests
on normal sounds, centroid-equalized sounds, and static sounds. Our results
showed strong emotional predispositions for each instrument and indicated that
the even/odd harmonic ratio is perhaps the most salient timbral feature after
attack time and brightness.
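As a rough illustration of the two spectral features named above, the sketch
below computes a spectral centroid and an even/odd harmonic ratio from a list
of harmonic amplitudes. The function names and the exact definitions
(amplitude-weighted mean harmonic number; summed even-harmonic amplitude
divided by summed odd-harmonic amplitude, fundamental included) are assumptions
made for this example and may differ from the definitions used in the thesis.

```python
def spectral_centroid(amplitudes):
    """Amplitude-weighted mean harmonic number (harmonics are numbered from 1).

    Assumed definition for illustration; higher values mean a brighter tone.
    """
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("at least one harmonic must have nonzero amplitude")
    weighted = sum(k * a for k, a in enumerate(amplitudes, start=1))
    return weighted / total


def even_odd_ratio(amplitudes):
    """Summed even-harmonic amplitude divided by summed odd-harmonic amplitude.

    Assumed definition for illustration; the fundamental counts as odd.
    """
    even = sum(a for k, a in enumerate(amplitudes, start=1) if k % 2 == 0)
    odd = sum(a for k, a in enumerate(amplitudes, start=1) if k % 2 == 1)
    if odd == 0:
        raise ValueError("odd-harmonic amplitudes sum to zero")
    return even / odd


# Hypothetical amplitudes for harmonics 1..4 of a steady tone.
example = [1.0, 0.5, 0.25, 0.125]
centroid = spectral_centroid(example)
ratio = even_odd_ratio(example)
```

Under these assumed definitions, a tone with weak even harmonics yields a ratio
near zero, while a tone whose even and odd harmonics are equally strong yields
a ratio near one.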
To design algorithms for automatic music emotion recognition, we investigated
the properties of music emotion. We found that the major problem in automatic
music emotion recognition is a lack of data, for two reasons: 1) music emotion
is genre-specific, so labeled data for each music category is sparse; and 2)
music emotion is time-varying, and there are few time-varying labels for music
emotion. In our preliminary study, we have therefore exploited unlabeled and
social tagging data to alleviate problem 1). For problem 2), we have proposed
to exploit time-sync comment data with a novel temporal and personal topic
model, and to exploit lyrics with a novel hierarchical Bayesian model.
Date: Monday, 27 April 2015
Time: 2:00pm - 4:00pm
Venue: Room 3501 (lifts 25/26)
Committee Members: Prof. Andrew Horner (Supervisor)
Prof. Qiang Yang (Chairperson)
Dr. Jogesh Muppala
Dr. Raymond Wong
**** ALL are Welcome ****