Tempo Extraction using the Discrete Wavelet Transform

Speaker:	Dr. David ROSSITER
		Department of Computer Science and Engineering
		Hong Kong University of Science and Technology

Title:		"Tempo Extraction using the Discrete Wavelet Transform"

Date:		Monday, 13 November 2006

Time:		4:00pm - 5:00pm

Venue:		Lecture Theatre F
		(Leung Yat Sing Lecture Theatre, near lift nos. 25/26)
		HKUST

Abstract:

This work presents a method for extracting the tempo (that is, the speed)
of a music recording. The method has particular applications in multimedia
search engines and music/audio classification.

The interval between two successive beats in a musical sequence is called
the inter-onset interval (IOI), a parameter of considerable significance
in many tempo estimation algorithms. In order to investigate IOI
characteristics we created a data set of 50 musical recordings extracted
from audio CDs. Two musicians were then studied while they reproduced the
beat sequence of the music recordings. The resulting observations of IOI
were then incorporated into our tempo extraction system. The system first
applies a discrete wavelet transform (DWT) to the audio file, decomposing
the input signal into four levels of DWT coefficients. A peak detection
algorithm, which incorporates the IOI observations made previously, then
extracts all peaks from these coefficients. All peaks are used to
calculate IOIs, which are weighted appropriately and accumulated into a
histogram. The histogram is then smoothed using a Gaussian function.
Stereo audio input is treated as three different inputs: the left
channel, the right channel and the mono channel. We pass these three
inputs into our system and automatically select the best result as our
final tempo estimate.
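
The steps described above can be sketched roughly as follows. This is only
an illustrative outline, not the system presented in the talk: the Haar
wavelet, the peak threshold, the minimum peak spacing, the equal IOI
weights, the 1 BPM histogram bins, the Gaussian width and the rule for
choosing between the three inputs are all assumptions made for the
example, and every function name is hypothetical.

```python
import math


def haar_dwt_details(signal, levels=4):
    """Detail coefficients of a multi-level Haar DWT (wavelet choice is an assumption)."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) % 2:
            approx.append(0.0)  # zero-pad so that samples pair up
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def find_peaks(coeffs, rate, min_gap_s=0.25):
    """Times (seconds) of local maxima of |coeffs| above half the maximum.

    The threshold and the minimum gap are illustrative stand-ins for the
    IOI observations mentioned in the abstract.
    """
    env = [abs(c) for c in coeffs]
    if not env:
        return []
    threshold = 0.5 * max(env)
    peaks = []
    for i, v in enumerate(env):
        left = env[i - 1] if i > 0 else 0.0
        right = env[i + 1] if i + 1 < len(env) else 0.0
        if v >= threshold and v >= left and v > right:
            t = i / rate
            if not peaks or t - peaks[-1] >= min_gap_s:
                peaks.append(t)
    return peaks


def gaussian_smooth(hist, sigma=2.0):
    """Smooth a histogram with a truncated Gaussian kernel (sigma in bins)."""
    radius = int(3 * sigma)
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    out = []
    for i in range(len(hist)):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            if 0 <= i + k < len(hist):
                acc += w * hist[i + k]
        out.append(acc)
    return out


def estimate_tempo(samples, rate, levels=4, bpm_min=40, bpm_max=240):
    """Return (tempo in BPM, height of the smoothed histogram peak)."""
    hist = [0.0] * (bpm_max - bpm_min + 1)
    level_rate = float(rate)
    for detail in haar_dwt_details(samples, levels):
        level_rate /= 2  # each DWT level halves the sampling rate
        peaks = find_peaks(detail, level_rate)
        for t0, t1 in zip(peaks, peaks[1:]):
            bpm = round(60.0 / (t1 - t0))  # IOI in seconds -> beats per minute
            if bpm_min <= bpm <= bpm_max:
                hist[bpm - bpm_min] += 1.0  # equal weights: an assumption
    smooth = gaussian_smooth(hist)
    best = max(range(len(smooth)), key=smooth.__getitem__)
    return bpm_min + best, smooth[best]


def estimate_tempo_stereo(left, right, rate):
    """Run left, right and mono inputs; keep the strongest histogram peak.

    "Strongest peak" is a guessed selection rule, not the one from the talk.
    """
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    candidates = [estimate_tempo(ch, rate) for ch in (left, right, mono)]
    return max(candidates, key=lambda c: c[1])[0]


# Synthetic test input: 8 seconds of clicks every 0.5 s, i.e. 120 beats per minute.
rate = 1000
left = [0.0] * (8 * rate)
for i in range(0, len(left), rate // 2):
    left[i] = 1.0
right = [0.5 * x for x in left]
print(estimate_tempo_stereo(left, right, rate))
```

On this synthetic click track the deeper DWT levels place some peaks a few
milliseconds early or late, so their IOIs fall into neighbouring bins. The
Gaussian smoothing merges these into a single histogram peak.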

We tested our system using our data set of 50 musical recordings and the
data used in a tempo extraction contest held during the International
Conference on Music Information Retrieval (ISMIR 2004). We obtained the
correct tempo for 47 out of the 50 songs in our data set, an accuracy of
94%. On the contest data, our system ranked 2nd out of 12 for one musical
data set and 3rd out of 12 for the other. These results show that our
system is competitive with the others used in the contest.

This is joint work with Mr. TSANG, Raymond Kei-Man.


************************
Biography:

Dr. David Rossiter is a Visiting Assistant Professor in the Department of
Computer Science and Engineering at HKUST. His research interests lie in
the field of multimedia processing, particularly for audio.