Tempo Extraction using the Discrete Wavelet Transform
Speaker: Dr. David ROSSITER
         Department of Computer Science and Engineering
         Hong Kong University of Science and Technology

Title: "Tempo Extraction using the Discrete Wavelet Transform"

Date: Monday, 13 November 2006

Time: 4:00pm - 5:00pm

Venue: Lecture Theatre F (Leung Yat Sing Lecture Theatre, near lift nos. 25/26), HKUST

Abstract:

This work presents a method to extract the tempo (that is, the speed) of a recording of music. It has particular applications in multimedia search engines and music/audio classification.

The interval between two successive beats in a musical sequence is called the inter-onset interval (IOI), a parameter of considerable significance in many tempo estimation algorithms. To investigate IOI characteristics we created a data set of 50 musical recordings extracted from audio CDs. Two musicians were then studied while they reproduced the beat sequence of the recordings, and the resulting observations of IOI were incorporated into our tempo extraction system.

The system first applies a discrete wavelet transform (DWT) to the audio file, decomposing the input signal into four levels of DWT coefficients. A peak detection algorithm, which incorporates the IOI observations made previously, then extracts all peaks from these coefficients. The IOIs are calculated from all of the peaks and weighted appropriately. The weighted IOIs form a histogram, which is then smoothed using a Gaussian function.

Audio input in stereo format is treated as three different inputs: the left channel, the right channel and the mono channel. We pass all three inputs through the system and automatically select the best one as the final tempo result.

We tested our system using our data set of 50 musical recordings and data used in a tempo extraction contest during the International Conference on Music Information Retrieval (ISMIR 2004).
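The processing chain described in the abstract (four-level DWT, peak detection, weighted IOI histogram, Gaussian smoothing) might be sketched roughly as follows. This is an illustration only, not the speakers' implementation. The abstract does not name the wavelet, the peak threshold, the weighting scheme, the tempo range or the width of the Gaussian, so the Haar wavelet, the 50% threshold, the min-of-two-peaks weight, the 40-240 BPM range and `sigma_bins` below are all assumptions, as are the function names. The musician-derived IOI observations and the stereo channel selection are not modelled.

```python
import numpy as np

def haar_details(signal, levels=4):
    """Detail coefficients of a Haar DWT, one array per level (finest first).
    The Haar wavelet is an assumption; the abstract does not name one."""
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        approx = approx[: len(approx) // 2 * 2]   # force an even length
        pairs = approx.reshape(-1, 2)
        details.append((pairs[:, 0] - pairs[:, 1]) / np.sqrt(2.0))
        approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2.0)
    return details

def find_peaks(envelope, rel_threshold=0.5):
    """Indices of local maxima above rel_threshold * max (assumed rule)."""
    if envelope.size < 3 or envelope.max() <= 0:
        return np.array([], dtype=int)
    mid = envelope[1:-1]
    is_peak = ((mid > envelope[:-2]) & (mid >= envelope[2:])
               & (mid > rel_threshold * envelope.max()))
    return np.flatnonzero(is_peak) + 1

def estimate_tempo(signal, sample_rate, levels=4,
                   bpm_min=40, bpm_max=240, sigma_bins=2.0):
    """Return (tempo in BPM, smoothed histogram) for a mono signal."""
    centers = np.arange(bpm_min, bpm_max + 1, dtype=float)  # 1-BPM bins
    edges = np.append(centers - 0.5, centers[-1] + 0.5)
    hist = np.zeros(centers.size)
    for level, detail in enumerate(haar_details(signal, levels), start=1):
        envelope = np.abs(detail)
        peaks = find_peaks(envelope)
        if peaks.size < 2:
            continue
        # Each coefficient at this level covers 2**level input samples.
        times = peaks * 2 ** level / sample_rate
        iois = np.diff(times)                     # successive-peak intervals
        strength = envelope[peaks] / envelope.max()
        # Assumed weighting: the weaker of the two peaks bounding the IOI.
        weights = np.minimum(strength[:-1], strength[1:])
        level_hist, _ = np.histogram(60.0 / iois, bins=edges, weights=weights)
        hist += level_hist
    # Smooth the pooled histogram with a normalised Gaussian kernel.
    radius = int(3 * sigma_bins)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma_bins) ** 2)
    smoothed = np.convolve(hist, kernel / kernel.sum(), mode="same")
    # If no IOIs were found the histogram is flat and bpm_min is returned.
    return centers[np.argmax(smoothed)], smoothed

def click_track(bpm, sample_rate=8000, seconds=8.0):
    """Synthetic test input: a short decaying burst on every beat."""
    signal = np.zeros(int(sample_rate * seconds))
    k = np.arange(64)
    burst = np.exp(-k / 16.0) * (-1.0) ** k
    period = int(round(sample_rate * 60.0 / bpm))
    for start in range(period // 2, signal.size - burst.size, period):
        signal[start:start + burst.size] = burst
    return signal

tempo, _ = estimate_tempo(click_track(120), sample_rate=8000)
print(tempo)
```

On the synthetic 120 BPM click track every IOI is 0.5 s, so the smoothed histogram peaks at 120 BPM. Real recordings would need a proper onset envelope and more careful peak selection than this sketch provides.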
We obtained the correct tempo for 47 of the 50 songs in our data set (94%). In the contest, our system ranked 2nd out of 12 on one musical data set and 3rd out of 12 on the other. These results show that our system is competitive with the others used in the contest.

This is joint work with Mr. TSANG, Raymond Kei-Man.

************************

Biography:

Dr. David Rossiter is a Visiting Assistant Professor in the Department of Computer Science and Engineering at HKUST. His research interests lie in the field of multimedia processing, particularly for audio.