An Investigation of Using Quasi-tri-syllables as the Basic Acoustic Modeling Units for Automatic Speech Recognition
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

FYT Presentation and Demonstration

Title: "An Investigation of Using Quasi-tri-syllables as the Basic Acoustic Modeling Units for Automatic Speech Recognition"

by

Miss Chu Yeuk Ting, Rovenna

Abstract

The accuracy and speed of current automatic speech recognition (ASR) systems are still not satisfactory, especially in continuous speech recognition. The most common way of modeling speech acoustics is a phone-based approach such as tri-phones, which model each phoneme in a word together with its neighbouring phonemes. However, the phone-based approach to ASR has some limitations, including difficulty in handling phonetic variations (especially phone deletion), an excessive number of commonly occurring patterns, and an inability to capture long-scale temporal variations. These factors can reduce the effectiveness of tri-phones in recognising continuous speech. Moreover, languages such as Mandarin, Cantonese and Japanese are syllable-based, so it might be more appropriate to include syllables in the basic modeling unit for those languages.

This paper presents preliminary but encouraging results on using a new acoustic unit, the quasi-tri-syllable (QTS), to reduce the word error rate of the acoustic models. Text analysis has been performed on English, Cantonese and Mandarin corpora, and the results show that QTS can help to increase the scale of temporal dependencies and to reduce the number of distinct models with poorly trained parameters.

Date : 30 April 2008 (Wednesday)
Time : 4-5pm
Venue : 3315
Advisor : Dr. Brian Mak
2nd Reader : Dr. D.Y. Yeung

******* ALL are Welcome *******