An Investigation of Using Quasi-tri-syllables as the Basic Acoustic Modeling Units for Automatic Speech Recognition

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


FYT Presentation and Demonstration


Title: "An Investigation of Using Quasi-tri-syllables as the Basic Acoustic
Modeling Units for Automatic Speech Recognition"

by

Miss Chu Yeuk Ting, Rovenna


Abstract

The performance and reliability of current automatic speech recognition
(ASR) systems are still unsatisfactory in terms of accuracy and speed,
especially in continuous speech recognition. The most common way of
modeling speech acoustics has been a phone-based approach such as the
tri-phone, which considers each phoneme in a word together with its
neighbouring phonemes. However, the phone-based approach to ASR has some
limitations: difficulty in handling phonetic variations, especially phone
deletion; an excessive number of commonly occurring patterns; and an
inability to capture long-scale temporal variations. These factors can
reduce the effectiveness of tri-phones in recognising continuous speech.
On the other hand, languages like Mandarin, Cantonese and Japanese are
syllable-based, so it might be more appropriate to include syllables in
the basic modeling unit for those languages.
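
As a rough illustration of the tri-phone idea described above, the
following minimal sketch expands a phone sequence into context-dependent
labels. The function name to_triphones, the "left-centre+right" label
notation and the sample transcription of "cat" are assumptions made for
this example only; they are not taken from the work being presented.

```python
def to_triphones(phones):
    """Expand a phone sequence into context-dependent tri-phone labels.

    Labels use a "left-centre+right" notation (an assumption for this
    sketch). Phones at a word boundary simply omit the missing context.
    """
    labels = []
    for i, centre in enumerate(phones):
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        label = centre
        if left is not None:
            label = f"{left}-{label}"
        if right is not None:
            label = f"{label}+{right}"
        labels.append(label)
    return labels


# Illustrative transcription of "cat"; the phone symbols are hypothetical.
full = to_triphones(["k", "ae", "t"])
print(full)      # ['k+ae', 'k-ae+t', 'ae-t']

# If the middle phone is deleted, every remaining label changes as well.
reduced = to_triphones(["k", "t"])
print(reduced)   # ['k+t', 'k-t']
```

In this toy case, deleting one phone leaves no label in common between
the two sequences, which gives some intuition for why phone deletion is
hard for units whose identity depends on their immediate neighbours.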

This paper presents preliminary but encouraging results on using a new
acoustic unit, the Quasi-tri-syllable (QTS), to reduce the word error rate
of the acoustic models. Text analysis of English, Cantonese and Mandarin
corpora has been performed, and the results show that QTS can help to
increase the scale of temporal dependencies and to reduce the number of
distinct models with poorly trained parameters.


Date		:  30 April 2008 (Wednesday)

Time		:  4-5pm

Venue		:  3315

Advisor		:  Dr. Brian Mak

2nd Reader	:  Dr. D.Y. Yeung


*******  ALL are Welcome  *******
