Word Sense Disambiguation for Statistical Machine Translation
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Word Sense Disambiguation for Statistical Machine Translation"

By

Miss Marine Jacinthe Carpuat

Abstract

In this thesis, we show for the first time that lexical semantics modelling is useful in Statistical Machine Translation (SMT). Word Sense Disambiguation (WSD), the task of resolving sense ambiguity to identify the right translation of a word, is one of the major challenges faced by language translation systems. If the English word "drug" translates into French as either "drogue" (used as a narcotic) or "medicament" (used as a medicine), then an English-French machine translation system needs to disambiguate every use of "drug" in order to produce the correct translation.

Considerable effort has gone into designing and evaluating dedicated WSD models, in particular through the Senseval series of workshops. This is partly motivated by the often unstated assumption that any full translation system, to achieve full performance, will sooner or later have to incorporate individual WSD components. However, in most machine translation architectures, and in SMT in particular, the WSD problem is typically not explicitly addressed. This paradoxical situation has encouraged speculation that recent progress in SMT shows that SMT models are already very good at WSD, and that current WSD systems have nothing to offer state-of-the-art SMT.

Going beyond these untested assumptions and speculative claims, we conduct the first direct, extensive empirical study of the strengths and weaknesses of WSD and SMT. Using the state-of-the-art HKUST WSD system, we show, surprisingly, that incorporating WSD predictions in SMT does not help translation quality. Puzzlingly, we also report results suggesting that typical SMT models cannot disambiguate word translations as well as dedicated WSD systems.

These seemingly contradictory results lead us to generalize conventional WSD models to incorporate assumptions at least as strong as in state-of-the-art SMT. Specifically, (1) WSD targets are generalized from words to phrases, (2) WSD sense inventories and annotation are learned automatically in the same way as conventional SMT translation lexicons, and (3) WSD models are fully integrated in SMT decoding. Remarkably, the resulting generalized phrasal WSD-for-SMT models improve translation quality across four different Chinese-to-English translation tasks, as measured by eight common automatic evaluation metrics. Further analysis reveals that generalization from conventional WSD to generalized phrasal WSD-for-SMT is necessary in order to obtain consistent improvements in translation quality.

Date: Monday, 14 January 2008

Time: 2:00 p.m. - 4:00 p.m.

Venue: Room 3301
       Lifts 17-18

Chairman: Prof. Andrew Miller (BIOL)

Committee Members: Prof. Dekai Wu (Supervisor)
                   Prof. Brian Mak
                   Prof. Dit-Yan Yeung
                   Prof. Pascale Fung (ECE)
                   Prof. Bertram Shi (ECE)
                   Prof. Dragomir Radev (EECS, Univ. of Michigan)

**** ALL are Welcome ****