Making sense in translation: Addressing lexical choice errors when translating across domains

======================================================================
                Joint Seminar
======================================================================
The Hong Kong University of Science & Technology
Human Language Technology Center
Department of Computer Science and Engineering
Department of Electronic and Computer Engineering
---------------------------------------------------------------------
Speaker:        Dr. Marine CARPUAT
                National Research Council
                Canada

Title:          "Making sense in translation: Addressing lexical choice
                 errors when translating across domains"

Date:           Monday, 28 October 2013

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater F (near lifts 25 or 26), HKUST

Abstract:

While Statistical Machine Translation has achieved significant progress in
recent years, state-of-the-art systems cannot yet be trusted to convey the
correct semantics of the source text. Performance is particularly poor
when systems are applied to test domains that differ from their training
domain.

In this talk, we will first present an analysis of lexical choice errors
observed when porting a French-English system trained on the Canadian
Hansard to very different new domains (e.g., scientific papers or movie
subtitles). We will show that many errors fall into a category that has
not received much attention to date: French words that acquire new senses
in the new domain. For instance, the word "régime" is frequently used in
the "political regime" sense in the Hansard, while the previously unseen
"diet" sense is more frequent in scientific articles.

Second, we will introduce a novel approach for detecting such words
automatically, using cues inspired by word sense disambiguation and
induction models. This case study highlights the potential for future
research at the intersection of machine translation and lexical semantics.


*******************
Biography:

Dr. Marine Carpuat is a Research Officer at the National Research Council
Canada, where she works on natural language processing and statistical
machine translation. Before joining the NRC, Marine was a postdoctoral
researcher at Columbia University in New York. She received a PhD in
Computer Science from the Hong Kong University of Science & Technology
(HKUST) in 2008, an MPhil in Electrical Engineering also from HKUST in
2002, and a Diplôme d'Ingénieur from the French Grande École Supélec.