MEANT: A Highly Accurate Semantic Frame Based Evaluation Metric for Improving Machine Translation Utility

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "MEANT: A Highly Accurate Semantic Frame Based Evaluation Metric for 
Improving Machine Translation Utility"

By

Miss Chi Kiu LO


Abstract

We show that the quality of machine translation (MT) output of different genres 
(written news, public speech, etc.), for different output languages (Chinese 
and English), from different MT paradigms (phrase-based and hierarchical) and 
using different optimization strategies is consistently improved by guiding the 
MT system to preserve the meaning of the input sentence using our novel 
semantic frame based objective function, MEANT, which better reflects 
translation utility than commonly used surface-form based objective 
functions such as BLEU.

Current MT systems are often able to output fluent, nearly grammatically 
correct translations with roughly the correct words but still make glaring 
errors caused by confusion of semantic roles and fail to express the original 
meaning of the input. A useful translation should be one that helps its reader 
to understand the original meaning of the input utterance accurately. However, 
over the past decade, the development of MT systems has been driven by BLEU and 
other fast and cheap n-gram surface-form matching based MT evaluation metrics 
which fail to reflect translation utility – how useful a translation is, or in 
other words, how accurately human readers can understand the original meaning 
of the input utterance. Even when human judgment clearly indicates that a 
translation has serious mistakes in conveying the meaning of the input 
utterance, n-gram surface-form matching based evaluation metrics typically 
register little difference. Frame semantics capture the essential meaning of a 
sentence in the basic event structure – “who did what to whom, for whom, when, 
where, why and how”. As the performance of MT systems has plateaued, we argue 
that it is time for a new semantic frame based MT evaluation metric that 
focuses on reflecting the degree of correctness in meaning of the translation 
to drive MT systems to produce more adequate and useful translations.

In this thesis, we first introduce HMEANT, a human-involved semi-automatic 
semantic frame based MT evaluation metric that correlates better with human 
judgment on translation adequacy than both the automatic MT evaluation metrics 
and HTER, the state-of-the-art semi-automatic adequacy-oriented 
MT evaluation metric, at a lower labor cost. We go on to fully automate HMEANT 
into MEANT and show that MEANT correlates with human adequacy judgment as well 
as or better than the state-of-the-art automatic MT evaluation metrics in 
scoring the quality of the MT output against the human reference translation 
for a wide range of output languages (Czech, English, German, French, Hindi, 
Romanian and Russian), with fewer language-dependent resources and higher score 
interpretability. We then present XMEANT, the cross-lingual variant of MEANT that 
approximates MEANT by scoring the quality of the MT output against the input 
sentence when the costly human reference translations are not available for MT 
evaluation. Most importantly, we empirically demonstrate that MT systems 
optimized against MEANT show improved translation quality, in terms of the most 
commonly used automatic MT evaluation metrics across different genres, language 
pairs, MT paradigms and optimization strategies.


Date:			Monday, 21 May 2018

Time:			4:00pm - 6:00pm

Venue:			Room 4472
 			Lifts 25/26

Chairman:		Prof. Hai Yang (CIVL)

Committee Members:	Prof. Dekai Wu (Supervisor)
 			Prof. Brian Mak
 			Prof. Dit-Yan Yeung
 			Prof. Ming Liu (ECE)
 			Prof. Preslav Nakov (Hamad Bin Khalifa University)
 			Prof. Pierre Nugues (Lund University)


**** ALL are Welcome ****