PhD Qualifying Examination

"Survey: Statistical Machine Translation and Example-Based Machine Translation"

By Mr. Yihai Shen

Abstract:

We present, contrast, and critique two major paradigms within corpus-based machine translation (MT), namely statistical machine translation (SMT) and example-based machine translation (EBMT). We first present the major techniques used in SMT translation models, in the order of word-based, phrase-based, and syntax-based models. We find that translation models in SMT have evolved to incorporate progressively more linguistic information, especially the structural aspects of language. This matters for both speed and accuracy, particularly for language pairs that differ substantially in structure. We then present the EBMT paradigm, whose three major steps are matching, alignment, and recombination, and we contrast the major techniques used in each of these steps. We observe that EBMT works well in cases where syntactic information (rules) serves poorly. We therefore think it promising to combine the strengths of syntax-based SMT systems and EBMT under one framework, in which EBMT handles the cases that are not suitable for syntax-based SMT. Possible approaches include applying stochastic methods to EBMT systems and using multi-engine systems.

Date: Tuesday, 10 February 2004
Time: 9:00 a.m. - 11:00 a.m.
Venue: Room 3401, lifts 17-18

Committee Members:
Prof. Dekai Wu (Supervisor)
Prof. Brian Mak (Chairperson)
Prof. Hongjun Lu
Prof. Pascale Fung (ELEC)

**** ALL are Welcome ****