PhD Qualifying Examination

"A Survey of Word Sense Disambiguation"

By

Mr. Weifeng Su

Abstract:

We compare, contrast, and critique recent Word Sense Disambiguation (WSD) models employing supervised, slightly supervised, and unsupervised machine learning. We first compare nine supervised learning methods and find that there is no universal best method for WSD applications; the choice of machine learning method should depend on the specific characteristics of the WSD application. We then compare slightly supervised approaches and find that bilingual bootstrapping outperforms monolingual bootstrapping in accuracy by exploiting a second-language corpus. Finally, we compare unsupervised approaches and find that a method based on Roget's thesaurus performs well on concrete nouns, whereas a method based on a second-language corpus usually performs better on verbs and adjectives. Which WSD model is best therefore depends on the type of the target word.

The field of WSD currently faces a dilemma. On one hand, although supervised WSD can achieve high disambiguation precision, it requires impractically large annotated corpora containing sense-labeled training instances. On the other hand, the precision achieved by slightly supervised and unsupervised methods is far from satisfactory. This dilemma prevents WSD from being applied in many NLP applications.

Date: Thursday, January 29, 2004
Time: 3:00 p.m. - 5:00 p.m.
Venue: Room 2302 (Lifts 17-18)

Committee Members:
Prof. Dekai Wu (Supervisor)
Prof. Fangzhen Lin (Chairperson)
Prof. Dit-Yan Yeung
Prof. Pascale Fung

**** ALL are Welcome ****