A Survey on Modeling Word Burstiness
PhD Qualifying Examination

Title: "A Survey on Modeling Word Burstiness"

by

Mr. Di JIANG

Abstract:

Multinomial distributions are often used to model text documents. However, they do not capture well the phenomenon that words in a document tend to appear in bursts: if a word appears once, it is more likely to appear again. This failure to capture burstiness hinders the wide application of conventional models in information retrieval and natural language processing. We recognize that a critical review of existing models is needed in order to design and develop better paradigms that can meet the diverse challenges that arise in burstiness-aware document modeling.

Within a unifying set of notations and terminologies, we describe in this paper the efforts and main techniques for modeling word burstiness and present a comprehensive survey of a number of state-of-the-art approaches. We classify the burstiness models into three major categories based on the techniques they adopt, in order to provide insights into how and why the techniques are effective. We also discuss several real-world applications in which burstiness-aware models demonstrate superior performance compared to multinomial distributions.

Date: Thursday, 2 August 2012

Time: 2:00pm - 4:00pm

Venue: Room 3501, lifts 25/26

Committee Members:
Dr. Wilfred Ng (Supervisor)
Prof. Shing-Chi Cheung (Chairperson)
Prof. Dik-Lun Lee
Dr. Raymond Wong

**** ALL are Welcome ****