A Survey on Probabilistic Topic Modeling
PhD Qualifying Examination

Title: "A Survey on Probabilistic Topic Modeling"

by

Miss Peixian CHEN

Abstract:

The booming volume of digitized document collections makes it increasingly difficult to find desired information within a reasonable time. Many computational and analytical tools have been developed to meet this challenge. Among them, topic modeling is currently the most widely used approach that offers an underlying semantic interpretation. It aims to discover patterns in the use of words that can be utilized to organize and summarize the documents in a corpus.

In this paper, we give a survey of probabilistic topic modeling. In probabilistic topic modeling, topics are modeled as probability distributions over a vocabulary. The documents are assumed to have been produced from a set of unobserved topics through a probabilistic generative process. Statistical inference is performed to invert the generative process and identify the topics.

In this survey, we first discuss the basic probabilistic topic models and their associated inference algorithms. We then concentrate on extensions to the basic models that address the modeling of topic correlations, the automatic determination of the number of topics, and the evolution of topics over time. Evaluation methods, along with a comparison of the above probabilistic topic models, will also be presented.

Date: Wednesday, 23 April 2014

Time: 10:00am - 12:00noon

Venue: Room 5563, Lifts 27/28

Committee Members:
Prof. Nevin Zhang (Supervisor)
Prof. Dit-Yan Yeung (Chairperson)
Prof. Dik-Lun Lee
Dr. Raymond Wong

**** ALL are Welcome ****