Non-parametric Topic Models and Variants
========================================================================
                             Joint Seminar
========================================================================
The Hong Kong University of Science & Technology

Dept. of Computer Science and Engineering
Human Language Technology Center
Dept. of Electronic and Computer Engineering
Dept. of Information Systems, Business Statistics & Operations Management
Dept. of Mathematics
------------------------------------------------------------------------

Speaker:        Professor Wray Buntine
                Monash University

Title:          "Non-parametric Topic Models and Variants"

Date:           Monday, 14 December 2015

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater H (near lifts 27 & 28), HKUST

Abstract:

The output of topic models has always been seductive but not quite
satisfying ever since the early work of Hofmann (PLSI) and Lee and Seung
(NMF). This talk argues that topic models themselves need attention. New
ways of modelling document semantics are being explored in the field of
deep neural networks. Similarly, non-parametric versions of topic models
allow modelling such effects as document structure, word sparsity, word
burstiness, background words, multi-word terms, and network effects from
author or follower networks, and semantic word hierarchies. These are
usually done in the spirit of deep neural networks using hierarchical
models, but earlier algorithms were often too slow to be realistic.

This talk will start with a brief tour of some of the variants, which
can only be superficial given the huge number. This will be followed by
a brief tour of some non-parametric methods known to be moderately
efficient and suiting multi-core implementation. The talk will then
present experimental results on various versions of topic models to see
how they can mitigate some of the unwanted artifacts of simple LDA.

******************
Biography:

Wray Buntine joined Monash University as a professor in February 2014
after 7 years at NICTA in Canberra, Australia.
He is a co-director of the Machine-Learning Flagship and director of the
Master of Data Science. He was previously at the Helsinki Institute for
Information Technology from 2002, and at NASA Ames Research Center, the
University of California, Berkeley, and Google. He is known for his
theoretical and applied work in document and text analysis, data mining
and machine learning, and probabilistic methods. He helped found machine
learning and statistical areas such as Bayesian model averaging,
relational learning, and graphical models for machine learning. He
reviews for conferences such as ACML, SIGIR, ACL, ECML-PKDD, ICML, NIPS,
UAI, and KDD, and is on the editorial board of Data Mining and Knowledge
Discovery. He distributes the HCA topic modelling suite, the only
multi-core non-parametric topic modelling software with speed comparable
to that of parametric software.