Non-parametric Topic Models and Variants

========================================================================
                Joint Seminar
========================================================================
The Hong Kong University of Science & Technology
Dept. of Computer Science and Engineering
Human Language Technology Center
Dept. of Electronic and Computer Engineering
Dept. of Information Systems, Business Statistics & Operations Management
Dept. of Mathematics
------------------------------------------------------------------------

Speaker:        Professor Wray Buntine
                Monash University

Title:          "Non-parametric Topic Models and Variants"

Date:           Monday, 14 December 2015

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater H (near lifts 27 & 28), HKUST

Abstract:

The output of topic models has always been seductive but not quite
satisfying ever since the early work of Hofmann (PLSI) and Lee and Seung
(NMF). This talk argues that topic models themselves need attention. New
ways of modelling document semantics are being explored in the field of
deep neural networks. Similarly, non-parametric versions of topic models
allow modelling such effects as document structure, word sparsity, word
burstiness, background words, multi-word terms, network effects from
author or follower networks, and semantic word hierarchies. These are
usually done in the spirit of deep neural networks using hierarchical
models, but earlier algorithms were often too slow to be realistic.  This
talk will start with a brief tour of some of the variants, which can only
be superficial given the huge number. This will be followed by a brief
tour of some non-parametric methods known to be moderately efficient and
suited to multi-core implementation.  The talk will then present
experimental results on various versions of topic models to see how they
can mitigate some of the unwanted artifacts of simple LDA.


******************
Biography:

Wray Buntine joined Monash University as a professor in February 2014
after 7 years at NICTA in Canberra, Australia.  He is a co-director of the
Machine-Learning Flagship and director of the Master of Data Science.  He
was previously at the Helsinki Institute for Information Technology from 2002,
and at NASA Ames Research Center, University of California, Berkeley, and
Google. He is known for his theoretical and applied work in document and
text analysis, data mining and machine learning, and probabilistic
methods.  He helped found machine learning and statistical areas such as
Bayesian model averaging, relational learning, and graphical models for
machine learning.  He reviews for conferences such as ACML, SIGIR, ACL,
ECML-PKDD, ICML, NIPS, UAI, and KDD, and is on the editorial board of Data
Mining and Knowledge Discovery.  He distributes the HCA topic modelling
suite, the only multi-core non-parametric topic modelling software with
comparable speed to parametric software.