Efficient Learning of Hierarchical Naive Bayes Model

MPhil Thesis Defence


Title: "Efficient Learning of Hierarchical Naive Bayes Model"

By

Miss MingXing CHEN


Abstract

The hierarchical naive Bayes (HNB) model is useful in latent variable 
discovery and classification. It introduces latent variables into a naive 
Bayes (NB) model to represent potential conditional dependences among 
attribute variables that are not captured by the class variable. Hence 
HNBs can model more complex dependencies among the attributes and 
alleviate the disadvantages of NB models for classification.

More formally, HNB models are tree-shaped Bayesian networks (BNs) in which 
the root represents the class variable, leaf nodes represent attributes, 
and internal nodes represent latent variables.
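
As an illustration of the tree shape described above, the following is a 
minimal Python sketch. The names Node, is_valid_hnb, and the example 
variables C, H1, X1, X2, X3 are hypothetical and do not come from the 
thesis; the sketch only checks the structural constraint (class root, 
latent internal nodes, attribute leaves) and says nothing about 
parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # one of "class", "latent", "attribute"
    children: list = field(default_factory=list)


def is_valid_hnb(root):
    """Return True if the tree has a class root, latent internal nodes,
    and attribute leaves."""
    if root.kind != "class" or not root.children:
        return False
    stack = list(root.children)
    while stack:
        node = stack.pop()
        if node.children:
            # Any non-root node with children must be a latent variable.
            if node.kind != "latent":
                return False
            stack.extend(node.children)
        elif node.kind != "attribute":
            # Every leaf must be an observed attribute.
            return False
    return True


# Hypothetical example: latent H1 groups X1 and X2; X3 hangs off the class.
hnb = Node("C", "class", [
    Node("H1", "latent", [Node("X1", "attribute"), Node("X2", "attribute")]),
    Node("X3", "attribute"),
])
```

A plain NB model is the special case with no latent nodes, where every 
attribute is a direct child of the class variable.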

This thesis is concerned with the problem of learning HNB models from 
data. General learning algorithms for HNB models are computationally 
expensive and thus cannot be applied to large-scale problems. In this 
thesis, we propose a new, efficient algorithm that builds the model 
structure bottom-up. At each iteration, it first uses conditional mutual 
information (CMI) to measure the correlations among variables given the 
class variable. It then selects a strongly correlated subset of variables 
using the Unidimensionality Test (UT) and adds a latent variable as their 
common parent. We name the algorithm CMI-UT after these two key 
components.
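
To make the CMI step concrete, the sketch below estimates the conditional 
mutual information I(X; Y | C) of two discrete attributes given the class 
from empirical counts. It is an illustration under assumptions, not the 
thesis implementation: the function name and the plain plug-in 
(maximum-likelihood) estimator are choices made here, and the abstract 
does not say which estimator CMI-UT uses. The Unidimensionality Test is 
not sketched because the abstract does not describe it.

```python
from collections import Counter
from math import log


def conditional_mutual_information(xs, ys, cs):
    """Plug-in estimate of I(X; Y | C) in nats from three equal-length
    sequences of discrete values."""
    n = len(xs)
    n_xyc = Counter(zip(xs, ys, cs))
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_c = Counter(cs)
    cmi = 0.0
    for (x, y, c), count in n_xyc.items():
        # p(x,y,c) * log[ p(c) p(x,y,c) / (p(x,c) p(y,c)) ]; the 1/n
        # factors inside the logarithm cancel, leaving raw counts.
        ratio = n_c[c] * count / (n_xc[(x, c)] * n_yc[(y, c)])
        cmi += (count / n) * log(ratio)
    return cmi
```

If X and Y are both fully determined by the class, the estimate is zero, 
which matches the intuition that the class variable already captures 
their dependence. A clearly positive value indicates residual dependence 
that a latent variable might explain.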

We empirically study CMI-UT on synthetic data, UCI data, and real-world 
data. The results show that CMI-UT is significantly faster than the 
previous algorithm, and that the difference grows with the sample size. 
At the same time, it does not significantly compromise model quality, 
that is, its performance in latent variable discovery and classification.


Date:			Friday, 30 July 2010

Time:			10:00am – 12:00noon

Venue:			Room 3501
 			Lifts 25/26

Committee Members:	Prof. Nevin Zhang (Supervisor)
 			Prof. Dit-Yan Yeung (Chairperson)
 			Dr. Raymond Wong


**** ALL are Welcome ****