Bayesian Co-Training: Concepts and Extensions
Speaker: Dr. Shipeng YU
         Siemens Medical Solutions USA, Inc.

Title: "Bayesian Co-Training: Concepts and Extensions"

Date: Monday, 31 March 2008
Time: 11:00 am - 12 noon
Venue: Room 3315 (via lifts 17/18), HKUST

Abstract:

Co-training is a popular algorithm for semi-supervised classification and has been applied to many real-world problems. When the input data have multiple representations or views (e.g. each web page has the text representation as one view and the hyperlinks from other pages as another view), co-training works by iteratively labeling some unlabeled data using a classifier trained on each view, and enlarging the training set.

In this talk we present our recent work on Bayesian co-training, which is an undirected graphical model for co-training. The model clarifies some previously unclear assumptions about co-training, and takes the standard co-training and many of its extensions (e.g. co-regularization) as special cases. A co-training kernel will also be introduced in a Gaussian process (GP) framework, which allows efficient learning with a one-step, globally optimal solution. Extensions of Bayesian co-training will also be discussed, which include: 1) the Bayesian co-training framework with missing view information; 2) active view acquisition, where we are allowed to select a previously unobserved (data, view) pair to acquire so that the overall performance is optimized. Experiments on web page classification and some medical applications will be presented at the end of the talk.

**********************

Biography:

Shipeng YU is currently a staff scientist at Siemens Medical Solutions USA, Inc. He received his B.Sc. and M.Sc. degrees in mathematics from Peking University in 2000 and 2003, respectively, and finished his Ph.D. in computer science at the University of Munich in Germany in 2006. He has been working in many areas of statistical machine learning, such as Gaussian processes, Dirichlet processes, probabilistic dimensionality reduction, ordinal regression and semi-supervised learning. He is also interested in machine learning applications in data mining, information and image retrieval, and user modeling.
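
Illustration (not part of the talk material): for readers unfamiliar with the standard co-training loop mentioned in the abstract, the following is a minimal sketch of that iterative scheme only, not of the Bayesian co-training model presented in the talk. Everything in it is an assumption made for illustration: the nearest-centroid classifier, the margin-based confidence score, the one-dimensional toy views, and the names fit_centroids, predict and co_train.

```python
def fit_centroids(xs, ys):
    """Fit a nearest-centroid classifier on 1-D features: one mean per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}


def predict(model, x):
    """Return (label, confidence); confidence is the gap between the
    distances to the two nearest class centroids."""
    dists = sorted((abs(x - mean), c) for c, mean in model.items())
    label = dists[0][1]
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return label, margin


def co_train(view1, view2, labels, rounds=10, per_round=1):
    """Standard co-training sketch (hypothetical helper).

    view1, view2: two feature lists describing the same examples.
    labels: dict mapping example index -> known label.
    Returns a new dict with the enlarged label set.
    """
    labels = dict(labels)  # copy so the caller's dict is not mutated
    n = len(view1)
    for _ in range(rounds):
        for view in (view1, view2):
            unlabeled = [i for i in range(n) if i not in labels]
            if not unlabeled:
                return labels
            idx = list(labels)
            # Train a classifier on this view using all labels gathered so far.
            model = fit_centroids([view[i] for i in idx],
                                  [labels[i] for i in idx])
            # Rank unlabeled examples by this view's confidence.
            scored = sorted(((predict(model, view[i]), i) for i in unlabeled),
                            key=lambda t: t[0][1], reverse=True)
            # Add the most confident predictions to the shared training set.
            for (label, _margin), i in scored[:per_round]:
                labels[i] = label
    return labels


# Toy data: examples 0-3 belong to class 0, examples 4-7 to class 1.
view1 = [0.0, 0.1, 0.25, 0.45, 1.0, 0.95, 0.8, 0.55]
view2 = [0.0, 0.3, 0.15, 0.05, 1.0, 0.7, 0.8, 0.9]
result = co_train(view1, view2, {0: 0, 4: 1})
print(sorted(result.items()))
```

In this toy data, examples 3 and 7 lie near the class boundary in the first view but are clearly separated in the second, which is the kind of situation where labels proposed from one view can help the classifier on the other.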