Machine Learning for Bioinformatic Data Mining

Speaker:	Professor S.Y. Kung
		Princeton University
		USA

Title:		"Machine Learning for Bioinformatic Data Mining"

Date:		Thursday, 23 March 2006

Time:		2:00pm - 3:00pm

Venue:		Room 3301 (via lift nos. 17/18)
		HKUST

ABSTRACT:

Genomic bioinformatics represents a natural convergence of life science
and information science. DNA sequencing and expression profiling
represent the two main modalities of genomic information. The genome
is not just a collection of genes working in isolation; it encompasses
global and highly coordinated control of information to carry out a range
of cellular functions. Therefore, it is imperative to conduct a
genome-wide exploration. Note that genome-wide analysis via pure DNA
sequencing is computationally prohibitive. In contrast, expression of
several thousands of genes can be measured simultaneously by DNA
microarrays, thus permitting discovery of clusters of correlated genes.
Microarray data analysis will therefore play a vital role in future
genome-wide bioinformatic studies.
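
As a rough illustration of what "clusters of correlated genes" can mean in
practice, the sketch below groups toy expression profiles by Pearson
correlation. It is a minimal example under stated assumptions, not the
method presented in the talk: the gene names, the expression values, the
0.9 threshold and the greedy grouping rule are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlated_groups(profiles, threshold=0.9):
    """Greedy single-pass grouping (illustrative only).

    A gene joins the first group whose seed (first member) it correlates
    with at or above `threshold`; otherwise it starts a new group.
    """
    groups = []
    for name, profile in profiles.items():
        for group in groups:
            if pearson(profiles[group[0]], profile) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Made-up expression levels for four hypothetical genes over five conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [9.8, 8.1, 6.0, 4.2, 2.0],
}
groups = correlated_groups(profiles)
print(groups)
```

On this toy data the rising profiles (geneA, geneB) and the falling
profiles (geneC, geneD) end up in separate groups. Real microarray
analysis would typically involve normalisation and a proper clustering
algorithm rather than this greedy pass.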

It is crucial to know not only how to cluster data but also how to find
an appropriate way of looking at the genomic data. In other words,
extraction of relevant features is critical for cluster discovery. We
shall present a comprehensive set of coherence models to better capture
the biologically relevant features of genes. In addition, we adopt as the
classification architecture several existing neural networks, e.g. SVM or
decision-based neural networks (DBNN). Our fusion model is built upon the
classic mixture-of-experts (MOE) architecture: (1) a local expert is
assigned to cover each modality; (2) a gating agent is then adopted to
fuse the local scores to reach a Bayesian optimal decision. Based on the
standard yeast database, the proposed machine learning/fusion system
yields satisfactory performance in predicting several well-studied yeast
gene groups, e.g. ribosomal and molecular activity genes.
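
The two-step fusion idea (one local expert per modality, then a gating
agent that combines their scores) can be sketched as follows. This is a
minimal illustration, not the speaker's system: the expert outputs are
hard-coded hypothetical posteriors standing in for trained SVM or DBNN
classifiers, and the softmax gate with hand-set logits is an assumption.

```python
from math import exp


def softmax(logits):
    """Turn unnormalised gate scores into weights that sum to 1."""
    m = max(logits)
    exps = [exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(expert_posteriors, gate_logits):
    """Weighted average of per-expert class posteriors.

    expert_posteriors: one list of class probabilities per modality expert.
    gate_logits: one unnormalised confidence per expert (hypothetical input;
    a real gating agent would compute these from the data).
    """
    weights = softmax(gate_logits)
    n_classes = len(expert_posteriors[0])
    return [
        sum(w * p[c] for w, p in zip(weights, expert_posteriors))
        for c in range(n_classes)
    ]


def decide(fused):
    """Pick the class with the largest fused posterior."""
    return max(range(len(fused)), key=lambda c: fused[c])


# Hypothetical posteriors for one gene; class 1 = "belongs to the gene group".
sequence_expert = [0.6, 0.4]    # assumed output of a sequence-based expert
expression_expert = [0.1, 0.9]  # assumed output of an expression-based expert

# Equal gate logits weight both modalities equally.
fused_equal = fuse([sequence_expert, expression_expert], [0.0, 0.0])
label_equal = decide(fused_equal)

# A gate that strongly trusts the sequence expert changes the decision.
fused_seq = fuse([sequence_expert, expression_expert], [10.0, 0.0])
label_seq = decide(fused_seq)

print(fused_equal, label_equal)
print(fused_seq, label_seq)
```

Choosing the class with the largest fused posterior is the usual
minimum-error decision rule, provided the fused values are good estimates
of the true posteriors. How the gate weights are learned is the part this
sketch leaves out.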

With massive amounts of data to be analyzed, genomic study will
inevitably become dependent on advanced machine learning techniques. On
the other hand, any computationally based genomic prediction remains
untrustworthy until a careful and laborious biological verification is
performed. This points to an increasingly symbiotic relationship between
machine learning and genomic technologies.



************************
Biography:

Professor S.Y. Kung received his Ph.D. degree in Electrical Engineering
from Stanford University in 1977. He was an Associate Engineer at Amdahl
Corporation, Sunnyvale (1974), and a Professor of Electrical
Engineering-Systems at the University of Southern California (1977-1987).
Since 1987, he has been a Professor of Electrical Engineering at
Princeton University.

He has held a Visiting Professorship at Stanford University (1984); a
Visiting Professorship at the Delft University of Technology (1984); a
Toshiba Chair Professorship at Waseda University, Japan (1984); an
Honorary Professorship at the Central China University of Science and
Technology (1994); and a Distinguished Chair Professorship at the Hong
Kong Polytechnic University (2001-2003). His research interests include
VLSI array processors, system modelling and identification, neural
networks, wireless communication, sensor array processing, multimedia
signal processing, bioinformatic data mining and biometric authentication.

Professor Kung has been a Fellow of the IEEE since 1988. He served as a Member of
the Board of Governors of the IEEE Signal Processing Society (1989-1991).
He was a founding member of several Technical Committees (TC) of the IEEE
Signal Processing Society, including VLSI Signal Processing TC (1984),
Neural Networks for Signal Processing TC (1991) and Multimedia Signal
Processing TC (1998), and was appointed the first Associate Editor in the
VLSI area (1984) and later the first Associate Editor in Neural Networks
(1991) for the IEEE Transactions on Signal Processing. He presently serves
on Technical Committees on Multimedia Signal Processing. Since 1990, he
has been the Editor-In-Chief of the Journal of VLSI Signal Processing
Systems.

Professor Kung has co-authored more than 400 technical publications and
numerous textbooks, including "VLSI and Modern Signal Processing", with
Russian translation, Prentice-Hall (1985); "VLSI Array Processors", with
Russian and Chinese translations, Prentice-Hall (1988); "Digital Neural
Networks", Prentice-Hall (1993); "Principal Component Neural Networks",
John Wiley (1996); and "Biometric Authentication: A Machine Learning and
Neural Network Approach", Prentice-Hall (2005).

Professor Kung was a recipient of the IEEE Signal Processing Society's
Technical Achievement Award for his contributions to "parallel processing
and neural network algorithms for signal processing" (1992); a
Distinguished Lecturer of the IEEE Signal Processing Society (1994); a
recipient of the IEEE Signal Processing Society's Best Paper Award for his
publication on principal component neural networks (1996); and a recipient
of the IEEE Third Millennium Medal (2000).