Feature Selection for Big Data with Trillion Dimensions

Speaker:        Dr. Ivor Tsang
                Nanyang Technological University
                Singapore

Title:          "Feature Selection for Big Data with Trillion Dimensions"

Date:           Monday, 3 March 2014

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater F (near lifts 25/26), HKUST

Abstract:

The world generates quintillions of bytes of data every day, creating a
pressing need for new approaches to the grand challenges posed by big
data. Today, there is consensus among the machine learning and data
mining communities that data volume presents an immediate scalability
challenge. However, when addressing volume in big data analytics,
researchers have largely studied only one side of it: the "big instance
size" factor of the data. The flip side of volume, the dimensionality
factor of big data, has received much less attention. In this talk, I
will present an attempt to fill this gap, with a special focus on the
relatively under-explored topic of ultrahigh dimensionality.
Specifically, I first
reformulate the resultant non-convex problem as a convex semi-infinite
programming (SIP) problem, and then present an efficient feature
generating paradigm to solve it. The proposed feature generating paradigm
is guaranteed to converge globally under mild conditions. In addition, it
can achieve lower feature selection bias than L1-regularized methods. To
speed up training on big data (w.r.t. dataset size),
several speedup strategies are explored under the proposed feature
generating paradigm. Comprehensive experiments on a wide range of
synthetic and real-world datasets with tens of millions of data points
and O(10^14) dimensions demonstrate that the proposed method outperforms
state-of-the-art feature selection methods in both generalization
performance and training efficiency.
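
As a rough illustration of the feature generating idea (not the SIP
formulation presented in the talk), the sketch below grows an active set
of features in small batches and refits only on that set. It swaps in a
plain least-squares refit and a correlation-based score, which are
assumptions made for brevity; the function name, parameters, and toy data
are all hypothetical.

```python
import numpy as np

def greedy_feature_generation(X, y, batch=2, max_iter=10, tol=1e-8):
    """Illustrative sketch only: grow an active feature set by batches.

    Each round scores every feature by |x_j^T r| (r = current residual),
    adds the `batch` highest-scoring inactive features, then refits
    least squares on the active set. Not the method from the talk.
    """
    n, d = X.shape
    active = []                       # indices of selected features
    w = np.zeros(d)
    for _ in range(max_iter):
        residual = y - X @ w
        if residual @ residual <= tol:
            break                     # data already fit; stop generating
        scores = np.abs(X.T @ residual)
        if active:
            scores[active] = -np.inf  # never re-pick an active feature
        new = np.argsort(scores)[-batch:]
        active.extend(int(j) for j in new)
        # Refit on the small active set only; inactive weights stay zero.
        coef, *_ = np.linalg.lstsq(X[:, active], y, rcond=None)
        w = np.zeros(d)
        w[active] = coef
    return sorted(active), w

# Toy data (hypothetical): 3 relevant features out of 50, no noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
true_support = [3, 17, 42]
y = X[:, true_support] @ np.array([2.0, -3.0, 1.5])

selected, w = greedy_feature_generation(X, y, batch=3, max_iter=5)
print("selected features:", selected)
```

Because each refit touches only the active columns, the per-round cost
depends on the size of the active set rather than on the full
dimensionality, which is the property that makes this style of method
attractive when the number of features is very large.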

*****************
Biography:

Ivor W. Tsang will join the Centre for Quantum Computation & Intelligent
Systems (QCIS), University of Technology, Sydney (UTS) as an Australian
Future Fellow and Associate Professor. Before joining UTS, he was the
Deputy Director of the Center for Computational Intelligence, Nanyang
Technological University, Singapore. He received his Ph.D. degree in
computer science from the Hong Kong University of Science and Technology
in 2007. His research focuses on kernel methods, transfer learning,
feature selection, big data analytics for data with millions of
dimensions, and their applications to computer vision and pattern
recognition. He has published more than 100 research papers in refereed
international journals and conference proceedings, including 4 in JMLR, 8
in T-PAMI, 18 in T-NN, and 12 in ICML, as well as papers in venues such
as NIPS, UAI, AISTATS, SIGKDD, IJCAI, AAAI, ICCV, CVPR, and ECCV.

Dr. Tsang received the prestigious Australian Research Council Future
Fellowship in 2013, the IEEE Transactions on Neural Networks Outstanding
2004 Paper Award in 2006, and the second class prize of the National
Natural Science Award, China in 2009. His research also earned him the
Best Student Paper Award at CVPR'10, the Best Paper Award at ICTAI'11, and
the Best Poster Honorable Mention at ACML'12. He was also awarded the
Microsoft Fellowship in 2005.