Discriminative Signature Mining - Solving an NP-hard problem by Divide-Conquer Search, with applications to finance, intrusion detection, and bioinformatics

Speaker:	Dr. Wei FAN
		IBM T.J.Watson Research

Title:		"Discriminative Signature Mining - Solving an NP-hard
		 problem by Divide-Conquer Search, with applications
		 to finance, intrusion detection, and bioinformatics"

Date:		Wednesday, 28 October, 2009

Time:		11:00am - 12 noon

Venue:		Room 3408 (via lifts 17/18), HKUST

Abstract:

Discriminative signature mining, or feature construction from raw data
(unstructured, semi-structured, etc.), is important because
state-of-the-art learning techniques take feature vectors as input.
Without a feature vector, there is essentially nothing to model. We
discuss how to use frequent patterns (itemsets, sequences, graphs, etc.)
to mine discriminative signatures from different forms of raw data, such
as financial transactions, graph databases, DNA sequences, and intrusion
detection sequences. We explain why discriminative signature mining from
frequent patterns is an NP-hard problem, and present an efficient and
accurate solution based on divide-and-conquer search that can find highly
discriminative and generalizable patterns that existing approaches cannot
mine. We also discuss its applications in finance, intrusion detection,
bioinformatics, and pharmaceuticals.
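
As a rough illustration of the idea, the sketch below shows one plausible
reading of divide-and-conquer search over frequent patterns. It is not the
speaker's algorithm. It assumes itemset data, a brute-force miner limited
to short itemsets, and information gain as the discriminative measure. All
function names, parameters, and the toy data are hypothetical.

```python
from itertools import combinations
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum(c / n * log2(c / n) for c in counts.values())


def frequent_itemsets(rows, min_support, max_len=2):
    """Brute-force miner: itemsets whose relative support in `rows`
    is at least `min_support` (assumption: short itemsets only)."""
    items = sorted({i for t, _ in rows for i in t})
    found = []
    for k in range(1, max_len + 1):
        for cand in combinations(items, k):
            s = frozenset(cand)
            support = sum(1 for t, _ in rows if s <= t) / len(rows)
            if support >= min_support:
                found.append(s)
    return found


def info_gain(rows, pattern):
    """Information gain of splitting `rows` on presence of `pattern`."""
    labels = [y for _, y in rows]
    hit = [y for t, y in rows if pattern <= t]
    miss = [y for t, y in rows if not pattern <= t]
    n = len(rows)
    return (entropy(labels)
            - len(hit) / n * entropy(hit)
            - len(miss) / n * entropy(miss))


def mine_signatures(rows, min_support=0.5, min_rows=2):
    """Divide and conquer: pick the most discriminative frequent pattern
    on the current partition, split on it, and recurse on both halves.
    Support is measured relative to each partition, so patterns that are
    rare globally can still be reached deeper in the recursion."""
    if len(rows) < min_rows or len({y for _, y in rows}) < 2:
        return []  # too small or already pure: nothing left to separate
    candidates = frequent_itemsets(rows, min_support)
    if not candidates:
        return []
    best = max(candidates, key=lambda p: info_gain(rows, p))
    if info_gain(rows, best) <= 1e-12:
        return []  # no candidate separates the classes
    hit = [(t, y) for t, y in rows if best <= t]
    miss = [(t, y) for t, y in rows if not best <= t]
    return ([best]
            + mine_signatures(hit, min_support, min_rows)
            + mine_signatures(miss, min_support, min_rows))


# Toy labeled transactions (hypothetical data): (itemset, class label).
rows = [
    (frozenset({"a", "b"}), 1),
    (frozenset({"a", "b", "c"}), 1),
    (frozenset({"a", "c"}), 0),
    (frozenset({"b", "c"}), 0),
    (frozenset({"c"}), 0),
    (frozenset({"a", "b", "d"}), 1),
]
signatures = mine_signatures(rows)
```

On this toy data, neither item "a" nor item "b" separates the classes
alone, but the itemset {a, b} does, so it is the only signature returned.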


**********************
Biography:

Wei FAN received his PhD in Computer Science from Columbia University in
2001 and has been working at IBM T.J.Watson Research since 2000. His main
research interests and experience are in various areas of data mining and
database systems, such as risk analysis, high performance computing,
extremely skewed distributions, cost-sensitive learning, data streams,
ensemble methods, easy-to-use nonparametric methods, graph mining,
predictive feature discovery, feature selection, sample selection bias,
transfer learning, novel applications, and commercial data mining systems.
More information can be found at http://www.cs.columbia.edu/~wfan