Discriminative Signature Mining - Solving an NP-hard problem by Divide-Conquer Search, with applications to finance, intrusion detection, and bioinformatics
Speaker: Dr. Wei FAN, IBM T.J. Watson Research
Title: "Discriminative Signature Mining - Solving an NP-hard problem by Divide-Conquer Search, with applications to finance, intrusion detection, and bioinformatics"
Date: Wednesday, 28 October, 2009
Time: 11:00am - 12 noon
Venue: Room 3408 (via lifts 17/18), HKUST

Abstract:

Discriminative signature mining, or feature construction from raw data (unstructured, semi-structured, etc.), is important because state-of-the-art learning techniques take data in feature vector format as input. If there is no feature vector, there is basically nothing to model. We discuss how to use frequent patterns (itemsets, sequences, graphs, etc.) to mine discriminative signatures from different forms of raw data (for example, financial transactions, graph databases, DNA sequences, and intrusion detection sequences). We discuss why discriminative signature mining from frequent patterns is an NP-hard problem. We then discuss an efficient and accurate solution based on divide-conquer search that can find highly discriminative and generalizable patterns that existing approaches cannot mine. Finally, we discuss its applications in finance, intrusion detection, bioinformatics, and pharmaceuticals.

**********************

Biography:

Wei FAN received his PhD in Computer Science from Columbia University in 2001 and has been working at IBM T.J. Watson Research since 2000. His main research interests and experience are in various areas of data mining and database systems, such as risk analysis, high performance computing, extremely skewed distributions, cost-sensitive learning, data streams, ensemble methods, easy-to-use nonparametric methods, graph mining, predictive feature discovery, feature selection, sample selection bias, transfer learning, novel applications, and commercial data mining systems. More information can be found at http://www.cs.columbia.edu/~wfan