Large-scale Data Mining and its Applications to Information Retrieval
Speaker: Professor Edward Chang, Google Research China
Title: "Large-scale Data Mining and its Applications to Information Retrieval"
Date: Monday, 2 November 2009
Time: 10:30am - 11:30am
Venue: Room 2404 (via lifts 17/18), HKUST

Abstract:
Confucius was a great teacher in ancient China. His theories and principles were effectively spread throughout China by his disciples. Confucius is also the code name of Google's Knowledge Search product, which is built at the Google Beijing lab by my team. In this talk, I present Knowledge Search's key disciples: data management subroutines that generate labels for questions, match existing answers to a question, evaluate the quality of answers, rank users based on their contributions, distill high-quality answers for search engines to index, and so on. The talk presents scalable algorithms that we have developed to make these disciples effective in dealing with huge datasets. Efforts to make these algorithms run even faster on thousands of machines, along with some open research problems, will also be presented.

******************

Biography:
Edward Chang has headed Google Research in China since March 2006. He joined the Department of Electrical & Computer Engineering at the University of California, Santa Barbara, in 1999 after receiving his PhD from Stanford University. Ed received tenure in 2003 and was promoted to full professor of Electrical Engineering in 2006. His recent research activities are in the areas of distributed data mining and its applications to rich-media data management and social-network collaborative filtering. His research group (which consists of members from Google, UC, MIT, Tsinghua, PKU, and Zheda) recently parallelized SVMs (NIPS 07), PLSA (KDD 08), Association Mining (ACM RS 08), Spectral Clustering (ECML 08), and LDA (WWW 09) to run on thousands of machines for mining large-scale datasets (see the MMDS/CIVR keynote slides for details).
Ed has served on ACM (SIGMOD, KDD, MM, CIKM), VLDB, IEEE, WWW, and SIAM conference program committees, and has co-chaired several conferences including MMM, ACM MM, ICDE, and WWW. Ed is a recipient of the IBM Faculty Partnership Award and the NSF CAREER Award.