Probabilistic XML: Survey and Challenges
--------------------------------------------------------------------
The Hong Kong University of Science & Technology
Department of Computer Science and Engineering
Human Language Technology Center
--------------------------------------------------------------------

Speaker: Dr. Pierre SENELLART
         Computer Science and Networking Department
         Télécom ParisTech
         France

Title:   "Probabilistic XML: Survey and Challenges"

Date:    Tuesday, 10 November 2009

Time:    11:00am - 12 noon

Venue:   Room 2404 (via lifts 17/18), HKUST

Abstract:

Many automatic tasks on real-world data, such as information extraction, natural language processing, and data mining, generate imprecise results. Moreover, in many of these tasks, information is represented in a semi-structured way, either because the original data has an inherent tree-like structure or because it is natural to represent derived information or knowledge hierarchically. A number of recent works have dealt with representing uncertain information in XML. We present a unifying model for these works, distinguishing two main classes of frameworks according to whether or not arbitrary probabilistic dependencies are allowed. For these two classes, we discuss expressiveness, query efficiency, and update capabilities. We also go over more recent work on the use of continuous probability distributions. Finally, we aim to provide insight into the important open problems of probabilistic XML by discussing the connection with relational database models, the limitations of existing frameworks, and other topics of interest.

*******************

Biography:

Dr. Pierre Senellart is an Associate Professor in the Computer Science and Networking Department at Télécom ParisTech, the leading French engineering school specializing in information technology. He obtained his M.Sc. (2003) and his Ph.D. (2007) in Computer Science from Université Paris-Sud, studying under the supervision of Serge Abiteboul.
Pierre Senellart has published in internationally renowned conferences and journals (PODS, AAAI, VLDB Journal, etc.). He has been a member of the program committees of ECML/PKDD, WWW, VLDB, and ICDE, and a member of the repeatability committee of SIGMOD, and he has reviewed for various journals, such as VLDB Journal, JCSS, DKE, Information Systems, and Communications of the ACM. His research interests focus on theoretical aspects of database management systems and the World Wide Web, and more specifically on the intensional indexing of the deep Web, probabilistic XML databases, and graph mining. He also has an interest in natural language processing and has been collaborating with SYSTRAN, the leading machine translation company.