Hexastore: Sextuple Indexing for Semantic Web Data Management
Speaker: Dr. Panagiotis Karras, National University of Singapore
Title: "Hexastore: Sextuple Indexing for Semantic Web Data Management"
Date: Friday, 3 October 2008
Time: 2:00 - 3:00pm
Venue: Room 4502 (via lifts 25/26), HKUST

Abstract: Despite the intense interest towards realizing the Semantic Web vision, most existing RDF data management schemes are constrained in terms of efficiency and scalability. Still, the growing popularity of the RDF format arguably calls for an effort to offset these drawbacks. Viewed from a relational database perspective, these constraints are derived from the very nature of the RDF data model, which is based on a triple format. Recent research has attempted to address these constraints using a vertical-partitioning approach, in which separate two-column tables are constructed for each property. However, as we show, this approach suffers from similar scalability drawbacks on queries that are not bound by RDF property value. In this paper, we propose an RDF storage scheme that uses the triple nature of RDF as an asset. This scheme enhances the vertical partitioning idea and takes it to its logical conclusion. RDF data is indexed in six possible ways, one for each possible ordering of the three RDF elements. Each instance of an RDF element is associated with two vectors; each such vector gathers elements of one of the other types, along with lists of the third-type resources attached to each vector element. Hence, a sextuple indexing scheme emerges. This format allows for quick and scalable general-purpose query processing; it confers significant advantages (up to five orders of magnitude) compared to previous approaches for RDF data management, at the price of a worst-case five-fold increase in index space. We experimentally document the advantages of our approach on real-world and synthetic data sets with practical queries.

Biography: Panagiotis Karras is a Lee Kuan Yew Postdoctoral Fellow at the National University of Singapore.
He received a Ph.D. in Computer Science from the University of Hong Kong and an M.Eng. in Electrical and Computer Engineering from the National Technical University of Athens. He has also worked and studied at the University of Zurich, at the Technical University of Denmark, at the Institute of Language and Speech Processing in Athens, at Schlumberger Information Solutions in Oslo, at the University of Karlsruhe, Germany, and at the University of Patras, Greece. His research interests are in the design and analysis of algorithms and data structures for massive data management, data stream algorithms, geometric and spatial data management problems, data anonymization, and indexing methods for semi-structured data. His work has been published in major data engineering and data mining conferences.
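
Illustration: the sextuple indexing idea described in the abstract can be sketched in a few lines of Python. This is only a toy in-memory sketch under our own assumptions, not the speaker's implementation: the class name `Hexastore`, the methods `add` and `match`, and the nested-dictionary layout are all hypothetical. It keeps one index per ordering of subject (s), property (p) and object (o), and answers a triple pattern by choosing the ordering whose leading positions are the bound ones. Unlike the scheme in the abstract, it stores the six indexes independently and makes no attempt at the space savings behind the five-fold bound.

```python
from collections import defaultdict
from itertools import permutations


class Hexastore:
    """Toy sextuple index over RDF-style triples (illustrative only)."""

    # The six orderings of subject, property, object: spo, sop, pso, pos, osp, ops.
    ORDERINGS = ["".join(p) for p in permutations("spo")]

    def __init__(self):
        # index[ordering][first][second] -> set of third elements
        self.index = {
            order: defaultdict(lambda: defaultdict(set))
            for order in self.ORDERINGS
        }

    def add(self, s, p, o):
        """Insert one triple into all six indexes."""
        triple = {"s": s, "p": p, "o": o}
        for order in self.ORDERINGS:
            first, second, third = (triple[k] for k in order)
            self.index[order][first][second].add(third)

    def match(self, s=None, p=None, o=None):
        """Return sorted (s, p, o) triples matching a pattern; None is a wildcard."""
        pattern = {"s": s, "p": p, "o": o}
        bound = [k for k in "spo" if pattern[k] is not None]
        free = [k for k in "spo" if pattern[k] is None]
        # Pick the ordering that puts the bound positions first.
        order = "".join(bound + free)
        level = self.index[order]

        results = []
        firsts = [pattern[order[0]]] if order[0] in bound else list(level)
        for a in firsts:
            if a not in level:  # membership test does not create an entry
                continue
            second_map = level[a]
            seconds = [pattern[order[1]]] if order[1] in bound else list(second_map)
            for b in seconds:
                if b not in second_map:
                    continue
                thirds = second_map[b]
                candidates = [pattern[order[2]]] if order[2] in bound else list(thirds)
                for c in candidates:
                    if c in thirds:
                        found = dict(zip(order, (a, b, c)))
                        results.append((found["s"], found["p"], found["o"]))
        return sorted(results)


if __name__ == "__main__":
    store = Hexastore()
    store.add("alice", "knows", "bob")
    store.add("alice", "likes", "tea")
    store.add("carol", "knows", "bob")
    # A query bound only by object uses the "osp" index, so it never scans by property.
    print(store.match(o="bob"))
```

A query that is not bound by property, such as `store.match(o="bob")`, is the case the abstract singles out as hard for vertical partitioning; here it is answered from the object-first index without visiting per-property tables.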