Integrating relational databases with the Semantic Web
An early vision in Computer Science was to create intelligent systems capable of reasoning over large amounts of data. Independent results in the areas of Description Logic and Relational Databases have advanced us towards this vision. Description Logic research has advanced the understanding of the tradeoff between the computational complexity of reasoning and the expressiveness of logic languages, and now underpins the Semantic Web. The Semantic Web comprises a graph data model (RDF), an ontology language for knowledge representation and reasoning (OWL) and a graph query language (SPARQL). Database research has advanced the theory and practice of data management, embodying features such as views and recursion which are capable of representing reasoning. Despite these independent advances, the interface between Relational Databases and the Semantic Web is poorly understood. This dissertation revisits this vision with respect to current technology and addresses the following question: How and to what extent can Relational Databases be integrated with the Semantic Web? The thesis is that much of the existing Relational Database infrastructure can be reused to support the Semantic Web. Two problems are studied.
Can a Relational Database be automatically virtualized as a Semantic Web data source? This paradigm comprises a single Relational Database. The first contribution is an automatic direct mapping from a Relational Database schema and data to RDF and OWL. The second contribution is a method capable of evaluating SPARQL queries against the Relational Database, per the direct mapping, by exploiting two existing relational query optimizations. These contributions are embodied in a system called Ultrawrap. Empirical analysis consistently shows that SPARQL query execution performance on Ultrawrap is comparable to that of SQL queries written directly for the relational representation of the data. Such results have not been previously achieved.
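To make the idea of a direct mapping concrete, the following is a minimal sketch that turns the rows of one table into RDF-style triples. It is not Ultrawrap's mapping: the base IRI, the `direct_map` helper, and the `person` table are hypothetical, the IRI shapes only loosely follow the W3C Direct Mapping conventions, and foreign keys, IRI percent-encoding, and the OWL side of the mapping are left out.

```python
import sqlite3

BASE = "http://example.org/db/"  # hypothetical base IRI, chosen for illustration
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def direct_map(conn, table, pk):
    """Yield (subject, predicate, object) triples for every row of `table`.

    Assumes a single-column primary key named `pk` and trusted identifiers.
    """
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    for row in cur:
        record = dict(zip(columns, row))
        # One IRI per row, built from the table name and the key value.
        subject = f"{BASE}{table}/{pk}={record[pk]}"
        # Each row is an instance of the class derived from its table.
        yield (subject, RDF_TYPE, f"{BASE}{table}")
        for col, value in record.items():
            if value is not None:  # NULL values produce no triple
                yield (subject, f"{BASE}{table}#{col}", value)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Alice", 30), (2, "Bob", None)],
)
triples = list(direct_map(conn, "person", "id"))
```

This sketch materializes the triples in Python purely for readability. A virtualized data source, as described above, would instead leave the data in the database and translate queries against it.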
Can a Relational Database be mapped to existing Semantic Web ontologies and act as a reasoner? This paradigm comprises an OWL ontology including inheritance and transitivity, a Relational Database, and mappings between the two. The third contribution is a method for Relational Databases to support inheritance and transitivity by compiling the ontology as mappings, implementing the mappings as SQL views, using SQL recursion, and optimizing by materializing a subset of the views. This contribution is implemented in an extension of Ultrawrap. Empirical analysis reveals that Relational Databases are able to act effectively as reasoners.
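The general technique of expressing inheritance and transitivity as SQL views can be sketched as follows. This is an illustration under stated assumptions, not the Ultrawrap implementation: the tables (`student`, `professor`, `located_in`) and views are hypothetical, SQLite stands in for the relational database, and the materialization step is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE professor (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE located_in (child TEXT, parent TEXT);

INSERT INTO student VALUES (1, 'Alice');
INSERT INTO professor VALUES (2, 'Bob');
INSERT INTO located_in VALUES ('Austin', 'Texas'), ('Texas', 'USA');

-- Inheritance: if Student and Professor are subclasses of Person,
-- the Person class is the union of its subclasses' rows.
CREATE VIEW person AS
    SELECT id, name FROM student
    UNION
    SELECT id, name FROM professor;

-- Transitivity: a recursive common table expression computes the
-- transitive closure of located_in. UNION (not UNION ALL) removes
-- duplicates, so the recursion also terminates on cyclic data.
CREATE VIEW located_in_closure AS
    WITH RECURSIVE closure(child, parent) AS (
        SELECT child, parent FROM located_in
        UNION
        SELECT c.child, l.parent
        FROM closure AS c
        JOIN located_in AS l ON c.parent = l.child
    )
    SELECT child, parent FROM closure;
""")

people = sorted(conn.execute("SELECT name FROM person").fetchall())
closure = set(
    conn.execute("SELECT child, parent FROM located_in_closure").fetchall()
)
```

Querying `person` returns both Alice and Bob even though neither is stored in a `person` table, and `located_in_closure` contains the inferred pair `('Austin', 'USA')` alongside the two stored pairs. In both cases the inference is carried out by the database's own query engine.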