Mastro-i: Efficient integration of relational data through DL ontologies
De Giacomo, G
The goal of data integration is to provide uniform access to a set of heterogeneous data sources, freeing the user from the need to know where the data are, how they are stored, and how they can be accessed. One of the outcomes of the research carried out on data integration in recent years is a clear conceptual architecture, comprising a global schema, the source schema, and the mapping between the source schema and the global schema. In this paper, we present a comprehensive approach to, and a complete system for, ontology-based data integration. In this system, called Mastro-i, the global schema is expressed as a TBox of the tractable Description Logic DL-Lite_A, the sources are relations, and the mapping language allows for expressing sound GAV (global-as-view) mappings between the sources and the global schema. The mapping language has specific mechanisms for addressing the so-called impedance mismatch problem, which arises from the fact that, while the data sources store values, the instances of concepts in the ontology are objects. Since data integration often involves large amounts of data, the data complexity of query answering (i.e., its complexity with respect to the size of the source data) is of particular interest. By virtue of the careful design of the various languages used in Mastro-i, our system is able to answer unions of conjunctive queries through a very efficient technique (LOGSPACE with respect to data complexity) that reduces this task to standard SQL query evaluation. We also show that even very slight extensions of the expressive power of Mastro-i lead beyond this complexity bound.
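To make the reduction to SQL more concrete, the following is a minimal sketch of the general idea, not the actual Mastro-i implementation. It assumes a hypothetical TBox containing the single inclusion Manager ⊑ Employee, two hypothetical source relations `emp` and `mgr`, and GAV mappings that build object terms of the form `pers(ssn)` from stored values (a simple stand-in for the impedance mismatch mechanism). All names are illustrative assumptions; only atomic concept queries are handled, whereas the system described above answers unions of conjunctive queries.

```python
import sqlite3

# Hypothetical TBox: atomic concept inclusions (sub, sup), read as sub ⊑ sup.
TBOX = [("Manager", "Employee")]

# Hypothetical GAV mappings: each ontology concept is defined by an SQL query
# over the sources. The 'pers(...)' string stands in for an object term built
# from a stored value, since sources hold values and the ontology holds objects.
MAPPINGS = {
    "Employee": "SELECT 'pers(' || ssn || ')' AS obj FROM emp",
    "Manager": "SELECT 'pers(' || ssn || ')' AS obj FROM mgr",
}


def subsumees(concept):
    """Return the concept together with every concept the TBox places below it."""
    found = {concept}
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for sub, sup in TBOX:
            if sup == current and sub not in found:
                found.add(sub)
                frontier.append(sub)
    return found


def rewrite_to_sql(concept):
    """Rewrite the query q(x) <- concept(x) into a UNION of mapped SQL queries."""
    parts = [MAPPINGS[c] for c in sorted(subsumees(concept)) if c in MAPPINGS]
    return " UNION ".join(parts)


def answer(conn, concept):
    """Evaluate the rewritten query with the ordinary SQL engine."""
    return sorted(row[0] for row in conn.execute(rewrite_to_sql(concept)))


def demo():
    # Illustrative source data; the tables and rows are assumptions.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE emp (ssn TEXT, name TEXT);
        CREATE TABLE mgr (ssn TEXT, dept TEXT);
        INSERT INTO emp VALUES ('111', 'Ada');
        INSERT INTO mgr VALUES ('222', 'R&D');
    """)
    return answer(conn, "Employee")
```

In this sketch, `demo()` returns both `pers(111)` and `pers(222)`: the manager stored only in `mgr` is retrieved as an employee because the TBox inclusion is compiled into the SQL query, so no reasoning is performed over the data itself.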