Abstract
Predictive analysis based on time-series data is gaining importance in real-world applications such as turbine diagnostics, biomedical decision support, and weather forecasting. For instance, the service engineers at Siemens diagnostic centres unveil hidden knowledge in huge amounts of historical sensor data and use this knowledge to improve the predictive systems analysing live data. Currently, the analysis is usually done using data-dependent rules that are specific to individual sensors and equipment. This dependence poses significant challenges for engineers in rule authoring, reuse, and maintenance. One solution to this problem is to employ ontology-based data access (OBDA), which provides a conceptual view of data via an ontology without moving or converting the data itself. The aim of this thesis is to extend the classical OBDA paradigm so as to provide access to time series and carry out temporal reasoning over them.
The first contribution of the thesis is a novel temporal rule-based ontology language to model time series. Our language is the datalog extension MTLdatalog of the Horn fragment of the metric temporal logic MTL. The reason for choosing MTL instead of LTL is that MTL is more compact and better suited to dealing with data sent asynchronously and irregularly by multiple sensors. Although reasoning in full MTL is known to be undecidable, the complexity analysis shows that MTLdatalog enjoys good computational properties. In fact, reasoning in MTLdatalog is ExpSpace-complete in combined complexity and P-hard in data complexity. Moreover, reasoning in non-recursive MTLdatalog is PSpace-complete in combined complexity and in AC0 in data complexity.
The second contribution is a temporal OBDA framework that can incorporate both static and temporal knowledge. The temporal OBDA framework is obtained by enriching the classical OBDA framework with a temporal mapping component to extract information about temporal events, and a set of temporal rules based on MTLdatalog to describe temporal patterns. We also propose a SPARQL-based query language, and show that query answering in the temporal OBDA framework can be reduced to SQL query evaluation.
The third contribution is a practical tool, called Ontop-temporal, implementing the temporal OBDA framework, which is developed as an extension of the state-of-the-art OBDA system Ontop. The query evaluation algorithm of Ontop-temporal is based on the ideas of temporal mapping saturation and query translation.
As the fourth contribution of this thesis, we show the usefulness of temporal OBDA and of the Ontop-temporal system in an example use case, where we model complex medical knowledge and facilitate access to the MIMIC-III critical care unit dataset, which contains log data on hospital admissions, procedures, and diagnoses.
Finally, our performance evaluation confirms that our approach is scalable and can handle large amounts of data.