Abstract
We show how JSON documents can be abstracted as concept descriptions in an appropriate description
logic. This representation allows the use of additional background knowledge in the form of a TBox and
an assignment of referring expression types (RETs) to certain primitive concepts to detect situations in
which subdocuments, perhaps multiple subdocuments located in various parts of the original documents,
capture information about a particular conceptual entity. Detecting such situations allows the JSON
document to be normalized into several separate documents, each of which captures all information about
one such conceptual entity. This transformation preserves all the original information present in the
input documents. The RET assignment contributes a set of possible concept descriptions that enable a more
refined and normalized capture of documents and yield query answers that are better crafted and adhere to
user expectations expressed as RETs. We also show how RETs allow checking a document admissibility
condition, which ensures that each document describes a single conceptual entity.