In general, interoperability can be defined as a measure of the degree to which diverse systems, organizations, and/or individuals are able to work together to achieve a common goal [47]. In this work, semantic interoperability is understood as the ability of computer systems to exchange data with unambiguous, shared meaning. Achieving semantic interoperability is complex because semantic interoperability conflicts need to be reconciled. Semantic interoperability conflicts denote differences in how different and/or equivalent concepts are modeled and expressed [5]. In the following, these conflicts are explained [48].
SIC1 – Structuredness: this interoperability conflict occurs whenever data sources are
described at a different level of structuredness, e.g., structured, semi-structured, and unstructured. Structured data sources are represented using schemas of a particular
data/knowledge model, e.g., the relational data model; all the represented entities are described in terms of a fixed schema and attributes. Semi-structured data sources are also described using a data/knowledge model, e.g., RDF or XML; however, in contrast to structured data, each modeled entity can be represented using different attributes, and a predefined, fixed schema is not required to describe an entity.
SIC2 – Schematic: this interoperability conflict exists among data sources that are modeled
using a different schema. Conflicts include: i) different attributes represent the same concept in different sources; ii) the same concept is modeled using different structures in distinct data sources, e.g., attributes versus classes; iii) different types are used to represent the same concept, e.g., string versus integer; iv) the same concept is described at different levels of specialization/generalization; v) different names are used to model the same concept.
SIC3 – Domain: this interoperability conflict occurs when various interpretations of the same
domain are represented. Different interpretations include: i) Homonym: the same name is used to represent concepts with a different meaning; ii) Synonym: distinct names are used to model the same concept; iii) Acronym: different abbreviations for the same concept are employed.
SIC4 – Representation: this interoperability conflict occurs when different
representations are used to model the same concept. Representation conflicts include: i) different scales or units; ii) different levels of precision; iii) incorrect spellings.
SIC5 – Language: this interoperability conflict occurs whenever different languages are used
to represent the data or the metadata, i.e., the schema.
SIC6 – Granularity: this interoperability conflict appears when various interpretations of
the same domain are represented. Different interpretations include: i) Intra-aggregation: the same data is divided differently, e.g., a full person name versus separate first, middle, and last names; ii) Inter-aggregation: sums or counts are present as added values.
SIC7 – Missing Item: this interoperability conflict occurs whenever different items in distinct
data sources are missing. Missing Item comprises: i) Missing attributes; ii) Missing content.
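To make the categories above more concrete, the following minimal sketch reconciles two hypothetical records that describe the same person in two sources. The record layouts, field names, values, and the target schema are all assumptions made for illustration; they are not taken from [48]. The sketch touches SIC2 (different attribute names and types), SIC4 (different units), SIC6 (a full name versus name parts), and SIC7 (a missing attribute).

```python
import math

# Hypothetical records from two sources describing the same entity.
source_a = {"full_name": "Ada M. Lovelace", "height_cm": "170"}
source_b = {"first": "Ada", "middle": "M.", "last": "Lovelace",
            "height_m": 1.7, "email": "ada@example.org"}


def normalize_a(rec):
    """Map a source A record onto an assumed common schema."""
    return {
        "name": rec["full_name"],
        # SIC2 iii (string vs. number) and SIC4 i (cm vs. m).
        "height_m": int(rec["height_cm"]) / 100,
        # SIC7: the attribute does not exist in source A, so it becomes None.
        "email": rec.get("email"),
    }


def normalize_b(rec):
    """Map a source B record onto the same assumed common schema."""
    parts = [rec.get("first"), rec.get("middle"), rec.get("last")]
    return {
        # SIC6 i (intra-aggregation): join the name parts into one value.
        "name": " ".join(p for p in parts if p),
        "height_m": float(rec["height_m"]),
        "email": rec.get("email"),
    }


def differs(x, y):
    """Compare two values, tolerating floating-point rounding."""
    if isinstance(x, float) and isinstance(y, float):
        return not math.isclose(x, y)
    return x != y


def conflicts(a, b):
    """Return the attributes whose normalized values still disagree."""
    return sorted(k for k in a if differs(a[k], b[k]))


a = normalize_a(source_a)
b = normalize_b(source_b)
print(conflicts(a, b))
```

After normalization, only the attribute that is missing in one source remains in conflict; deciding how to fill it is a separate integration step that this sketch does not address.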
Semantic Interoperability Conflicts in Industry 4.0 Scenarios
In the following, we describe and exemplify some particular semantic interoperability conflicts in I40 scenarios. Three levels of conflicts are identified: i) standardization frameworks; ii) standards; and iii) documents describing a CPS.
Standardization framework related
SIC1 – Structuredness: The description of the standardization frameworks is commonly made
by means of white papers; thus, unstructured sources are used to describe standardization frameworks, their layers, and levels, as well as their relations with standards (cf. [6,7]).
SIC2 – Schematic: The standardization frameworks present schemas for describing functions
Chapter 2 Background and Preliminaries
SIC3 – Domain: The same standards are classified in distinct dimensions and layers in different
standardization frameworks [7].
Standard related
SIC1 – Structuredness: Typically, standards are described in unstructured data sources,
e.g., PDF documents and Excel sheets. The standard IEC 61360, an important data dictionary standard for the electro-technical domain (footnote 13), can be retrieved as an Excel sheet. This standard is typically used in combination with the eCl@ss standard dictionary, which is available as an unstructured document, i.e., HTML or PDF. Thus, the same concepts are described in both standards, but the structure used to represent them differs.
SIC2 – Schematic: The AML and OPC UA standards define different schemas, yet both are employed to
model the same CPS [9,50].
SIC3 – Domain: Homonym: the same term is described with different meanings in different
standards; e.g., Resource is described in ISO 15704 as an enterprise entity that provides some or all of the capabilities required by the execution of an enterprise and/or business process, whereas ISO 10303 describes it as something that may be described in terms of a behaviour, a capability, or a performance measure that is pertinent to a given process.
Acronym: different abbreviations are used to refer to the same standard; e.g., IEC 62541 and OPC UA.
Synonym: distinct names are utilized to express the same meaning; e.g., an InternalElement in AML has the same meaning as an Object in OPC UA.
Semantic Heterogeneity Conflicts in CPS
Biffl et al. [51] and Kovalenko and Euzenat [52] have characterized semantic heterogeneity conflicts in the engineering domain, i.e., CPS-related conflicts. The authors have identified the following types of semantic heterogeneity:
M1 – Value processing: the same entities are not modeled equally; the relation between the values
of two entities can be modeled by a function that takes a value on one side as input and returns a value on the other side as output, for example, when different string values, datatypes, or mathematical functions are used. This is an instantiation of the SIC4 heterogeneity.
M2 – Granularity: the same objects are modeled at different levels of detail. This is an instantiation
of the SIC6 heterogeneity.
M3 – Schematic differences: a divergent way of representing semantics for the same object.
This is an instantiation of the SIC2 heterogeneity.
M4 – Conditional mappings: relations between entities exist only if certain conditions occur.
This can be seen as SIC4.
M5 – Bidirectional mappings: relations between entities have to be defined bidirectionally.
This can be interpreted as SIC4.
M6 – Grouping and aggregation: different semantic modeling criteria are applied to group
elements for the same object. This is an instantiation of the SIC6 heterogeneity.
M7 – Restrictions on values: mandatory values for properties in the object that have to be
handled in the mapping process. This can be seen as SIC4.
13 https://cdd.iec.ch/cdd/iec61360/iec61360.nsf
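The correspondence between the types M1 to M7 and the SIC categories stated in the list above can be collected in a small lookup table. The sketch below does so and adds a hypothetical value-processing mapping (M1) that is defined in both directions (M5) and applied only under a condition (M4). The temperature example, the record layout, and all function names are assumptions made for illustration; they do not come from [51] or [52].

```python
# Correspondence between heterogeneity types and SIC categories,
# exactly as stated in the text for M1 to M7.
M_TO_SIC = {
    "M1": "SIC4", "M2": "SIC6", "M3": "SIC2", "M4": "SIC4",
    "M5": "SIC4", "M6": "SIC6", "M7": "SIC4",
}


def by_sic(table):
    """Group the M types by the SIC category they instantiate."""
    grouped = {}
    for m_type, sic in table.items():
        grouped.setdefault(sic, []).append(m_type)
    return grouped


def celsius_to_kelvin(value):
    """M1: a function relating a value on one side to the other side."""
    return value + 273.15


def kelvin_to_celsius(value):
    """M5: the inverse direction of the same mapping."""
    return value - 273.15


def map_temperature(record):
    """M4: apply the conversion only if the record is given in degC.

    The keys "value" and "unit" are a hypothetical record layout.
    """
    if record.get("unit") == "degC":
        return {"value": celsius_to_kelvin(record["value"]), "unit": "K"}
    return dict(record)  # condition not met: return an unchanged copy


print(by_sic(M_TO_SIC))
print(map_temperature({"value": 20.0, "unit": "degC"}))
```

Grouping the table by category shows that, under this reading, most of the types fall under SIC4 (representation), while none correspond to SIC1, SIC3, SIC5, or SIC7.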
2.3 Data Integration