Abstract:
Ontology mediation is the process of establishing a common ground for interoperability
between domain ontologies. Ontology mapping is the task of identifying concept and
attribute correspondences between ontologies through a matching process. Ontology
mediation and mapping enable ontologists to borrow and reuse rich schema definitions
from existing domain ontologies that have already been developed by other ontologists.
For example, a white wine distributor could maintain a white wine ontology that only has
white wine concepts. This distributor may then decide at some point in the future to
include other wine classifications in the current ontology as well. Instead of creating red
wine or dessert wine concepts in the existing ontology, the distributor could simply borrow
these concepts from existing red wine and dessert wine ontologies. In such cases, ontology
mapping becomes necessary.
Matching ontology schemas is, in current practice, labor-intensive. Although
semi-automated systems have been introduced, they are based on syntactic matching
algorithms, which do not produce reliable results. My thesis statement is therefore that a
hybrid approach, the Semantic Relatedness Score (SRS), which combines semantic
and syntactic matching algorithms, provides greater reliability and precision
than purely syntactic matching algorithms.
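The idea of combining a syntactic and a semantic measure into one score can be illustrated with a minimal Python sketch. It is not the SRS formula itself: the equal weighting, the function names and the toy semantic lookup table (a stand-in for a WordNet- or LSA-based measure) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a semantic measure (e.g. WordNet- or LSA-based).
# The values are invented for illustration only.
TOY_SEMANTIC = {
    frozenset({"car", "automobile"}): 0.95,
    frozenset({"wine", "beverage"}): 0.70,
}


def syntactic_similarity(a: str, b: str) -> float:
    """String-level similarity in [0, 1] from difflib's matching ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a: str, b: str) -> float:
    """Look up a toy semantic score; identical labels score 1.0, unknown pairs 0.0."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    return TOY_SEMANTIC.get(frozenset({a, b}), 0.0)


def hybrid_score(a: str, b: str, w_semantic: float = 0.5) -> float:
    """Weighted combination of the semantic and syntactic scores (assumed weighting)."""
    if not 0.0 <= w_semantic <= 1.0:
        raise ValueError("w_semantic must lie in [0, 1]")
    return (w_semantic * semantic_similarity(a, b)
            + (1.0 - w_semantic) * syntactic_similarity(a, b))
```

In this sketch, a pair such as "car" and "automobile" shares almost no characters, so a purely syntactic measure rates it low, while the hybrid score is raised by the semantic component.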
This research validates that SRS provides higher precision and relevance than the
syntactic matching techniques that have been used previously. SRS was developed
by rigorously testing thirteen well-established matching algorithms
and choosing a composite measure built from the best combination of five of those thirteen
measures. This thesis also takes an end-to-end approach by providing a framework,
process methodology and architecture for ontology mediation.
Since implementing a fully automated system without any human intervention would not
be a realistic goal, a semi-automated approach is undertaken in this thesis. In this
approach, an ontologist is assisted by a mapping system which selects the best candidates
to be matched from the source and target ontologies using SRS. The goal was not only to
reduce the workload of the ontologist, but also to provide reliable results. A literature
survey of current ontology mediation research initiatives such as InfoSleuth, XMapper,
ONION, FOAM, FCA-Merge, KRAFT, CHIMERA, PROMPT and OBSERVER, among
others, revealed that the state of the art in ontology mediation is based largely on
syntactic schema matching that supports binary (1:1) schema matches only.
A generic solution for schema matching based on SRS is presented in this thesis to
overcome these limitations. A similarity matrix for concept similarity measures is
introduced based on several cognitive and quantitative techniques such as computational
linguistics, Latent Semantic Analysis (LSA), distance vectors and lexical databases
(WordNet). The six-part matching algorithm is used to analyze RDF, OWL and XML
schemas and to provide similarity scores, which are then used to populate a similarity
matrix. The contribution here is twofold. First, this approach gives a composite
similarity metric and supports complex mappings (1:n, 1:m, m:1 and n:m). Second,
it provides higher relevance, reliability and precision.
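As a minimal sketch of how a similarity matrix can support complex mappings, the Python fragment below scores every source concept against every target concept and keeps all pairs above a threshold, so that one source concept may map to several targets. The function names, the threshold value and the placeholder string scorer are assumptions for illustration; the thesis populates the matrix with SRS scores instead.

```python
from difflib import SequenceMatcher


def placeholder_score(a: str, b: str) -> float:
    """Placeholder scorer (plain string ratio); SRS would be used here instead."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def similarity_matrix(source, target, score=placeholder_score):
    """Return a len(source) x len(target) matrix of pairwise similarity scores."""
    return [[score(s, t) for t in target] for s in source]


def candidate_mappings(source, target, matrix, threshold=0.6):
    """Map each source concept to every target concept scoring at or above threshold.

    Keeping all qualifying targets, rather than only the best one, is what
    allows 1:n mappings; reading the result per target gives m:1 and n:m cases.
    """
    mappings = {}
    for i, s in enumerate(source):
        matches = [target[j] for j, value in enumerate(matrix[i]) if value >= threshold]
        if matches:
            mappings[s] = matches
    return mappings


if __name__ == "__main__":
    src = ["FullName", "Vintage"]
    tgt = ["FirstName", "LastName", "Year"]
    m = similarity_matrix(src, tgt)
    print(candidate_mappings(src, tgt, m, threshold=0.5))
```

The candidates produced this way are suggestions for the ontologist to confirm or reject, in keeping with the semi-automated approach.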
This approach is validated by comparing SRS results with those of
human domain experts. Empirical evidence provided in this document clearly shows that
the hybrid method resulted in a higher correlation, better relevance and more reliable
results than purely syntactic matching systems. Predefined Semantic Web Rule Language
(SWRL) rules are also introduced to concatenate attributes, discover new relations and
enforce the assertion box (ABox) instances.
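The comparison between algorithm scores and human expert ratings mentioned above can be expressed as a correlation coefficient. The sketch below computes a Pearson correlation in plain Python; the choice of Pearson's coefficient and the function name are assumptions, since the abstract does not state which correlation measure was used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Given one list of matcher scores and one list of expert ratings for the same concept pairs, a value closer to 1 indicates closer agreement with the experts.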
Reasoning measures for consistency, coherence, ontology classification and inference are
also introduced. An actual implementation of this framework and process methodology
for the mapping of security policy ontologies (SPRO) is provided as a case study.
Another case study on achieving interoperability for e-government services with SWRL
rules is also presented. Both SRS and SWRL rules are highlighted in this document as
being complementary measures for the process of semantic bridging. Several tools were
used in a proof-of-concept implementation of the methodology, including
Protégé, Racer Pro, Rice and PROMPT.