A more expressive format is offered as EDOAL.
A format for ontology alignment
The Alignment API uses a general Alignment format. Its
goal is to be able to express an alignment in a consensual
format. It can then be manipulated by various tools which will use
it as input for further alignment methods, transform it into axioms
or transformations or compare different alignments.
This is a first format that could be extended to accommodate
further needs. The Alignment API offers the Expressive and
Declarative Ontology Alignment Language (EDOAL) for more
elaborate uses.
We describe below its source descriptions, its specifications and
some implementations.
Specifications
The Alignment format was initially described as an XML format. It was
given a DTD. It has since been transformed into an RDF format and
given a corresponding OWL ontology. These are currently obsolete due
to the introduction of the EDOAL format.
The namespace used by these formats is http://knowledgeweb.semanticweb.org/heterogeneity/alignment#.
Format description
Alignment element
The Alignment element describes a particular alignment. Its
attributes are the following:
- xml
- (value: "yes"/"no") indicates if the alignment can be
read as an XML file compliant with the DTD;
- level
- (values: "0", "1", "2EDOAL") the level of
alignment, characterising its type;
- type
- (values:
"11"/"1?"/"1+"/"1*"/"?1"/"??"/"?+"/"?*"/"+1"/"+?"/"++"/"+*"/"*1"/"*?"/"*+"/"**";
default "11") the type or arity of the alignment. The usual notations are 1:1, 1:m, n:1 or n:m. We prefer to record whether the mapping is injective, surjective, and total or partial on both sides.
This yields more alignment arities, written with 1 for injective and total, ? for injective, + for total and * for none, each sign concerning one mapping and its converse;
- onto1
- (value: Ontology) the first aligned ontology;
- onto2
- (value: Ontology) the second aligned ontology;
- map
- (value: Cell) a correspondence between
entities of the ontologies.
<?xml version='1.0' encoding='utf-8' standalone='no'?>
<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:xsd='http://www.w3.org/2001/XMLSchema#'
xmlns:align='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'>
<Alignment>
<xml>yes</xml>
<level>0</level>
<type>**</type>
<align:method>fr.inrialpes.exmo.align.impl.method.StringDistAlignment</align:method>
<align:time>7</align:time>
<onto1>... </onto1>
<onto2>... </onto2>
...
</Alignment>
</rdf:RDF>
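The header values and the arity notation described above can be read with a few lines of Python. This is only a minimal sketch using the standard xml.etree.ElementTree module, not the Alignment API parser; the names read_header, decode_type, ARITY_SIGNS and SAMPLE are hypothetical, and the sample document is the example above with its elided parts left out.

```python
import xml.etree.ElementTree as ET

ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"

# Each arity sign maps to (injective, total), following the notation above.
ARITY_SIGNS = {
    "1": (True, True),    # injective and total
    "?": (True, False),   # injective
    "+": (False, True),   # total
    "*": (False, False),  # none
}

# Hypothetical sample: the header of the example above, without its elided parts.
SAMPLE = """<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <Alignment>
    <xml>yes</xml>
    <level>0</level>
    <type>**</type>
  </Alignment>
</rdf:RDF>"""


def decode_type(type_code):
    """Return one (injective, total) pair per sign of a two-sign arity."""
    if len(type_code) != 2 or any(sign not in ARITY_SIGNS for sign in type_code):
        raise ValueError(f"not a valid alignment type: {type_code!r}")
    return ARITY_SIGNS[type_code[0]], ARITY_SIGNS[type_code[1]]


def read_header(xml_text):
    """Extract the xml, level and type values of the first Alignment element."""
    root = ET.fromstring(xml_text)
    alignment = root.find(f"{{{ALIGN}}}Alignment")
    if alignment is None:
        raise ValueError("no Alignment element found")
    return {
        "xml": alignment.findtext(f"{{{ALIGN}}}xml"),
        "level": alignment.findtext(f"{{{ALIGN}}}level"),
        # The type defaults to "11" when it is absent.
        "type": alignment.findtext(f"{{{ALIGN}}}type", "11"),
    }


header = read_header(SAMPLE)
print(header, decode_type(header["type"]))
```

Element names are looked up with their full namespace because the example declares the Alignment namespace as the default one.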
Ontology element
Ontology elements provide information concerning the matched
ontologies. Each contains three attributes:
- rdf:about
- contains the URI identifying the ontology;
- location
- contains the URL corresponding to a location
where the ontology may be found;
- formalism
- describes the language in which the ontology is
expressed through its name and URI.
<Ontology rdf:about="http://www.example.org/ontology2">
<location>file:examples/rdf/onto2.owl</location>
<formalism>
<Formalism align:name="OWL1.0" align:uri="http://www.w3.org/2002/07/owl#"/>
</formalism>
</Ontology>
A lighter form of the onto1 and onto2 values is
still correctly parsed but its use is discouraged.
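As an illustration, the three attributes of an Ontology element can be extracted as follows. This is a hedged sketch with the Python standard library only; read_ontology and SAMPLE are hypothetical names, and the sample repeats the element above with the namespace declarations it needs to stand alone. The lighter form of onto1 and onto2 is not handled.

```python
import xml.etree.ElementTree as ET

ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The Ontology element shown above, made standalone by declaring its namespaces.
SAMPLE = """<Ontology xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
          xmlns:align='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
          xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
          rdf:about='http://www.example.org/ontology2'>
  <location>file:examples/rdf/onto2.owl</location>
  <formalism>
    <Formalism align:name='OWL1.0' align:uri='http://www.w3.org/2002/07/owl#'/>
  </formalism>
</Ontology>"""


def read_ontology(element):
    """Return the URI, location and formalism described by an Ontology element."""
    formalism = element.find(f"{{{ALIGN}}}formalism/{{{ALIGN}}}Formalism")
    return {
        "uri": element.get(f"{{{RDF}}}about"),
        "location": element.findtext(f"{{{ALIGN}}}location"),
        # The formalism is reported as None when the element is missing.
        "formalism_name": None if formalism is None else formalism.get(f"{{{ALIGN}}}name"),
        "formalism_uri": None if formalism is None else formalism.get(f"{{{ALIGN}}}uri"),
    }


ontology = read_ontology(ET.fromstring(SAMPLE))
print(ontology)
```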
Cell element
As a first approximation, an alignment is a set of pairs of entities
from each ontology. Each such pair, called a correspondence, is
identified by the Cell element in alignments. A cell has the following attributes:
- rdf:about
- (value: URI; optional) an identifier for the cell;
- entity1
- (value: URI or edoal:Expression) the first aligned ontology entity;
- entity2
- (value: URI or edoal:Expression) the second
aligned ontology entity;
- relation
- (value: String; default: =) the
relation holding between the two entities. It is not restricted to
the equivalence relation, but can be more sophisticated (see below);
- measure
- (value: float between 0. and 1., default: 1.) the confidence
that the relation holds between the first and
the second entity. Since many matching methods compute a strength
of the relation between entities, this strength can be provided as
a normalised measure. The measure should belong to an ordered set M including a maximum
element ⊤ and a minimum element ⊥. Currently, we restrict
this value to a float between 0. and 1. If found useful,
this could be generalised to any lattice domain.
<map>
<Cell>
<entity1 rdf:resource='http://www.example.org/ontology1#reviewedarticle'/>
<entity2 rdf:resource='http://www.example.org/ontology2#journalarticle'/>
<relation>fr.inrialpes.exmo.align.impl.rel.EquivRelation</relation>
<measure rdf:datatype='http://www.w3.org/2001/XMLSchema#float'>0.4666666666666667</measure>
</Cell>
</map>
<map>
<Cell rdf:about="#veryImportantCell">
<entity1 rdf:resource='http://www.example.org/ontology1#journalarticle'/>
<entity2 rdf:resource='http://www.example.org/ontology2#journalarticle'/>
<relation>=</relation>
<measure rdf:datatype='http://www.w3.org/2001/XMLSchema#float'>1.0</measure>
</Cell>
</map>
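The two cells above can be read into plain records as sketched below. This assumes level 0 alignments, where both entities are URIs given through rdf:resource, and applies the defaults stated above (relation =, measure 1.). read_cells and SAMPLE are hypothetical names; the wrapper around the two map elements is added only so that the sample parses on its own.

```python
import xml.etree.ElementTree as ET

ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The two cells shown above, wrapped so that the sample is a complete document.
SAMPLE = """<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
<Alignment>
<map>
<Cell>
<entity1 rdf:resource='http://www.example.org/ontology1#reviewedarticle'/>
<entity2 rdf:resource='http://www.example.org/ontology2#journalarticle'/>
<relation>fr.inrialpes.exmo.align.impl.rel.EquivRelation</relation>
<measure rdf:datatype='http://www.w3.org/2001/XMLSchema#float'>0.4666666666666667</measure>
</Cell>
</map>
<map>
<Cell rdf:about="#veryImportantCell">
<entity1 rdf:resource='http://www.example.org/ontology1#journalarticle'/>
<entity2 rdf:resource='http://www.example.org/ontology2#journalarticle'/>
<relation>=</relation>
<measure rdf:datatype='http://www.w3.org/2001/XMLSchema#float'>1.0</measure>
</Cell>
</map>
</Alignment>
</rdf:RDF>"""


def entity_uri(cell, name):
    """Return the rdf:resource of an entity element, or None if it is absent."""
    entity = cell.find(f"{{{ALIGN}}}{name}")
    return None if entity is None else entity.get(f"{{{RDF}}}resource")


def read_cells(xml_text):
    """Return one dictionary per Cell, applying the documented defaults."""
    cells = []
    for cell in ET.fromstring(xml_text).iter(f"{{{ALIGN}}}Cell"):
        measure_text = cell.findtext(f"{{{ALIGN}}}measure")
        measure = float(measure_text) if measure_text else 1.0
        if not 0.0 <= measure <= 1.0:
            raise ValueError(f"measure out of range: {measure}")
        cells.append({
            "id": cell.get(f"{{{RDF}}}about"),  # optional identifier
            "entity1": entity_uri(cell, "entity1"),
            "entity2": entity_uri(cell, "entity2"),
            "relation": cell.findtext(f"{{{ALIGN}}}relation", "="),
            "measure": measure,
        })
    return cells


cells = read_cells(SAMPLE)
for cell in cells:
    print(cell)
```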
Relation element
The relation element only contains the name identifying a relation
between ontology entities. This relation may be given:
- through a symbol: > (subsumes), < (is subsumed),
= (equivalent), % (incompatible), HasInstance, InstanceOf;
- through the fully qualified class name of the relation
implementation. If this class is available
in the Java environment, then the relation will be an instance
of this class.
Hence,
<relation>=</relation>
is equivalent to:
<relation>fr.inrialpes.exmo.align.impl.rel.EquivRelation</relation>
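A tool reading the format outside the Java environment may want to bring both notations to a single form. The sketch below is one possible way to do so; the function normalise_relation and both tables are hypothetical, and the only class name included is the one given above, since the other class names are not listed here.

```python
# Symbols listed above, with the meaning given for each.
SYMBOLS = {
    ">": "subsumes",
    "<": "is subsumed",
    "=": "equivalent",
    "%": "incompatible",
    "HasInstance": "HasInstance",
    "InstanceOf": "InstanceOf",
}

# Only the class name documented above; other entries would be assumptions.
CLASS_TO_SYMBOL = {
    "fr.inrialpes.exmo.align.impl.rel.EquivRelation": "=",
}


def normalise_relation(value):
    """Map a relation value to its symbol when known, else return it unchanged."""
    value = value.strip()
    if value in SYMBOLS:
        return value
    # Unknown class names are kept as they are rather than guessed.
    return CLASS_TO_SYMBOL.get(value, value)


print(normalise_relation("fr.inrialpes.exmo.align.impl.rel.EquivRelation"))
print(SYMBOLS[normalise_relation("=")])
```

Note that in an XML file the < symbol has to be written as &lt;, which an XML parser turns back into < before such a function sees it.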
Metadata (extensions)
So far, alignments contain information about:
- the kind of alignment it is (1:1 or n:m for instance);
- the algorithm that provided it (or if it has been provided by hand);
- the language level used in the alignment (level 0 for the first example, level 2Horn for the second one);
- the confidence value in each correspondence.
The format as implemented here supports extensions both on
Alignments and on Cells. Extensions are additional string-valued
qualified attributes added to cells and alignments. They are
preserved by the implementation. These extensions allow
adding metadata to the alignment.
These attributes must belong to a different namespace than the
Alignment format namespace. Otherwise, errors will be raised.
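The namespace rule can be illustrated with a short Python sketch that collects the extensions of an Alignment or Cell element. It assumes that, in RDF/XML, extensions are serialised as string-valued child elements; the ext namespace, the reviewer attribute and the function read_extensions are hypothetical and serve only as an example.

```python
import xml.etree.ElementTree as ET

ALIGN = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical sample: the ext namespace and its reviewer attribute are invented.
SAMPLE = """<rdf:RDF xmlns='http://knowledgeweb.semanticweb.org/heterogeneity/alignment#'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
         xmlns:ext='http://www.example.org/ext#'>
  <Alignment>
    <level>0</level>
    <type>**</type>
    <ext:reviewer>alice</ext:reviewer>
  </Alignment>
</rdf:RDF>"""


def read_extensions(element):
    """Collect string-valued children whose namespace is neither Alignment nor RDF."""
    extensions = {}
    for child in element:
        if not isinstance(child.tag, str) or not child.tag.startswith("{"):
            continue  # skip unqualified names: extensions must be qualified
        namespace, _, local_name = child.tag[1:].partition("}")
        if namespace in (ALIGN, RDF):
            continue  # format elements, not extensions
        if len(child) == 0:  # keep only string-valued elements
            extensions[(namespace, local_name)] = (child.text or "").strip()
    return extensions


alignment = ET.fromstring(SAMPLE).find(f"{{{ALIGN}}}Alignment")
extensions = read_extensions(alignment)
print(extensions)
```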
Other valuable information that may be added to the alignment format includes:
- the parameters passed to the generating algorithm;
- the properties satisfied by the correspondences (and their proof if necessary);
- the certificate from an issuing source;
- the limitations of the use of the alignment;
- the arguments in favour or against a correspondence, etc.
Many standard extensions have already been defined and are
documented.
Levels
In order to be able to evolve, the Alignment format is provided on
several levels, which depend on more elaborate alignment definitions.
So far, the identified levels are:
- 0
- is reserved for alignments in which matched entities
are identified by URIs. This corresponds to the alignment
presented here.
- 1
- was intended for alignments in which correspondences
match sets of entities identified by URIs. This has never been
used.
- 2
- is used for more structured entities that may be
represented in RDF/XML. Since the structure of entities must be
further identified, it is advised to use a qualified level
name such as 2EDOAL. EDOAL mandates
level 2 alignments.
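A level value can be split into its base level and an optional qualifier. The following minimal sketch assumes that a qualified name is the base digit followed by a free-form suffix, as in 2EDOAL; split_level is a hypothetical helper, not part of the Alignment API.

```python
def split_level(level):
    """Split a level value such as '0' or '2EDOAL' into (base level, qualifier)."""
    level = level.strip()
    if not level or level[0] not in "012":
        raise ValueError(f"unknown alignment level: {level!r}")
    # The qualifier is whatever follows the base digit, or None if nothing does.
    return int(level[0]), level[1:] or None


print(split_level("0"))
print(split_level("2EDOAL"))
```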
Java implementation
The Alignment API implements this format.
In particular it provides tools for:
- Outputting the RDF/XML format from the API, through
the RDFRendererVisitor renderer;
- Parsing the RDF/XML format into the API, through
the AlignmentParser parser.
The AlignmentParser is itself made of an XMLParser
based on SAX and an RDFParser based on Jena. They are tried
in sequence, starting with the XMLParser.
There is a command that parses an alignment and
displays it ($CWD is the directory where you are):
$ java -jar lib/procalign.jar file://$CWD/rdf/onto1.owl file://$CWD/rdf/onto2.owl