Please refer to the errata for this document, which may include some normative corrections.
See also translations.
Copyright © 2012 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice. R2RML mappings are themselves RDF graphs and written down in Turtle syntax. R2RML enables different types of mapping implementations. Processors could, for example, offer a virtual SPARQL endpoint over the mapped relational data, or generate RDF dumps, or offer a Linked Data interface.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was published by the RDB2RDF Working Group. Comments on this document should be sent to public-rdb2rdf-comments@w3.org, a mailing list with a public archive. The following related documents have been made available:
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This specification describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice.
This specification has a companion that defines a direct mapping from relational databases to RDF [DM]. In the direct mapping of a database, the structure of the resulting RDF graph directly reflects the structure of the database, the target RDF vocabulary directly reflects the names of database schema elements, and neither structure nor target vocabulary can be changed. With R2RML on the other hand, a mapping author can define highly customized views over the relational data.
Every R2RML mapping is tailored to a specific database schema and target vocabulary. The input to an R2RML mapping is a relational database that conforms to that schema. The output is an RDF dataset [SPARQL], as defined in SPARQL, that uses predicates and types from the target vocabulary. The mapping is conceptual; R2RML processors are free to materialize the output data, or to offer virtual access through an interface that queries the underlying database, or to offer any other means of providing access to the output RDF dataset.
R2RML mappings are themselves expressed as RDF graphs and written down in Turtle syntax [TURTLE].
The intended audience of this specification is implementors of software that generates or processes R2RML mapping documents, as well as mapping authors looking for a reference to the R2RML language constructs. The document uses concepts from RDF Concepts and Abstract Syntax [RDF] and from the SQL language specifications [SQL1][SQL2]. A reader's familiarity with the contents of these documents, as well as with the Turtle syntax, is assumed.
The R2RML language is designed to meet the use cases and requirements identified in Use Cases and Requirements for Mapping Relational Databases to RDF [UCNR].
In this document, examples assume the following namespace prefix bindings unless otherwise stated:
Prefix | IRI |
---|---|
rr: | http://www.w3.org/ns/r2rml# |
rdf: | http://www.w3.org/1999/02/22-rdf-syntax-ns# |
rdfs: | http://www.w3.org/2000/01/rdf-schema# |
xsd: | http://www.w3.org/2001/XMLSchema# |
ex: | http://example.com/ns# |
Throughout the document, boxes containing Turtle markup and SQL data will appear. These boxes are color-coded. Gray boxes contain RDFS definitions of R2RML vocabulary terms:
# This box contains RDFS definitions of R2RML vocabulary terms
Yellow boxes contain example fragments of R2RML mappings in Turtle syntax:
# This box contains example R2RML mappings
Blue tables contain example input into an R2RML mapping:
ID INTEGER PRIMARY KEY | DESC VARCHAR(100) |
---|---|
1 | This is an example input table. |
2 | The table name is EXAMPLE. |
3 | It has six rows. |
4 | It has two columns, ID and DESC. |
5 | ID is the table's primary key and of type INTEGER. |
6 | DESC is of type VARCHAR(100) |
Green boxes contain example output:
# This box contains example output RDF triples or fragments
This section gives a brief overview of the R2RML mapping language, followed by a simple example relational database with an R2RML mapping document and its output RDF. Further R2RML examples can be found in the R2RML and Direct Mapping Test Cases [TC].
An R2RML mapping refers to logical tables to retrieve data from the input database. A logical table can be one of the following:
- a base table,
- a view, or
- an “R2RML view”, defined by a SQL query embedded in the mapping.
Each logical table is mapped to RDF using a triples map. The triples map is a rule that maps each row in the logical table to a number of RDF triples. The rule has two main parts:
- a subject map that generates the subject of the RDF triples produced from a logical table row, and
- predicate-object maps, each of which pairs predicate maps with object maps (or referencing object maps).
Triples are produced by combining the subject map with a predicate map and object map, and applying these three to each logical table row. For example, the complete rule for generating a set of triples might be:
- Subjects: A template http://data.example.com/employee/{empno} is used to generate subject IRIs from the empno column.
- Predicates: The constant vocabulary IRI ex:name is used.
- Objects: The value of the ename column is used to produce an RDF literal.
By default, all RDF triples are in the default graph of the output dataset. A triples map can contain graph maps that place some or all of the triples into named graphs instead.
The following example database consists of two tables, EMP and DEPT, with one row each:
EMPNO INTEGER PRIMARY KEY | ENAME VARCHAR(100) | JOB VARCHAR(20) | DEPTNO INTEGER REFERENCES DEPT (DEPTNO) |
---|---|---|---|
7369 | SMITH | CLERK | 10 |
DEPTNO INTEGER PRIMARY KEY | DNAME VARCHAR(30) | LOC VARCHAR(100) |
---|---|---|
10 | APPSERVER | NEW YORK |
The desired RDF triples to be produced from this database are as follows:
<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".
<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.
<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:name "APPSERVER".
<http://data.example.com/department/10> ex:location "NEW YORK".
<http://data.example.com/department/10> ex:staff 1.
Note in particular:
- the use of a custom target vocabulary (ex:Employee, ex:location, etc.);
- the ex:staff property has the total number of staff of a department; this value is not stored directly in the database but has to be computed;
- the ex:department property relates an employee to their department, using the identifiers of both entities.
The following partial R2RML mapping document will produce the desired triples from the EMP table (except the ex:department triple, which will be added later):
@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "EMP" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "ENAME" ];
    ].

<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".
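The way a subject template, a class IRI and a column-valued object map combine for a single EMP row can be sketched in Python. This is only an illustration, not how an R2RML processor has to work: the helper names expand_template and triples_for_row are hypothetical, and the sketch ignores template escaping, IRI-safe encoding, NULL values and datatypes.

```python
import re

def expand_template(template, row):
    """Replace each {COLUMN} placeholder with the row's value for that column."""
    return re.sub(r"\{([^{}]+)\}", lambda m: str(row[m.group(1)]), template)

def triples_for_row(row):
    """Apply a rule shaped like <#TriplesMap1> to one logical table row."""
    # Subject map: a template over the EMPNO column.
    subject = "<" + expand_template("http://data.example.com/employee/{EMPNO}", row) + ">"
    return [
        # rr:class ex:Employee yields an rdf:type triple.
        (subject, "rdf:type", "ex:Employee"),
        # Constant predicate ex:name, object taken from the ENAME column.
        (subject, "ex:name", '"' + row["ENAME"] + '"'),
    ]

emp_row = {"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 10}
for s, p, o in triples_for_row(emp_row):
    print(s, p, o + ".")
```

Run against the example row, the sketch prints the same two triples as the output box above.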
Next, the DEPT table needs to be mapped. Instead of using the table directly as the basis for that mapping, an “R2RML view” will be defined based on a SQL query. This allows computation of the staff number. (Alternatively, one could define this view directly in the database.)
<#DeptTableView> rr:sqlQuery """
SELECT DEPTNO,
       DNAME,
       LOC,
       (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
FROM DEPT;
""".
The definition of a triples map that generates the desired DEPT triples based on this R2RML view follows.
<#TriplesMap2>
    rr:logicalTable <#DeptTableView>;
    rr:subjectMap [
        rr:template "http://data.example.com/department/{DEPTNO}";
        rr:class ex:Department;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:staff;
        rr:objectMap [ rr:column "STAFF" ];
    ].

<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:name "APPSERVER".
<http://data.example.com/department/10> ex:location "NEW YORK".
<http://data.example.com/department/10> ex:staff 1.
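To see where the computed STAFF value comes from, the query of the R2RML view can be run against the one-row example database. The following is a minimal sketch that uses Python's sqlite3 module as a stand-in for the input database; SQLite is an assumption made for convenience and is not a Core SQL 2008 database, and the function name dept_triples is hypothetical.

```python
import sqlite3

# In-memory stand-in for the example input database (one row per table).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE DEPT (DEPTNO INTEGER PRIMARY KEY, DNAME VARCHAR(30), LOC VARCHAR(100));
    CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME VARCHAR(100), JOB VARCHAR(20),
                      DEPTNO INTEGER REFERENCES DEPT (DEPTNO));
    INSERT INTO DEPT VALUES (10, 'APPSERVER', 'NEW YORK');
    INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 10);
""")

# The SQL query of <#DeptTableView>, without the optional trailing semicolon.
VIEW_QUERY = """
SELECT DEPTNO, DNAME, LOC,
       (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
FROM DEPT
"""

# Each result row is one logical table row of the R2RML view.
rows = [dict(r) for r in conn.execute(VIEW_QUERY)]

def dept_triples(row):
    """Apply a rule shaped like <#TriplesMap2> to one row of the view."""
    subject = "<http://data.example.com/department/" + str(row["DEPTNO"]) + ">"
    return [
        (subject, "rdf:type", "ex:Department"),
        (subject, "ex:name", '"' + row["DNAME"] + '"'),
        (subject, "ex:location", '"' + row["LOC"] + '"'),
        (subject, "ex:staff", str(row["STAFF"])),
    ]

for s, p, o in dept_triples(rows[0]):
    print(s, p, o + ".")
```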
To complete the mapping document, the ex:department triples need to be generated. Their subjects come from the first triples map (<#TriplesMap1>), the objects come from the second triples map (<#TriplesMap2>).
This can be achieved by adding another rr:predicateObjectMap to <#TriplesMap1>. This one uses the other triples map, <#TriplesMap2>, as a parent triples map:
<#TriplesMap1>
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [
            rr:parentTriplesMap <#TriplesMap2>;
            rr:joinCondition [
                rr:child "DEPTNO";
                rr:parent "DEPTNO";
            ];
        ];
    ].
This performs a join between the EMP table and the R2RML view, on the DEPTNO columns. The objects will be generated from the subject map of the parent triples map, yielding the desired triple:
<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.
This completes the R2RML mapping document. An R2RML processor will generate the triples listed above from this mapping document.
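The join performed by the referencing object map can be pictured as a nested loop over child and parent rows. The sketch below is an illustration under simplifying assumptions: the function name referencing_object_triples is hypothetical, rows are plain dictionaries, templates contain only simple column references (so Python's str.format_map can expand them), and a real processor would normally push the join into SQL.

```python
def referencing_object_triples(child_rows, parent_rows, predicate,
                               child_template, parent_template,
                               child_column, parent_column):
    """Join child and parent rows on one join condition and emit triples."""
    triples = []
    for child in child_rows:
        for parent in parent_rows:
            value = child[child_column]
            # As in SQL, a NULL (None) never satisfies an equality join.
            if value is not None and value == parent[parent_column]:
                subject = "<" + child_template.format_map(child) + ">"
                # The object comes from the parent triples map's subject map.
                obj = "<" + parent_template.format_map(parent) + ">"
                triples.append((subject, predicate, obj))
    return triples

emp_rows = [{"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 10}]
dept_view_rows = [{"DEPTNO": 10, "DNAME": "APPSERVER", "LOC": "NEW YORK", "STAFF": 1}]

department_triples = referencing_object_triples(
    emp_rows, dept_view_rows, "ex:department",
    "http://data.example.com/employee/{EMPNO}",
    "http://data.example.com/department/{DEPTNO}",
    "DEPTNO", "DEPTNO",
)
for s, p, o in department_triples:
    print(s, p, o + ".")
```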
The following example assumes that a many-to-many relationship exists between extended versions of the EMP and DEPT tables. This relationship is captured by the contents of the EMP2DEPT table. The database consisting of the EMP, DEPT, and EMP2DEPT tables is shown below:
EMPNO INTEGER PRIMARY KEY | ENAME VARCHAR(100) | JOB VARCHAR(20) |
---|---|---|
7369 | SMITH | CLERK |
7369 | SMITH | NIGHTGUARD |
7400 | JONES | ENGINEER |
DEPTNO INTEGER PRIMARY KEY | DNAME VARCHAR(30) | LOC VARCHAR(100) |
---|---|---|
10 | APPSERVER | NEW YORK |
20 | RESEARCH | BOSTON |
EMPNO INTEGER REFERENCES EMP (EMPNO) | DEPTNO INTEGER REFERENCES DEPT (DEPTNO) |
---|---|
7369 | 10 |
7369 | 20 |
7400 | 10 |
<http://data.example.com/employee=7369/department=10>
    ex:employee <http://data.example.com/employee/7369> ;
    ex:department <http://data.example.com/department/10> .
<http://data.example.com/employee=7369/department=20>
    ex:employee <http://data.example.com/employee/7369> ;
    ex:department <http://data.example.com/department/20> .
<http://data.example.com/employee=7400/department=10>
    ex:employee <http://data.example.com/employee/7400> ;
    ex:department <http://data.example.com/department/10> .
The following R2RML mapping will produce the desired triples listed above:
<#TriplesMap3>
    rr:logicalTable [ rr:tableName "EMP2DEPT" ];
    rr:subjectMap [ rr:template "http://data.example.com/employee={EMPNO}/department={DEPTNO}" ];
    rr:predicateObjectMap [
        rr:predicate ex:employee;
        rr:objectMap [ rr:template "http://data.example.com/employee/{EMPNO}" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    ].
However, if one does not require that the subjects in the desired output uniquely identify the rows in the EMP2DEPT table, the desired output may look as follows:
<http://data.example.com/employee/7369>
    ex:department <http://data.example.com/department/10> ;
    ex:department <http://data.example.com/department/20> .
<http://data.example.com/employee/7400>
    ex:department <http://data.example.com/department/10>.
The following R2RML mapping will produce the desired triples:
<#TriplesMap3>
    rr:logicalTable [ rr:tableName "EMP2DEPT" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    ].
Sometimes, database columns contain codes that need to be translated into IRIs, but a direct syntactic translation using string templates is not possible. For example, consider a JOB column in the EMP table with the following possible values, and IRIs corresponding to those database values in the RDF output:
Value | Corresponding RDF IRI |
---|---|
CLERK | http://data.example.com/roles/general-office |
NIGHTGUARD | http://data.example.com/roles/security |
ENGINEER | http://data.example.com/roles/engineering |
The IRIs are not found in the original database and therefore the mapping from database codes to IRIs has to be specified in the R2RML mapping. Such translations can be achieved using an “R2RML view”. The view is defined based on a SQL query that computes the IRI based on the database value. SQL's CASE statement is convenient for this purpose. (Alternatively, one could define this view directly in the database.)
<#TriplesMap1>
    rr:logicalTable [ rr:sqlQuery """
        SELECT EMP.*, (CASE JOB
            WHEN 'CLERK' THEN 'general-office'
            WHEN 'NIGHTGUARD' THEN 'security'
            WHEN 'ENGINEER' THEN 'engineering'
        END) ROLE FROM EMP
        """ ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:role;
        rr:objectMap [ rr:template "http://data.example.com/roles/{ROLE}" ];
    ].
With the example input database, this mapping would yield the following triple:
<http://data.example.com/employee/7369> ex:role <http://data.example.com/roles/general-office>.
As well as sections marked as non-normative in the section heading, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in RFC 2119 [RFC2119].
This specification describes conformance criteria for:
A collection of test cases for R2RML processors and R2RML data validators is available in the R2RML and Direct Mapping Test Cases [TC].
This specification defines R2RML for databases that conform to Core SQL 2008, as defined in ISO/IEC 9075-1:2008 [SQL1] and ISO/IEC 9075-2:2008 [SQL2]. Processors and mappings may have to deviate from the R2RML specification in order to support databases that do not conform to this version of SQL.
Where SQL queries are embedded into R2RML mappings, SQL version identifiers can be used to indicate the specific version of SQL that is being used.
An R2RML mapping defines a mapping from a relational database to RDF. It is a structure that consists of one or more triples maps.
The input to an R2RML mapping is called the input database.
An R2RML processor is a system that, given an R2RML mapping and an input database, provides access to the output dataset.
There are no constraints on the method of access to the output dataset provided by a conforming R2RML processor. An R2RML processor MAY materialize the output dataset into a file, or offer virtual access through an interface that queries the input database, or offer any other means of providing access to the output dataset.
An R2RML processor also has access to an execution environment consisting of:
The SQL connection is used by the R2RML processor to evaluate SQL queries against the input database. It MUST be established with sufficient privileges for read access to all base tables and views that are referenced in the R2RML mapping. It MUST be configured with a default catalog and default schema that will be used when tables and views are accessed without an explicit catalog or schema reference.
How the SQL connection is established, or how users are authenticated against the database, is outside of the scope of this document.
The base IRI MUST be a valid IRI. It SHOULD NOT contain question mark (“?”) or hash (“#”) characters and SHOULD end in a slash (“/”) character.
To obtain an absolute IRI from a relative IRI, the term generation rules of R2RML use simple string concatenation, rather than the more complex algorithm for resolution of relative URIs defined in Section 5.2 of [RFC3986]. This ensures that the original database value can be reconstructed from the generated absolute IRI. Both algorithms are equivalent if all of the following are true:
- The relative IRI does not contain any “.” or “..” path segments.
An R2RML data validator is a system that takes as its input an R2RML mapping, a base IRI, and a SQL connection to an input database, and checks for the presence of data errors. When checking the input database, a data validator MUST report any data errors that are raised in the process of generating the output dataset.
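The difference between the string concatenation used for base IRIs and RFC 3986 reference resolution can be illustrated with a short Python sketch. The base IRI http://example.com/base/ and the function name r2rml_absolute_iri are assumptions made for this illustration; urllib.parse.urljoin stands in for the RFC 3986 algorithm.

```python
from urllib.parse import urljoin

def r2rml_absolute_iri(base_iri, relative_iri):
    """Make a relative IRI absolute by simple string concatenation."""
    return base_iri + relative_iri

base = "http://example.com/base/"  # hypothetical base IRI ending in a slash

# Both approaches agree for a plain relative path.
plain = "employee/7369"
print(r2rml_absolute_iri(base, plain))
print(urljoin(base, plain))

# They differ once the value contains ".." path segments:
# concatenation keeps the database value intact, resolution rewrites it.
dotted = "a/../b"
print(r2rml_absolute_iri(base, dotted))
print(urljoin(base, dotted))
```

With concatenation, stripping the base IRI prefix from the result recovers the database value unchanged, which is not possible after dot segments have been resolved away.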
An R2RML processor MAY include an R2RML data validator, but this is not required.
An R2RML mapping is represented as an RDF graph. In other words, RDF is used not just as the target data model of the mapping, but also as a formalism for representing the R2RML mapping itself.
An RDF graph that represents an R2RML mapping is called an R2RML mapping graph.
The R2RML vocabulary is the set of IRIs defined in this specification that start with the rr: namespace IRI:
http://www.w3.org/ns/r2rml#
rr: namespace IRI, but are not defined in the R2RML vocabulary.
rdfs:label, rdfs:comment and similar properties.
The R2RML vocabulary also includes the following R2RML classes:
- rr:TriplesMap is the class of triples maps.
- rr:LogicalTable is the class of logical tables. It has two subclasses:
  - rr:R2RMLView is the class of R2RML views.
  - rr:BaseTableOrView is the class of SQL base tables or views.
- rr:TermMap is the class of term maps. It has four subclasses:
  - rr:SubjectMap is the class of subject maps.
  - rr:PredicateMap is the class of predicate maps.
  - rr:ObjectMap is the class of object maps.
  - rr:GraphMap is the class of graph maps.
- rr:PredicateObjectMap is the class of predicate-object maps.
- rr:RefObjectMap is the class of referencing object maps.
- rr:Join is the class of join conditions.
The members of these classes are collectively called mapping components.
Many of these classes differ only in capitalization from properties in the R2RML vocabulary.
Explicit typing of the resources in a mapping graph with R2RML classes is OPTIONAL and has no effect on the behaviour of an R2RML processor. The mapping component represented by any given resource in a mapping graph is defined by the presence or absence of certain properties, as defined throughout this specification. A resource SHOULD NOT be typed as an R2RML class if it does not meet the definition of that class.
An R2RML mapping document is any document written in the Turtle [TURTLE] RDF syntax that encodes an R2RML mapping graph.
The media type for R2RML mapping documents is the same as for Turtle documents in general: text/turtle. The content encoding of Turtle content is always UTF-8 and the charset parameter on the media type SHOULD always be used: text/turtle;charset=utf-8. The file extension .ttl SHOULD be used.
A conforming R2RML processor SHOULD accept R2RML mapping documents in Turtle syntax. It MAY accept R2RML mapping graphs encoded in other RDF syntaxes.
A data error is a condition of the data in the input database that would lead to the generation of an invalid RDF term. The following conditions give rise to data errors:
- A term map with term type rr:IRI results in the generation of an invalid IRI.
When providing access to the output dataset, an R2RML processor MUST abort any operation that requires inspecting or returning an RDF term whose generation would give rise to a data error, and report an error to the agent invoking the operation. A conforming R2RML processor MAY, however, allow other operations that do not require inspecting or returning these RDF terms, and thus MAY provide partial access to an output dataset that contains data errors. Nevertheless, an R2RML processor SHOULD report data errors as early as possible.
The presence of data errors does not make an R2RML mapping non-conforming.
Data errors cannot generally be detected by analyzing the table schema of the database, but only by scanning the data in the tables. For large and rapidly changing databases, this can be impractical. Therefore, R2RML processors are allowed to answer queries that do not “touch” a data error, and the behavior of such operations is well-defined. For the same reason, the conformance of R2RML mappings is defined without regard for the presence of data errors.
R2RML data validators can be used to explicitly scan a database for data errors.
An R2RML processor MAY include an R2RML default mapping generator. This is a facility that introspects the schema of the input database and generates an R2RML mapping, possibly in the form of an R2RML mapping document, intended for further customization by a mapping author. Such a mapping is known as a default mapping.
The default mapping SHOULD be such that its output is the Direct Graph [DM] corresponding to the input database.
Duplicate row preservation: For tables without a primary key, the Direct Graph requires that a fresh blank node is created for each row. This ensures that duplicate rows in such tables are preserved. This requirement is relaxed for R2RML default mappings: They MAY reuse the same blank node for multiple duplicate rows. This behaviour does not preserve duplicate rows. R2RML default mapping generators that provide default mappings based on the Direct Graph MUST document whether the generated default mapping preserves duplicate rows or not.
A logical table is a tabular SQL query result that is to be mapped to RDF triples. A logical table is either a SQL base table or view, or an R2RML view.
Every logical table has an effective SQL query that, if executed over the SQL connection, produces as its result the contents of the logical table.
A logical table row is a row in a logical table.
A column name is the name of a column of a logical table. A column name MUST be a valid SQL identifier. Column names do not include any qualifying table, view or schema names.
A SQL identifier is the name of a SQL object, such as a column, table, view, schema, or catalog. A SQL identifier MUST match the <identifier> production in [SQL2]. When comparing identifiers for equality, the comparison rules of [SQL2] MUST be used.
- deptno and "deptno" are not equivalent (delimited identifiers that are not all-upper-case are not equivalent to any undelimited identifiers).
- DEPTNO and "DEPTNO" are equivalent (all-upper-case delimited and undelimited identifiers are equivalent).
- Examples of SQL identifiers: deptno, dept_no, "dept_no", "Department Number", "Identifier ""with quotes""".
- The same identifiers written as rr:column values in Turtle:
[] rr:column "deptno".
[] rr:column "dept_no".
[] rr:column "\"dept_no\"".
[] rr:column "\"Department Number\"".
[] rr:column "\"Identifier \"\"with quotes\"\"\"".
These rules are for Core SQL 2008. See Section 3, Conformance regarding databases that do not conform to this version of SQL.
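The equivalence examples above can be reproduced with a small Python sketch. It is a simplification rather than the full [SQL2] comparison rules: it assumes undelimited identifiers are folded to upper case, delimited identifiers keep their spelling with doubled quotes unescaped, and it does not check that the input is a syntactically valid identifier. The function names are hypothetical.

```python
def normalize_identifier(identifier):
    """Return a comparison key for a SQL identifier (simplified rules)."""
    if len(identifier) >= 2 and identifier.startswith('"') and identifier.endswith('"'):
        # Delimited identifier: strip the delimiters, unescape doubled quotes,
        # and keep the case exactly as written.
        return identifier[1:-1].replace('""', '"')
    # Undelimited (regular) identifier: fold to upper case.
    return identifier.upper()

def identifiers_equivalent(a, b):
    """Compare two SQL identifiers using their normalized keys."""
    return normalize_identifier(a) == normalize_identifier(b)

print(identifiers_equivalent('deptno', '"deptno"'))   # the first example pair
print(identifiers_equivalent('DEPTNO', '"DEPTNO"'))   # the second example pair
print(normalize_identifier('"Identifier ""with quotes"""'))
```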
A SQL base table or view is a logical table containing SQL data from a base table or view in the input database. A SQL base table or view is represented by a resource that has exactly one rr:tableName property.
The value of rr:tableName specifies the table or view name of the base table or view. Its value MUST be a valid schema-qualified name that names an existing base table or view in the input database.
A schema-qualified name is a sequence of one, two or three valid SQL identifiers, separated by the dot character (“.”). The three identifiers name, respectively, a catalog, a schema, and a table or view. If no catalog or schema is specified, then the default catalog and default schema of the SQL connection are assumed.
The effective SQL query of a SQL base table or view is:
SELECT * FROM {table}
with {table} replaced with the table or view name.
The following example shows a logical table specified using a schema-qualified table name.
[] rr:tableName "SCOTT.DEPT".
The following example shows a logical table specified using an unqualified table name. The SQL connection's default schema will be used.
[] rr:tableName "DEPT".
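How a schema-qualified name breaks down into identifiers, and how the effective SQL query is derived from it, can be sketched as follows. The function names and the regular expression are assumptions made for illustration; in particular, the pattern only separates delimited from undelimited identifiers and does not implement the full <identifier> production of [SQL2].

```python
import re

# A delimited identifier ("..." with "" as an escaped quote) or a run of
# characters containing neither a dot nor a double quote.
_IDENTIFIER = r'"(?:[^"]|"")+"|[^".]+'

def split_schema_qualified_name(name):
    """Split a schema-qualified name into its one to three identifiers."""
    parts = re.findall(_IDENTIFIER, name)
    # Re-joining must reproduce the input, otherwise stray characters were skipped.
    if not 1 <= len(parts) <= 3 or ".".join(parts) != name:
        raise ValueError("not a schema-qualified name: " + repr(name))
    return parts

def effective_sql_query(table_name):
    """Build the effective SQL query for an rr:tableName value."""
    split_schema_qualified_name(table_name)  # raises ValueError if malformed
    return "SELECT * FROM " + table_name

print(split_schema_qualified_name("SCOTT.DEPT"))
print(effective_sql_query("SCOTT.DEPT"))
print(effective_sql_query("DEPT"))
```

A dot inside a delimited identifier is not treated as a separator, so a value such as "My.Schema".DEPT (a made-up name) splits into two identifiers rather than three.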
An R2RML view is a logical table whose contents are the result of executing a SQL query against the input database. It is represented by a resource that has exactly one rr:sqlQuery property, whose value is a literal with a lexical form that is a valid SQL query.
R2RML mappings sometimes require data transformation, computation, or filtering before generating triples from the database. This can be achieved by defining a SQL view in the input database and referring to it with rr:tableName. However, this approach may sometimes not be practical for lack of database privileges or other reasons. R2RML views achieve the same effect without requiring changes to the input database.
Note that unlike “real” SQL views, an R2RML view cannot be used as an input table in further SQL queries.
A SQL query is a SELECT query in the SQL language that can be executed over the input database. The string MUST conform to the production <direct select statement: multiple rows> in [SQL2] with an OPTIONAL trailing semicolon character and OPTIONAL surrounding white space (excluding comments) as defined in [TURTLE]. It MUST be valid to execute over the SQL connection. The result of the query execution MUST NOT have duplicate column names. Any columns in the SELECT list derived by projecting an expression SHOULD be named, because otherwise they cannot be reliably referenced in the rest of the mapping.
Database objects referenced in the SQL query MAY be qualified with a catalog or schema name. For any database objects referenced without an explicit catalog name or schema name, the default catalog and default schema of the SQL connection are assumed.
For example, the following SELECT query is not a valid R2RML SQL query because the result contains a duplicate column name DEPTNO:
SELECT EMP.DEPTNO, 1 AS DEPTNO FROM EMP;
As a further example, the following SELECT query SHOULD NOT be used, because it contains an unnamed column derived from a COUNT expression:
SELECT DEPTNO, COUNT(EMPNO) FROM EMP GROUP BY DEPTNO;
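The rule against duplicate column names can be checked mechanically by inspecting the result columns of a query. The following sketch uses Python's sqlite3 module as a stand-in for the SQL connection; this is an assumption for illustration only, since SQLite is not a Core SQL 2008 database, and comparing upper-cased names is a simplification of the identifier comparison rules. The function name duplicate_column_names is hypothetical.

```python
import sqlite3
from collections import Counter

def duplicate_column_names(conn, query):
    """Return the column names that occur more than once in a query result."""
    cursor = conn.execute(query)
    # cursor.description holds one entry per result column; item 0 is its name.
    names = [column[0].upper() for column in cursor.description]
    return sorted(name for name, count in Counter(names).items() if count > 1)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, DEPTNO INTEGER)")

# The invalid query from the text: DEPTNO appears twice in the result.
print(duplicate_column_names(conn, "SELECT EMP.DEPTNO, 1 AS DEPTNO FROM EMP"))

# Naming the computed column avoids both problems discussed above.
print(duplicate_column_names(
    conn, "SELECT DEPTNO, COUNT(EMPNO) AS STAFF FROM EMP GROUP BY DEPTNO"))
```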
An R2RML view MAY have one or more SQL version identifiers. They MUST be valid IRIs and are represented as values of the rr:sqlVersion property. The following SQL version identifier indicates that the SQL query conforms to Core SQL 2008:
http://www.w3.org/ns/r2rml#SQL2008
The absence of a SQL version identifier indicates that no claim to Core SQL 2008 conformance is made.
No further identifiers besides rr:SQL2008 are defined in this specification. The RDB2RDF Working Group intends to maintain a non-normative list of identifiers for other SQL versions [SQLIRIS].
The effective SQL query of an R2RML view is the value of its rr:sqlQuery property.
The following example shows a logical table specified as an R2RML view conforming to Core SQL 2008.
[] rr:sqlQuery """
       Select ('Department' || DEPTNO) AS DEPTID
            , DEPTNO
            , DNAME
            , LOC
         from SCOTT.DEPT
       """;
   rr:sqlVersion rr:SQL2008.
A triples map specifies a rule for translating each row of a logical table to zero or more RDF triples.
The RDF triples generated from one row in the logical table all share the same subject.
A triples map is represented by a resource that references the following other resources:
- The rr:logicalTable property. Its value is a logical table that specifies a SQL query result to be mapped to triples.
- The rr:subjectMap property, whose value MUST be the subject map, or the constant shortcut property rr:subject.
- rr:predicateObjectMap properties, whose values MUST be predicate-object maps. They specify pairs of predicate maps and object maps that, together with the subjects generated by the subject map, may form one or more RDF triples for each row.
The referenced columns of all term maps of a triples map (subject map, predicate maps, object maps, graph maps) MUST be column names that exist in the term map's logical table.
The following example shows a triples map including its logical table, subject map, and two predicate-object maps.
[]
    rr:logicalTable [ rr:tableName "DEPT" ];
    rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ].
A subject map is a term map. It specifies a rule for generating the subjects of the RDF triples generated by a triples map.
A subject map MAY have one or more class IRIs. They are represented by the rr:class property. The values of the rr:class property MUST be IRIs. For each RDF term generated by the subject map, RDF triples with predicate rdf:type and the class IRI as object will be generated.
Mappings where the class IRI is not constant, but needs to be computed based on the contents of the database, can be achieved by defining a predicate-object map with predicate rdf:type and a non-constant object map.
In the following example, the generated subject will be asserted as an instance of the ex:Employee class.
[] rr:template "http://data.example.com/employee/{EMPNO}"; rr:class ex:Employee.
Using the example EMP table, the following RDF triple will be generated:
<http://data.example.com/employee/7369> rdf:type ex:Employee.
A predicate-object map is a function that creates one or more predicate-object pairs for each logical table row of a logical table. It is used in conjunction with a subject map to generate RDF triples in a triples map.
A predicate-object map is represented by a resource that references the following other resources:
One or more predicate maps. Each of them may be specified in one of two ways:
- the rr:predicateMap property, whose value MUST be a predicate map, or
- the constant shortcut property rr:predicate.
One or more object maps or referencing object maps. Each of them may be specified in one of two ways:
- the rr:objectMap property, whose value MUST be either an object map, or a referencing object map, or
- the constant shortcut property rr:object.
A predicate map is a term map.
An object map is a term map.
An RDF term is either an IRI, or a blank node, or a literal.
A term map is a function that generates an RDF term from a logical table row. The result of that function is known as the term map's generated RDF term.
Term maps are used to generate the subjects, predicates and objects of the RDF triples that are generated by a triples map. Consequently, there are several kinds of term maps, depending on where in the mapping they occur: subject maps, predicate maps, object maps and graph maps.
A term map MUST be exactly one of the following:
- a constant-valued term map,
- a column-valued term map,
- a template-valued term map.
The referenced columns of a term map are the set of column names referenced in the term map and depend on the type of term map.
A constant-valued term map is a term map that ignores the logical table row and always generates the same RDF term. A constant-valued term map is represented by a resource that has exactly one rr:constant property.
The constant value of a constant-valued term map is the RDF term that is the value of its rr:constant property.
If the constant-valued term map is a subject map, predicate map or graph map, then its constant value MUST be an IRI.
If the constant-valued term map is an object map, then its constant value MUST be an IRI or literal.
The referenced columns of a constant-valued term map is the empty set.
Constant-valued term maps can be expressed more concisely using the constant shortcut properties rr:subject, rr:predicate, rr:object and rr:graph. Occurrences of these properties MUST be treated exactly as if the following triples were present in the mapping graph instead:
Triple involving constant shortcut property | Replacement triples |
---|---|
?x rr:subject ?y. | ?x rr:subjectMap [ rr:constant ?y ]. |
?x rr:predicate ?y. | ?x rr:predicateMap [ rr:constant ?y ]. |
?x rr:object ?y. | ?x rr:objectMap [ rr:constant ?y ]. |
?x rr:graph ?y. | ?x rr:graphMap [ rr:constant ?y ]. |
The following example shows a predicate-object map that uses a constant-valued term map both for its predicate and for its object.
[] rr:predicateMap [ rr:constant rdf:type ]; rr:objectMap [ rr:constant ex:Employee ].
If added to a triples map, this predicate-object map would add the following triple to all resources ?x generated by the triples map:
?x rdf:type ex:Employee.
The following example uses constant shortcut properties and is equivalent to the example above:
[] rr:predicate rdf:type; rr:object ex:Employee.
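The replacement rule in the table above can be sketched as a small rewrite over a list of triples. This is only an illustration: triples are modelled as plain string tuples, and the blank node labels (`_:const1`, ...) and the function name `expand_shortcuts` are assumptions of this sketch, not part of R2RML.

```python
from itertools import count

# Constant shortcut property -> the term map property it abbreviates.
SHORTCUTS = {
    "rr:subject": "rr:subjectMap",
    "rr:predicate": "rr:predicateMap",
    "rr:object": "rr:objectMap",
    "rr:graph": "rr:graphMap",
}


def expand_shortcuts(triples):
    """Rewrite '?x rr:predicate ?y' into '?x rr:predicateMap [ rr:constant ?y ]'.

    Triples are (subject, predicate, object) tuples of strings. Fresh blank
    nodes are labelled _:const1, _:const2, ... (hypothetical labels, assumed
    not to clash with labels already used in the mapping graph).
    """
    fresh = count(1)
    expanded = []
    for s, p, o in triples:
        if p in SHORTCUTS:
            node = "_:const{}".format(next(fresh))
            expanded.append((s, SHORTCUTS[p], node))
            expanded.append((node, "rr:constant", o))
        else:
            expanded.append((s, p, o))
    return expanded
```

Applied to the shortcut example above, the two triples using rr:predicate and rr:object become four triples that use rr:predicateMap, rr:objectMap and rr:constant.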
(rr:column)

A column-valued term map is a term map that is represented by a resource that has exactly one rr:column property.
The value of the rr:column property MUST be a valid column name. The column value of the term map is the data value of that column in a given logical table row.
The referenced columns of a column-valued term map is the singleton set containing the value of the term map's rr:column property.
The following example defines an object map that generates literals from the DNAME column of some logical table.
[] rr:objectMap [ rr:column "DNAME" ].
Using the sample row from the DEPT table as a logical table row, the column value of the object map would be “APPSERVER”.
(rr:template)

A template-valued term map is a term map that is represented by a resource that has exactly one rr:template property. The value of the rr:template property MUST be a valid string template.
A string template is a format string that can be used to build strings from multiple components. It can reference column names by enclosing them in curly braces (“{” and “}”). The following syntax rules apply to valid string templates:
- Curly braces that do not enclose column names MUST be escaped by a backslash character (“\”). This also applies to curly braces within column names.
- Backslash characters (“\”) MUST be escaped by preceding them with another backslash character, yielding “\\”. This also applies to backslashes within column names.
rr:IRI
(see note below).

The template value of the term map for a given logical table row is determined as follows:
1. Let result be the template string
2. For each pair of unescaped curly braces in result:
   1. Let value be the data value of the column whose name is enclosed in the curly braces
   2. If value is NULL, then return NULL
   3. Let value be the natural RDF lexical form corresponding to value
   4. If the term type is rr:IRI, then replace the pair of curly braces with an IRI-safe version of value; otherwise, replace the pair of curly braces with value
3. Return result
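The IRI-safe version of a string is obtained by applying the following transformation to any character that is not in the iunreserved production in [RFC3987]:
1. Convert the character to a sequence of one or more octets using UTF-8
2. Percent-encode each octet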
The following table shows examples of strings and their IRI-safe versions:
String | IRI-safe version |
---|---|
42 | 42 |
Hello World! | Hello%20World%21 |
2011-08-23T22:17:00Z | 2011-08-23T22%3A17%3A00Z |
~A_17.1-2 | ~A_17.1-2 |
葉篤正 | 葉篤正 |
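The examples in the table above can be reproduced with a short sketch. It assumes that the transformation is UTF-8 percent-encoding of every character outside iunreserved, and it takes the iunreserved character ranges (ALPHA, DIGIT, “-”, “.”, “_”, “~” and ucschar) from RFC 3987. The function names are illustrative only.

```python
# ucschar ranges from RFC 3987: the BMP ranges, planes 1-13, and part of plane 14.
_UCSCHAR_RANGES = (
    [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
    + [(plane << 16, (plane << 16) | 0xFFFD) for plane in range(1, 14)]
    + [(0xE1000, 0xEFFFD)]
)


def is_iunreserved(ch):
    """True if ch matches iunreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar."""
    cp = ord(ch)
    if cp < 0x80:
        return ch.isalnum() or ch in "-._~"
    return any(lo <= cp <= hi for lo, hi in _UCSCHAR_RANGES)


def iri_safe(value):
    """Percent-encode the UTF-8 octets of every character that is not iunreserved."""
    out = []
    for ch in value:
        if is_iunreserved(ch):
            out.append(ch)
        else:
            out.extend("%{:02X}".format(octet) for octet in ch.encode("utf-8"))
    return "".join(out)
```

Note that a general-purpose URL-quoting helper would typically also encode the non-ASCII characters in the last table row, which is why the ucschar ranges are checked explicitly here.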
R2RML always performs percent-encoding when IRIs are generated from string templates. If IRIs need to be generated without percent-encoding, then rr:column should be used instead of rr:template, with an R2RML view that performs the string concatenation.
In the case of string templates that generate IRIs, any single character that is legal in an IRI, but percent-encoded in the IRI-safe version of a data value, is a safe separator. This includes in particular the eleven sub-delim characters defined in [RFC3987]: !$&'()*+,;=
The referenced columns of a template-valued term map is the set of column names enclosed in unescaped curly braces in the template string.
The following example defines a subject map that generates IRIs from the DEPTNO column of a logical table.
[] rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ].
Using the sample row from the DEPT table as a logical table row, the template value of the subject map would be:
http://data.example.com/department/10
The following example shows how an IRI-safe template value is created:
[] rr:subjectMap [ rr:template "http://data.example.com/site/{LOC}" ].
Using the sample row from the DEPT table as a logical table row, the template value of the subject map would be:
http://data.example.com/site/NEW%20YORK
The space character is not in the iunreserved set, and therefore percent-encoding is applied to the character, yielding “%20”.
The following example shows the use of backslash escapes in string templates. The template will generate a fancy title such as
{{{ \o/ Hello World! \o/ }}}
from a string “Hello World!” in the TITLE column. By default, rr:template generates IRIs. Since the intention here is to create a literal instead, the term type has to be set.
[] rr:objectMap [ rr:template "\\{\\{\\{ \\\\o/ {TITLE} \\\\o/ \\}\\}\\}"; rr:termType rr:Literal; ].
Note that because backslashes need to be escaped by a second backslash in the Turtle syntax [TURTLE], a double backslash is needed to escape each curly brace, and to get one literal backslash in the output one needs to write four backslashes in the template.
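The template examples in this section can be traced with a small sketch of template expansion. It works on the template string after Turtle unescaping, so one backslash escapes a brace. Several things are assumptions of the sketch rather than part of the specification text: the function name `expand_template`, the use of `str()` as a stand-in for the natural RDF lexical form, and the `transform` parameter, which is where an IRI-safe conversion would be plugged in when the term type is rr:IRI.

```python
def expand_template(template, row, transform=lambda s: s):
    """Expand an (already Turtle-unescaped) string template against a row.

    row is a dict from column name to value; None models SQL NULL.
    Returns None as soon as a referenced column is NULL.
    """
    out = []
    i, n = 0, len(template)
    while i < n:
        ch = template[i]
        if ch == "\\":                      # backslash escape: copy next char literally
            if i + 1 >= n:
                raise ValueError("dangling backslash in template")
            out.append(template[i + 1])
            i += 2
        elif ch == "{":                     # start of a column reference
            name = []
            i += 1
            while i < n and template[i] != "}":
                if template[i] == "\\":     # escapes also apply inside column names
                    i += 1
                    if i >= n:
                        raise ValueError("dangling backslash in template")
                name.append(template[i])
                i += 1
            if i >= n:
                raise ValueError("unclosed curly brace in template")
            i += 1                          # skip the closing brace
            value = row["".join(name)]
            if value is None:
                return None
            out.append(transform(str(value)))
        elif ch == "}":
            raise ValueError("unescaped closing curly brace in template")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, expanding `r"\{\{\{ \\o/ {TITLE} \\o/ \}\}\}"` with a TITLE of “Hello World!” reproduces the fancy title shown above, and passing a percent-encoding function as `transform` yields the NEW%20YORK result for the {LOC} template.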
(rr:termType)

The term type of a column-valued term map or template-valued term map determines the kind of generated RDF term (IRIs, blank nodes or literals).
If the term map has an optional rr:termType property, then its term type is the value of that property. The value MUST be an IRI and MUST be one of the following options:
- If the term map is a subject map: rr:IRI or rr:BlankNode
- If the term map is a predicate map: rr:IRI
- If the term map is an object map: rr:IRI, rr:BlankNode, or rr:Literal
- If the term map is a graph map: rr:IRI

If the term map does not have a rr:termType property, then its term type is:
- rr:Literal, if it is an object map and at least one of the following conditions is true:
  - It is a column-valued term map.
  - It has a rr:language property (and thus a specified language tag).
  - It has a rr:datatype property (and thus a specified datatype).
- rr:IRI, otherwise.

Term maps with term type rr:IRI cause data errors if the value is not a valid IRI (see generated RDF term for details). Data values from the input database may require percent-encoding before they can be used in IRIs. Template-valued term maps are a convenient way of percent-encoding data values.
Constant-valued term maps are not considered as having a term type, and specifying rr:termType on these term maps has no effect. The type of the generated RDF term is determined directly by the value of rr:constant: If it is an IRI, then an IRI will be generated; if it is a literal, a literal will be generated.
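The term type rules can be summarized in a few lines. The sketch below assumes the reading that an object map defaults to rr:Literal when it is column-valued or carries rr:language or rr:datatype, and to rr:IRI otherwise; the function name and its keyword parameters are hypothetical.

```python
# Term types permitted for an explicit rr:termType, per kind of term map.
ALLOWED_TERM_TYPES = {
    "subject": {"rr:IRI", "rr:BlankNode"},
    "predicate": {"rr:IRI"},
    "object": {"rr:IRI", "rr:BlankNode", "rr:Literal"},
    "graph": {"rr:IRI"},
}


def effective_term_type(map_kind, explicit=None, column_valued=False,
                        has_language=False, has_datatype=False):
    """Return the term type of a column- or template-valued term map."""
    if explicit is not None:
        if explicit not in ALLOWED_TERM_TYPES[map_kind]:
            raise ValueError(
                "{} is not allowed on a {} map".format(explicit, map_kind))
        return explicit
    # No rr:termType: only object maps can default to literals.
    if map_kind == "object" and (column_valued or has_language or has_datatype):
        return "rr:Literal"
    return "rr:IRI"
```

This is why the fancy-title template example has to set rr:termType rr:Literal explicitly: a template-valued object map without language tag or datatype would otherwise default to rr:IRI.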
(rr:language)

A term map with a term type of rr:Literal MAY have a specified language tag. It is represented by the rr:language property on a term map. If present, its value MUST be a valid language tag.
A specified language tag causes generated literals to be language-tagged plain literals. In the following example, plain literals with language tag “en-us” (U.S. English) will be generated for the data values in the DNAME column.
[] rr:objectMap [ rr:column "DNAME"; rr:language "en-us" ].
(rr:datatype)

A datatypeable term map is a term map with a term type of rr:Literal that does not have a specified language tag.
Datatypeable term maps may generate typed literals. The datatype of these literals can be automatically determined based on the SQL datatype of the underlying logical table column (producing a natural RDF literal), or it can be explicitly overridden using rr:datatype (producing a datatype-override RDF literal).
A datatypeable term map MAY have a rr:datatype property. Its value MUST be an IRI. This IRI is the specified datatype of the term map.
A term map MUST NOT have more than one rr:datatype value.
A term map that is not a datatypeable term map MUST NOT have an rr:datatype property.
The implicit SQL datatype of a datatypeable term map is CHARACTER VARYING if the term map is a template-valued term map; otherwise, it is the SQL datatype of the respective column in the logical table row.
See generated RDF term for further details on generating literals from term maps.
One cannot explicitly state that a plain literal without language tag should be generated. They are the default for string columns. To generate one from a non-string column, a template-valued term map with a template such as "{MY_COLUMN}" and a term type of rr:Literal can be used.
The following example shows an object map that overrides the default datatype of the logical table with an explicitly specified xsd:positiveInteger type. A datatype-override RDF literal of that datatype will be generated from whatever is in the EMPNO column.
[] rr:objectMap [ rr:column "EMPNO"; rr:datatype xsd:positiveInteger ].
(rr:inverseExpression)

An inverse expression is a string template associated with a column-valued term map or template-valued term map. It is represented by the value of the rr:inverseExpression property. This property is OPTIONAL and there MUST NOT be more than one for a term map.
Inverse expressions are useful for optimizing term maps that reference derived columns in R2RML views. An inverse expression specifies an expression that allows “reversing” of a generated RDF term and the construction of a SQL query that efficiently retrieves the logical table row from which the term was generated. In particular, it allows the use of indexes on the underlying relational tables.
Every pair of unescaped curly braces in the inverse expression is a column reference in an inverse expression. The string between the braces MUST be a valid column name.
An inverse expression MUST satisfy the following condition:
SELECT * FROM ({query}) AS tmp WHERE {expr}

where {query} is the effective SQL query of t, and {expr} is instantiation(r)

NULL, same-term(r) MUST be exactly the set of logical table rows in t whose generated RDF term is also g.

For example, for the DEPTID column in the logical table used for mapping the DEPT table in this example mapping, an inverse expression could be defined as follows:
[] rr:column "DEPTID"; rr:inverseExpression "{DEPTNO} = SUBSTRING({DEPTID}, CHARACTER_LENGTH('Department')+1)";
This facilitates the use of an existing index on the DEPTNO column of the DEPT table.
A quoted and escaped data value is any SQL string that matches the <literal> or <null specification> productions of [SQL2]. This string can be used in a SQL query to specify a SQL data value. Examples:
27
'foo'
'foo''bar'
TRUE
DATE '2011-11-11'
NULL
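As a rough illustration of how quoted and escaped data values might be produced, and how they could be substituted into an inverse expression, consider the sketch below. It covers only the value kinds shown in the examples above. The mapping from Python types to SQL literals, the function names, and the simplification that column names contain no escaped braces are all assumptions of this sketch.

```python
import datetime
import re


def quote_sql_value(value):
    """Render a Python value as a quoted and escaped SQL data value."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):              # check bool before int: bool is an int subclass
        return "TRUE" if value else "FALSE"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, datetime.date) and not isinstance(value, datetime.datetime):
        return "DATE '{}'".format(value.isoformat())
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"   # double embedded single quotes
    raise TypeError("unsupported value type: {!r}".format(type(value)))


def instantiate(inverse_expression, row):
    """Replace each {COLUMN} reference with the quoted value from row.

    Simplification: column names are assumed to contain no braces or backslashes.
    """
    return re.sub(
        r"\{([^{}\\]+)\}",
        lambda m: quote_sql_value(row[m.group(1)]),
        inverse_expression,
    )
```

With a hypothetical row where DEPTNO is 10 and DEPTID is 'Department10', the inverse expression from the example above instantiates to `10 = SUBSTRING('Department10', CHARACTER_LENGTH('Department')+1)`.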
(rr:parentTriplesMap, rr:joinCondition, rr:child and rr:parent)

A referencing object map allows using the subjects of another triples map as the objects generated by a predicate-object map. Since both triples maps may be based on different logical tables, this may require a join between the logical tables. This is not restricted to 1:1 joins.
A referencing object map is represented by a resource that:
- has exactly one rr:parentTriplesMap property, whose value MUST be a triples map, known as the referencing object map's parent triples map.
- may have one or more rr:joinCondition properties, whose values MUST be join conditions.

A join condition is represented by a resource that has exactly one value for each of the following two properties:
- rr:child, whose value is known as the join condition's child column and MUST be a column name that exists in the logical table of the triples map that contains the referencing object map
- rr:parent, whose value is known as the join condition's parent column and MUST be a column name that exists in the logical table of the referencing object map's parent triples map.

The child query of a referencing object map is the effective SQL query of the logical table of the triples map containing the referencing object map.
The parent query of a referencing object map is the effective SQL query of the logical table of its parent triples map.
If the child query and parent query of a referencing object map are not identical, then the referencing object map MUST have at least one join condition.
The joint SQL query of a referencing object map is:
- If the referencing object map has no join condition:
  SELECT * FROM ({child-query}) AS tmp
- If the referencing object map has at least one join condition:
  SELECT * FROM ({child-query}) AS child, ({parent-query}) AS parent WHERE child.{child-column1}=parent.{parent-column1} AND child.{child-column2}=parent.{parent-column2} AND ...

where {child-query} is the referencing object map's child query, {parent-query} is its parent query, {child-column1} and {parent-column1} are the child column and parent column of its first join condition, and so on. The order of the join conditions is chosen arbitrarily.
The joint SQL query is used when generating RDF triples from referencing object maps.
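The construction of the joint SQL query can be sketched as simple string assembly. The function name is hypothetical, and the sketch assumes that the child and parent queries and the column names are already valid SQL fragments (for example, already quoted where necessary).

```python
def joint_sql_query(child_query, parent_query, join_conditions=()):
    """Build the joint SQL query of a referencing object map.

    join_conditions is a sequence of (child_column, parent_column) pairs.
    """
    if not join_conditions:
        # Without join conditions, the child and parent queries must be identical.
        if child_query != parent_query:
            raise ValueError(
                "at least one join condition is required when the "
                "child and parent queries differ")
        return "SELECT * FROM ({}) AS tmp".format(child_query)
    where = " AND ".join(
        "child.{}=parent.{}".format(child_col, parent_col)
        for child_col, parent_col in join_conditions
    )
    return "SELECT * FROM ({}) AS child, ({}) AS parent WHERE {}".format(
        child_query, parent_query, where)
```

Because the order of the join conditions is arbitrary, two processors may produce textually different but equivalent queries.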
The following example shows a referencing object map as part of a predicate-object map:
[] rr:predicateObjectMap [ rr:predicate ex:department; rr:objectMap [ rr:parentTriplesMap <#TriplesMap2>; rr:joinCondition [ rr:child "DEPTNO"; rr:parent "DEPTNO"; ]; ]; ].
If the logical table of the surrounding triples map is EMP, and the logical table of <#TriplesMap2> is DEPT, this would result in a join between these two tables with the condition
EMP.DEPTNO = DEPT.DEPTNO
and the objects of the triples would be generated using the subject map of <#TriplesMap2>.
Given the two example tables, and subject maps as defined in the example mapping, this would result in a triple:
<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.
The following example shows a referencing object map that does not have a join condition. It creates two kinds of resources from the DEPT table: departments and sites.
<#DeptTriplesMap> rr:logicalTable [ rr:tableName "DEPT" ]; rr:subjectMap [ rr:template "department/{DEPTNO}"; rr:class ex:Department; ]; rr:predicateObjectMap [ rr:predicate ex:location; rr:objectMap [ rr:parentTriplesMap <#SiteTriplesMap> ]; ].
<#SiteTriplesMap> rr:logicalTable [ rr:tableName "DEPT" ]; rr:subjectMap [ rr:template "site/{LOC}"; rr:class ex:Site; ]; rr:predicateObjectMap [ rr:predicate ex:siteName; rr:objectMap [ rr:column "LOC" ]; ].
An ex:Site resource is created for each distinct value in the LOC column, using the <#SiteTriplesMap>. Departments and sites are linked by ex:location triples, and the objects of these triples are specified using a referencing object map that references the sites triples map. No join condition is needed as both triples maps use the same logical table (the base table DEPT). Given the example table, this mapping would result in four triples (assuming an appropriate base IRI):
<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:location <http://data.example.com/site/NEW%20YORK>.
<http://data.example.com/site/NEW%20YORK> rdf:type ex:Site.
<http://data.example.com/site/NEW%20YORK> ex:siteName "NEW YORK".
Each triple generated from an R2RML mapping is placed into one or more graphs of the output dataset. Possible target graphs are the unnamed default graph, and the IRI-named named graphs.
Any subject map or predicate-object map MAY have one or more associated graph maps. They are specified in one of two ways:
- using the rr:graphMap property, whose value MUST be a graph map,
- using the constant shortcut property rr:graph.

Graph maps are themselves term maps. When RDF triples are generated, the set of target graphs is determined by taking into account any graph maps associated with the subject map or predicate-object map.
If a graph map generates the special IRI rr:defaultGraph, then the target graph is the default graph of the output dataset.
In the following subject map example, all generated RDF triples will be stored in the named graph ex:DepartmentGraph.
[] rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}"; rr:graphMap [ rr:constant ex:DepartmentGraph ]; ].
This is equivalent to the following example, which uses a constant shortcut property:
[] rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}"; rr:graph ex:DepartmentGraph; ].
In the following example, RDF triples are placed into named graphs according to the job title of employees:
[] rr:subjectMap [ rr:template "http://data.example.com/employee/{EMPNO}"; rr:graphMap [ rr:template "http://data.example.com/jobgraph/{JOB}" ]; ].
The triples generated from the EMP table would be placed in the named graph with the following IRI:
<http://data.example.com/jobgraph/CLERK>
Blank nodes in the output dataset are scoped to a single RDF graph. If the same blank node identifier occurs in multiple RDF triples that are in the same graph, then the triples will share the same blank node. If, however, the same blank node identifier occurs in multiple graphs, then a distinct blank node will be created for each graph. An R2RML-generated blank node can never be shared by two triples in two different graphs.
This implies that triples generated from a single logical table row will have different subjects if the subjects are blank nodes and the triples are placed into different graphs.
This section defines mappings from SQL data values to RDFliterals.
This section defines the following mappings from SQL data values:
TIMESTAMP
is used in an IRItemplate.rr:datatype
.
The mappings cover all predefined Core SQL 2008 datatypes except INTERVAL. The natural mappings may be extended with custom handling for other types, such as vendor-specific SQL datatypes. In the absence of such extensions, the natural mappings fall back on a simple cast to string for all unsupported SQL datatypes.
The mappings are referenced in the R2RML term generation rules.
An informative summary of XSD lexical forms is provided to aid implementers.
The natural RDF literal corresponding to a SQL data value is the result of applying the following steps:
- If the SQL datatype is a character string type (CHARACTER, CHARACTER VARYING, CHARACTER LARGE OBJECT, NATIONAL CHARACTER, NATIONAL CHARACTER VARYING, NATIONAL CHARACTER LARGE OBJECT), then the result is a plain literal without language tag whose lexical form is the SQL data value.
- If multiple lexical forms represent the same value (e.g., 1, +1, 1.0 and 1.0E0), then the choice is implementation-dependent. However, the choice MUST be made so that given a target RDF datatype and value, the same lexical form is chosen consistently (e.g., INTEGER 5 and BIGINT 5 must be mapped to the same lexical form, as both are mapped to the RDF datatype xsd:integer and are equal values; mapping one to 5 and the other to +5 would be an error). The canonical lexical representation [XMLSCHEMA2] MAY be chosen. (See also: Summary of XSD Lexical Forms)

SQL datatype | RDF datatype | Lexical transformation (informative) |
---|---|---|
BINARY, BINARY VARYING, BINARY LARGE OBJECT | xsd:hexBinary | xsd:hexBinary lexical mapping |
NUMERIC, DECIMAL | xsd:decimal | none required |
SMALLINT, INTEGER, BIGINT | xsd:integer | none required |
FLOAT, REAL, DOUBLE PRECISION | xsd:double | none required |
BOOLEAN | xsd:boolean | ensure lowercase (true, false) |
DATE | xsd:date | none required |
TIME | xsd:time | none required |
TIMESTAMP | xsd:dateTime | replace space character with “T” |
INTERVAL | undefined | undefined |
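The table of natural datatype mappings lends itself to a lookup-based sketch. The code below is a simplified illustration and makes several assumptions that are not part of the specification text: SQL values arrive as Python objects (`bytes` for binary types, strings in SQL text form for TIMESTAMP), `str()` stands in for the cast to string, and a literal is modelled as a `(lexical_form, datatype)` pair where a datatype of `None` means a plain literal.

```python
CHARACTER_TYPES = {
    "CHARACTER", "CHARACTER VARYING", "CHARACTER LARGE OBJECT",
    "NATIONAL CHARACTER", "NATIONAL CHARACTER VARYING",
    "NATIONAL CHARACTER LARGE OBJECT",
}

NATURAL_DATATYPE = {
    "BINARY": "xsd:hexBinary", "BINARY VARYING": "xsd:hexBinary",
    "BINARY LARGE OBJECT": "xsd:hexBinary",
    "NUMERIC": "xsd:decimal", "DECIMAL": "xsd:decimal",
    "SMALLINT": "xsd:integer", "INTEGER": "xsd:integer", "BIGINT": "xsd:integer",
    "FLOAT": "xsd:double", "REAL": "xsd:double", "DOUBLE PRECISION": "xsd:double",
    "BOOLEAN": "xsd:boolean",
    "DATE": "xsd:date", "TIME": "xsd:time", "TIMESTAMP": "xsd:dateTime",
}


def natural_rdf_literal(sql_type, value):
    """Return (lexical_form, datatype); datatype None denotes a plain literal."""
    if sql_type in CHARACTER_TYPES:
        return (value, None)
    if sql_type == "INTERVAL":
        raise NotImplementedError("the translation of INTERVAL is undefined")
    datatype = NATURAL_DATATYPE.get(sql_type)
    if datatype is None:
        return (str(value), None)            # fallback: cast to string
    if datatype == "xsd:hexBinary":
        return (bytes(value).hex().upper(), datatype)
    if datatype == "xsd:boolean":
        return (str(value).lower(), datatype)            # ensure lowercase
    if datatype == "xsd:dateTime":
        return (str(value).replace(" ", "T"), datatype)  # space -> "T"
    return (str(value), datatype)
```

Because INTEGER and BIGINT share one code path, equal values of both types yield the same lexical form, as the consistency requirement above demands.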
R2RML extensions that handle vendor-specific or user-defined datatypes beyond those of Core SQL 2008 are expected to behave as if the table above contained additional rows that associate the SQL datatypes with appropriate RDF-compatible datatypes (e.g., theXML Schema built-in types [XMLSCHEMA2]), and appropriate lexical transformations where required. Note however that future versions of R2RML may also normatively add additional rows to this table.
The translation of INTERVAL is left undefined due to the complexity of the translation. [SQL14] describes a translation of INTERVAL to xdt:yearMonthDuration and xdt:dayTimeDuration.
In [SQL2], the precision of many SQL datatypes is not fixed, but left implementation-defined. Therefore, the mapping to XML Schema datatypes must rely on arbitrary-precision types such as xsd:decimal, xsd:integer and xsd:dateTime. Implementers of the mapping may wish to set upper limits for the supported precision of these XSD types. The XML Schema specification allows such partial implementations of infinite datatypes [XMLSCHEMA2], and defines specific minimum requirements.
The natural RDF datatype corresponding to a SQL datatype is the value of the RDF datatype column in the row corresponding to the SQL datatype in the table above.
The natural RDF lexical form corresponding to a SQL data value is the lexical form of its corresponding natural RDF literal, with the additional constraint that the canonical lexical representation [XMLSCHEMA2] SHOULD be chosen.
The canonical RDF lexical form corresponding to a SQL data value is the lexical form of its corresponding natural RDF literal, with the additional constraint that the canonical lexical representation [XMLSCHEMA2] MUST be chosen.
Cast to string is an implementation-dependent function that maps SQL data values to equivalent Unicode strings. It is undefined for the following kinds of SQL datatypes: collection types, row types, user-defined types without a user-defined string CAST, reference types whose referenced type does not have a user-defined string CAST, binary types.
Cast to string is a fallback that handles vendor-specific and user-defined datatypes not supported by the R2RML processor. It can be implemented in a number of ways, including explicit SQL casts (“CAST(value AS VARCHAR(n))”, where n is an arbitrarily large integer), implicit SQL casts (concatenation with the empty string), or by employing a database access API that presents return values as strings.
The datatype-override RDF literal corresponding to a SQL data value v and a datatype IRI dt, is a typed literal whose lexical form is the natural RDF lexical form corresponding to v, and whose datatype IRI is dt. If the typed literal is ill-typed, then a data error is raised.
A typed literal is ill-typed in R2RML if its datatype IRI denotes a validatable RDF datatype and its lexical form is not in the lexical space of the RDF datatype identified by its datatype IRI. (See also: Summary of XSD Lexical Forms)
The set of validatable RDF datatypes includes all datatypes in the RDF datatype column of the table of natural datatype mappings, as defined in [XMLSCHEMA2]. This set MAY include implementation-defined additional RDF datatypes.
For example, "X"^^xsd:boolean is ill-typed because xsd:boolean is a validatable RDF datatype in R2RML, and “X” is not in the lexical space of xsd:boolean [XMLSCHEMA2].
The same non-character-string SQL data value can typically be represented in multiple different string forms. For example, the DOUBLE value 1 can be represented as 1, +1, 1.0 and 1.0E0. This can cause interoperability issues when such values are used in string contexts, for example when using them to generate IRIs. Two IRIs that are character-for-character equivalent, except one contains 1 where the other contains 1.0, will not “link up” in an RDF graph; they are two different nodes.
To reduce portability issues arising from such conversions, this specification recommends that implementations convert non-string data values to a canonical form (see natural RDF lexical form). However, this is not a strict requirement. Therefore, when portability between R2RML implementations is a concern, mapping authors SHOULD NOT use non-character-string columns in contexts where strings are produced:
- in rr:column when IRIs or blank nodes are produced,
- in rr:column when rr:language or an rr:datatype other than the natural RDF datatype is used, and
- in rr:template.

In these contexts, if portability is to be maximized, then mapping authors SHOULD use an R2RML view instead and explicitly convert the non-string column to a string column using an SQL expression.
Note that this is not a problem when natural RDF literals are generated from such columns, because the resulting literal has a corresponding non-string XSD datatype, and equivalences between different lexical forms within these datatypes are well-defined.
Thenatural mappings make reference to various XSD datatypes and require that SQL data values be converted to strings that are appropriate as lexical forms for these datatypes. This subsection gives examples of these lexical forms in order to aid implementers of the mappings. This subsection is non-normative; the normative definitions of the lexical spaces as well as the canonical lexical mappings are found inW3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes [XMLSCHEMA2].
A general approach that may be used for implementing the natural mappings is as follows:
xsd:hexBinary, xsd:dateTime and xsd:boolean.

RDF datatype | Non-canonical lexical forms | Canonical lexical forms | Comments |
---|---|---|---|
xsd:hexBinary | 5232524d4c | 5232524D4C | Convert from SQL by applying xsd:hexBinary lexical mapping. |
xsd:decimal | .224 | 0.224 | |
xsd:decimal | +001 | 1 | |
xsd:decimal | 42.0 | 42 | |
xsd:decimal | -5.9000 | -5.9 | |
xsd:integer | -05 | -5 | |
xsd:integer | +333 | 333 | |
xsd:integer | 00 | 0 | |
xsd:double | -5.90 | -5.9E0 | Also supports INF, -INF, NaN and -0.0E0, but these do not appear in standard SQL. |
xsd:double | +0.00014770215000 | 1.4770215E-4 | |
xsd:double | +01E+3 | 1.0E3 | |
xsd:double | 100.0 | 1.0E2 | |
xsd:double | 0 | 0.0E0 | |
xsd:boolean | 1 | true | Must be lowercase. |
xsd:boolean | 0 | false | |
xsd:date | | 2011-08-23 | Dates in SQL don't have timezone offsets. They are optional in XSD. |
xsd:time | 22:17:34.885+00:00 | 22:17:34.885Z | May or may not have timezone offset. |
xsd:time | 22:17:34.000 | 22:17:34 | |
xsd:time | 22:17:34.1+01:00 | 22:17:34.1+01:00 | |
xsd:dateTime | 2011-08-23T22:17:00.000+00:00 | 2011-08-23T22:17:00Z | May or may not have timezone offset. Convert from SQL by replacing space with “T”. |
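Some of the canonicalizations in the table above can be sketched with the Python standard library. The snippet covers only xsd:integer, xsd:decimal, xsd:boolean and xsd:hexBinary, with function names chosen for this illustration. It is also more lenient than the XSD lexical spaces: `int()` and `Decimal()` accept some inputs (such as surrounding whitespace or exponent notation) that are not valid XSD lexical forms, so a real implementation would validate the input first.

```python
from decimal import Decimal


def canonical_integer(lexical):
    """'-05' -> '-5', '+333' -> '333', '00' -> '0'."""
    return str(int(lexical))


def canonical_decimal(lexical):
    """Strip the sign '+', leading zeros and trailing fractional zeros."""
    value = Decimal(lexical)
    if value == 0:
        return "0"                      # avoids '-0' for negative zero
    # normalize() drops trailing zeros; 'f' formatting avoids exponent notation.
    return format(value.normalize(), "f")


def canonical_boolean(lexical):
    """Map the four xsd:boolean lexical forms to 'true' or 'false'."""
    return {"1": "true", "true": "true", "0": "false", "false": "false"}[lexical]


def canonical_hex_binary(lexical):
    """Canonical xsd:hexBinary uses upper-case hex digits."""
    return lexical.upper()
```

The xsd:double, xsd:time and xsd:dateTime rows are left out of this sketch because their canonical forms involve exponent normalization and timezone handling.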
The output dataset of an R2RML mapping is an RDF dataset that contains the generated RDF triples for each of the triples maps of the R2RML mapping. The output dataset MUST NOT contain any other RDF triples or named graphs besides these. However, R2RML processors MAY provide access to datasets that contain additional triples or graphs beyond those in the output dataset, such as inferred triples or provenance information.
If a table or column is not explicitly referenced in a triples map, then no RDF triples will be generated for that table or column.
Conforming R2RML processors MAY rename blank nodes when providing access to the output dataset. This means that client applications may see actual blank node identifiers that differ from those produced by the R2RML mapping. Client applications SHOULD NOT rely on the specific text of the blank node identifier for any purpose.
RDF syntaxes and RDF APIs generally represent blank nodes with blank node identifiers. But the characters allowed in blank node identifiers differ between syntaxes, and not all characters occurring in the values produced by a term map may be allowed, so a bijective mapping function from values to valid blank node identifiers may be required. The details of this mapping function are implementation-dependent, and R2RML processors may have to use different functions for different output syntaxes or access interfaces. Strings matching the regular expression [a-zA-Z_][a-zA-Z_0-9-]* are valid blank node identifiers in all W3C-recommended RDF syntaxes (as of this document's publication).
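One possible mapping function, shown purely as an illustration, is to hex-encode the UTF-8 octets of the value and add a letter prefix. The prefix `b` and the function name are arbitrary choices of this sketch. Distinct values always yield distinct identifiers, and every result matches the regular expression given above.

```python
def blank_node_id(value):
    """Map an arbitrary string to an identifier matching [a-zA-Z_][a-zA-Z_0-9-]*.

    Hex-encoding the UTF-8 octets is injective; the 'b' prefix guarantees
    that the identifier starts with a letter even if the hex string starts
    with a digit or is empty.
    """
    return "b" + value.encode("utf-8").hex()
```

The encoding can be reversed with `bytes.fromhex(identifier[1:]).decode("utf-8")`, which is what makes the function a bijection between values and the identifiers it produces.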
RDF datasets may contain empty named graphs. R2RML cannot generate such output datasets.
This subsection describes the process of generating RDF triples from a triples map. This process adds RDF triples to the output dataset. Each generated triple is placed into one or more graphs of the output dataset.
The generated RDF triples are determined by the following algorithm. R2RML processors MAY use other means than implementing this algorithm to compute the generated RDF triples, as long as the result is the same.
1. Let sm be the subject map of the triples map
2. Let rows be the result of evaluating the effective SQL query of the triples map's logical table using the SQL connection
3. Let classes be the class IRIs of sm
4. Let sgm be the set of graph maps of sm
5. For each row in rows, apply the following steps:
   1. Let subject be the generated RDF term that results from applying sm to row
   2. Let subject_graphs be the set of the generated RDF terms that result from applying each term map in sgm to row
   3. For each class in classes, add triples to the output dataset as follows:
      - Subject: subject
      - Predicate: rdf:type
      - Object: class
      - Target graphs: If sgm is empty: rr:defaultGraph; otherwise: subject_graphs
   4. For each predicate-object map of the triples map, apply the following steps:
      1. Let predicates be the set of generated RDF terms that result from applying each of the predicate-object map's predicate maps to row
      2. Let objects be the set of generated RDF terms that result from applying each of the predicate-object map's object maps (but not referencing object maps) to row
      3. Let pogm be the set of graph maps of the predicate-object map
      4. Let predicate-object_graphs be the set of generated RDF terms that result from applying each graph map in pogm to row
      5. For each possible combination <predicate, object> where predicate is a member of predicates and object is a member of objects, add triples to the output dataset as follows:
         - Subject: subject
         - Predicate: predicate
         - Object: object
         - Target graphs: If sgm and pogm are empty: rr:defaultGraph; otherwise: union of subject_graphs and predicate-object_graphs
6. For each referencing object map of a predicate-object map of the triples map, apply the following steps:
   1. Let psm be the subject map of the parent triples map of the referencing object map
   2. Let pogm be the set of graph maps of the predicate-object map
   3. Let n be the number of columns in the logical table of the triples map
   4. Let rows be the result of evaluating the joint SQL query of the referencing object map
   5. For each row in rows, apply the following steps:
      1. Let child_row be the logical table row derived by taking the first n columns of row
      2. Let parent_row be the logical table row derived by taking all but the first n columns of row
      3. Let subject be the generated RDF term that results from applying sm to child_row
      4. Let predicates be the set of generated RDF terms that result from applying each of the predicate-object map's predicate maps to child_row
      5. Let object be the generated RDF term that results from applying psm to parent_row
      6. Let subject_graphs be the set of generated RDF terms that result from applying each graph map of sgm to child_row
      7. Let predicate-object_graphs be the set of generated RDF terms that result from applying each graph map in pogm to child_row
      8. For each predicate in predicates, add triples to the output dataset as follows:
         - Subject: subject
         - Predicate: predicate
         - Object: object
         - Target graphs: If neither sgm nor pogm has any graph maps: rr:defaultGraph; otherwise: union of subject_graphs and predicate-object_graphs
“Add triples to the output dataset” is a process that takes the following inputs:
- Subject, predicate and object of an RDF triple
- A set of target graphs

Execute the following steps:
- If the set of target graphs includes rr:defaultGraph, add the triple to the default graph of the output dataset.
- For each IRI in the set of target graphs that is not equal to rr:defaultGraph, add the triple to a named graph of that name in the output dataset. If the output dataset does not contain a named graph with that IRI, create it first.

RDF graphs cannot contain duplicate RDF triples. Placing multiple equal triples into the same graph has the same effect as placing the triple into the graph only once. Also note the scope of blank nodes.
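The core of the generation algorithm, without referencing object maps, can be sketched in a few dozen lines. Everything about the data model here is an assumption made for illustration: term maps are plain callables from a row to a term (or `None` when no term is generated), RDF terms are plain strings, the triples map is a dict with hypothetical keys, and the output dataset is a dict from graph name to a set of triples, with the key `rr:defaultGraph` standing for the default graph. The sketch also assumes that no triple is added when any of its terms is missing.

```python
from collections import defaultdict

DEFAULT_GRAPH = "rr:defaultGraph"


def add_triples(dataset, subject, predicate, obj, target_graphs):
    """Add one triple to every target graph; sets absorb duplicate triples."""
    if subject is None or predicate is None or obj is None:
        return                                  # assumed: no term, no triple
    for graph in target_graphs:
        if graph is not None:
            dataset[graph].add((subject, predicate, obj))


def generate_triples(triples_map, dataset=None):
    """Generate triples for one triples map (referencing object maps omitted)."""
    dataset = defaultdict(set) if dataset is None else dataset
    sm = triples_map["subject_map"]
    sgm = triples_map.get("subject_graph_maps", [])
    for row in triples_map["rows"]:             # rows of the effective SQL query
        subject = sm(row)
        subject_graphs = {gm(row) for gm in sgm}
        class_targets = subject_graphs if sgm else {DEFAULT_GRAPH}
        for cls in triples_map.get("classes", []):
            add_triples(dataset, subject, "rdf:type", cls, class_targets)
        for pom in triples_map.get("predicate_object_maps", []):
            predicates = {pm(row) for pm in pom["predicate_maps"]}
            objects = {om(row) for om in pom["object_maps"]}
            pogm = pom.get("graph_maps", [])
            po_graphs = {gm(row) for gm in pogm}
            if sgm or pogm:
                po_targets = subject_graphs | po_graphs
            else:
                po_targets = {DEFAULT_GRAPH}
            for predicate in predicates:
                for obj in objects:
                    add_triples(dataset, subject, predicate, obj, po_targets)
    return dataset
```

Modelling each graph as a set mirrors the statement above that adding equal triples more than once has no further effect. Extending the sketch to referencing object maps would mean iterating over the rows of the joint SQL query and splitting each row into its child and parent parts.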
Aterm map is a function that generates anRDF term from alogical table row. The result of that function can be:
NULL
value,Thegenerated RDF term of a term map for a given logical table row is determined as follows:
Theterm generation rules, applied to avalue, are as follows:
NULL
, then no RDF term is generated.rr:IRI
:rr:BlankNode
:rr:Literal
:This appendix lists some terms normatively defined in other specifications.
The following terms are defined inRDF Concepts and Abstract Syntax [RDF] and used in R2RML:
The following terms are defined inSPARQL Query Language for RDF [SPARQL] and used in R2RML:
This appendix lists all the classes, properties and other terms defined by this specification within theR2RML vocabulary.
An RDFS representation of the vocabulary is available from thenamespace IRI.
The following table lists allR2RML classes.
The third column contains minimum conditions that a resource has to fulfil in order to be considered member of the class. Where multiple conditions are listed, all must be fulfilled.
Class | Represents | Minimum conditions |
---|---|---|
rr:BaseTableOrView | SQL base table or view | Having an rr:tableName property |
rr:GraphMap | graph map | Being an rr:TermMap; being the value of an rr:graphMap property |
rr:Join | join condition | Having an rr:parent property; having an rr:child property |
rr:LogicalTable | logical table | Being one of its subclasses, rr:BaseTableOrView or rr:R2RMLView |
rr:ObjectMap | object map | Being an rr:TermMap; being the value of an rr:objectMap property |
rr:PredicateMap | predicate map | Being an rr:TermMap; being the value of an rr:predicateMap property |
rr:PredicateObjectMap | predicate-object map | Having at least one of rr:predicate and rr:predicateMap; having at least one of rr:object and rr:objectMap |
rr:R2RMLView | R2RML view | Having an rr:sqlQuery property |
rr:RefObjectMap | referencing object map | Having an rr:parentTriplesMap property |
rr:SubjectMap | subject map | Being an rr:TermMap; being the value of an rr:subjectMap property |
rr:TermMap | term map | Having exactly one of rr:constant, rr:column, rr:template |
rr:TriplesMap | triples map | Having an rr:logicalTable property; having exactly one of rr:subject and rr:subjectMap |
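The minimum conditions in this table can be read as a simple classification procedure. The following Python sketch is illustrative only: the function name `infer_classes` and the way a node is described (one set of property names the node has, one set of property names it is the value of) are assumptions for the example, and "exactly one of" is checked on property names only, ignoring how many values each property has.

```python
from typing import Set

TERM_PROPS = {"rr:constant", "rr:column", "rr:template"}
# Property whose value must be a term map -> class implied by that role.
ROLE_CLASSES = {
    "rr:graphMap": "rr:GraphMap",
    "rr:objectMap": "rr:ObjectMap",
    "rr:predicateMap": "rr:PredicateMap",
    "rr:subjectMap": "rr:SubjectMap",
}


def infer_classes(props: Set[str], value_of: Set[str]) -> Set[str]:
    """Return the R2RML classes whose minimum conditions a node fulfils."""
    classes: Set[str] = set()
    if "rr:tableName" in props:
        classes.add("rr:BaseTableOrView")
    if "rr:sqlQuery" in props:
        classes.add("rr:R2RMLView")
    if classes & {"rr:BaseTableOrView", "rr:R2RMLView"}:
        classes.add("rr:LogicalTable")
    if {"rr:parent", "rr:child"} <= props:
        classes.add("rr:Join")
    if "rr:parentTriplesMap" in props:
        classes.add("rr:RefObjectMap")
    if len(props & TERM_PROPS) == 1:
        classes.add("rr:TermMap")
        # Role-specific term map classes also require being a term map.
        for prop, cls in ROLE_CLASSES.items():
            if prop in value_of:
                classes.add(cls)
    if props & {"rr:predicate", "rr:predicateMap"} and props & {"rr:object", "rr:objectMap"}:
        classes.add("rr:PredicateObjectMap")
    if "rr:logicalTable" in props and len(props & {"rr:subject", "rr:subjectMap"}) == 1:
        classes.add("rr:TriplesMap")
    return classes
```

For instance, a node with only an `rr:column` property that is the value of an `rr:subjectMap` property would be classified as both `rr:TermMap` and `rr:SubjectMap`.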
As noted earlier, a single node in an R2RML mapping graph may represent multiple mapping components and thus be typed as several of these classes. However, the following classes are disjoint:
rr:TermMap and rr:RefObjectMap
rr:BaseTableOrView and rr:SQLQuery
Constant-valued term maps, column-valued term maps and template-valued term maps (that is, term maps having rr:constant, rr:column and rr:template, respectively)
The following table lists all properties in the R2RML vocabulary.
The cardinality column indicates how often this property occurs within its context. Note that additional constraints not stated in this table might apply. The actual cardinality of some properties may depend on the presence or absence of other properties, and their values. Properties where this applies are indicated by an exclamation mark.
Term | Denotes | Used with property |
---|---|---|
rr:defaultGraph | default graph | rr:graph |
rr:SQL2008 | Core SQL 2008 | rr:sqlVersion |
rr:IRI | IRI | rr:termType |
rr:BlankNode | blank node | rr:termType |
rr:Literal | literal | rr:termType |
The Editors would like to give special thanks to the following contributors: David McNeil greatly improved the quality of the specification with detailed reviews and comments. Nuno Lopes and Eric Prud'hommeaux contributed to the design of the mapping from SQL data values to RDF literals. Eric also worked on the mechanism for SQL compatibility. Boris Villazón-Terrazas drew the diagrams throughout the text, and kept them up-to-date throughout many iterations.
In addition, the Editors gratefully acknowledge contributions from: Marcelo Arenas, Sören Auer, Samir Batla, Alexander de Leon, Orri Erling, Lee Feigenbaum, Enrico Franconi, Howard Greenblatt, Wolfgang Halb, Harry Halpin, Michael Hausenblas, Patrick Hayes, Ivan Herman, Nophadol Jekjantuk, Li Ma, Nan Ma, Ashok Malhotra, Ivan Mikhailov, Percy Enrique Rivera Salas, Juan Sequeda, Ben Szekely, Ted Thibodeau, and Edward Thomas.