W3C

R2RML: RDB to RDF Mapping Language

W3C Recommendation 27 September 2012

This version:
http://www.w3.org/TR/2012/REC-r2rml-20120927/
Latest version:
http://www.w3.org/TR/r2rml/
Previous version:
http://www.w3.org/TR/2012/PR-r2rml-20120814/
Editors:
Souripriya Das, Oracle
Seema Sundara, Oracle
Richard Cyganiak, DERI, National University of Ireland, Galway

Please refer to the errata for this document, which may include some normative corrections.

See also translations.

Copyright © 2012 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.


Abstract

This document describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice. R2RML mappings are themselves RDF graphs and written down in Turtle syntax. R2RML enables different types of mapping implementations. Processors could, for example, offer a virtual SPARQL endpoint over the mapped relational data, or generate RDF dumps, or offer a Linked Data interface.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.

This document was published by the RDB2RDF Working Group. Comments on this document should be sent to public-rdb2rdf-comments@w3.org, a mailing list with a public archive. The following related documents have been made available:

This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.


1 Introduction

This specification describes R2RML, a language for expressing customized mappings from relational databases to RDF datasets. Such mappings provide the ability to view existing relational data in the RDF data model, expressed in a structure and target vocabulary of the mapping author's choice.

This specification has a companion that defines a direct mapping from relational databases to RDF [DM]. In the direct mapping of a database, the structure of the resulting RDF graph directly reflects the structure of the database, the target RDF vocabulary directly reflects the names of database schema elements, and neither structure nor target vocabulary can be changed. With R2RML on the other hand, a mapping author can define highly customized views over the relational data.

Every R2RML mapping is tailored to a specific database schema and target vocabulary. The input to an R2RML mapping is a relational database that conforms to that schema. The output is an RDF dataset [SPARQL], as defined in SPARQL, that uses predicates and types from the target vocabulary. The mapping is conceptual; R2RML processors are free to materialize the output data, or to offer virtual access through an interface that queries the underlying database, or to offer any other means of providing access to the output RDF dataset.

R2RML mappings are themselves expressed as RDF graphs and written down in Turtle syntax [TURTLE].

The intended audience of this specification is implementors of software that generates or processes R2RML mapping documents, as well as mapping authors looking for a reference to the R2RML language constructs. The document uses concepts from RDF Concepts and Abstract Syntax [RDF] and from the SQL language specifications [SQL1][SQL2]. A reader's familiarity with the contents of these documents, as well as with the Turtle syntax, is assumed.

The R2RML language is designed to meet the use cases and requirements identified in Use Cases and Requirements for Mapping Relational Databases to RDF [UCNR].

1.1 Document Conventions

In this document, examples assume the following namespace prefix bindings unless otherwise stated:

Prefix | IRI
rr:    | http://www.w3.org/ns/r2rml#
rdf:   | http://www.w3.org/1999/02/22-rdf-syntax-ns#
rdfs:  | http://www.w3.org/2000/01/rdf-schema#
xsd:   | http://www.w3.org/2001/XMLSchema#
ex:    | http://example.com/ns#

Throughout the document, boxes containing Turtle markup and SQL data will appear. These boxes are color-coded. Gray boxes contain RDFS definitions of R2RML vocabulary terms:

# This box contains RDFS definitions of R2RML vocabulary terms

Yellow boxes contain example fragments of R2RML mappings in Turtle syntax:

# This box contains example R2RML mappings

Blue tables contain example input into an R2RML mapping:

EXAMPLE
ID (INTEGER PRIMARY KEY) | DESC (VARCHAR(100))
1 | This is an example input table.
2 | The table name is EXAMPLE.
3 | It has six rows.
4 | It has two columns, ID and DESC.
5 | ID is the table's primary key and of type INTEGER.
6 | DESC is of type VARCHAR(100)

Green boxes contain example output:

# This box contains example output RDF triples or fragments

2 R2RML Overview and Example (Informative)

This section gives a brief overview of the R2RML mapping language, followed by a simple example relational database with an R2RML mapping document and its output RDF. Further R2RML examples can be found in the R2RML and Direct Mapping Test Cases [TC].

An R2RML mapping refers to logical tables to retrieve data from the input database. A logical table can be one of the following:

  1. A base table,
  2. a view, or
  3. a valid SQL query (called an “R2RML view” because it emulates a SQL view without modifying the database).

Each logical table is mapped to RDF using a triples map. The triples map is a rule that maps each row in the logical table to a number of RDF triples. The rule has two main parts:

  1. A subject map that generates the subject of all RDF triples that will be generated from a logical table row. The subjects often are IRIs that are generated from the primary key column(s) of the table.
  2. Multiple predicate-object maps that in turn consist of predicate maps and object maps (or referencing object maps).

Triples are produced by combining the subject map with a predicate map and object map, and applying these three to each logical table row. For example, the complete rule for generating a set of triples might be:

By default, all RDF triples are in the default graph of the output dataset. A triples map can contain graph maps that place some or all of the triples into named graphs instead.

Figure 1: An overview of R2RML

2.1 Example Input Database

The following example database consists of two tables, EMP and DEPT, with one row each:

EMP
EMPNO (INTEGER PRIMARY KEY) | ENAME (VARCHAR(100)) | JOB (VARCHAR(20)) | DEPTNO (INTEGER REFERENCES DEPT (DEPTNO))
7369 | SMITH | CLERK | 10

DEPT
DEPTNO (INTEGER PRIMARY KEY) | DNAME (VARCHAR(30)) | LOC (VARCHAR(100))
10 | APPSERVER | NEW YORK

2.2 Desired RDF Output

The desired RDF triples to be produced from this database are as follows:

<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".
<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.

<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:name "APPSERVER".
<http://data.example.com/department/10> ex:location "NEW YORK".
<http://data.example.com/department/10> ex:staff 1.

Note in particular:

2.3 Example: Mapping a Simple Table

The following partial R2RML mapping document will produce the desired triples from the EMP table (except the ex:department triple, which will be added later):

@prefix rr: <http://www.w3.org/ns/r2rml#>.
@prefix ex: <http://example.com/ns#>.

<#TriplesMap1>
    rr:logicalTable [ rr:tableName "EMP" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
        rr:class ex:Employee;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "ENAME" ];
    ].

<http://data.example.com/employee/7369> rdf:type ex:Employee.
<http://data.example.com/employee/7369> ex:name "SMITH".

2.4 Example: Computing a Property with an R2RML View

Next, the DEPT table needs to be mapped. Instead of using the table directly as the basis for that mapping, an “R2RML view” will be defined based on a SQL query. This allows computation of the staff number. (Alternatively, one could define this view directly in the database.)

<#DeptTableView> rr:sqlQuery """
SELECT DEPTNO,
       DNAME,
       LOC,
       (SELECT COUNT(*) FROM EMP WHERE EMP.DEPTNO=DEPT.DEPTNO) AS STAFF
FROM DEPT;
""".

The definition of a triples map that generates the desired DEPT triples based on this R2RML view follows.

<#TriplesMap2>
    rr:logicalTable <#DeptTableView>;
    rr:subjectMap [
        rr:template "http://data.example.com/department/{DEPTNO}";
        rr:class ex:Department;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:staff;
        rr:objectMap [ rr:column "STAFF" ];
    ].

<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:name "APPSERVER".
<http://data.example.com/department/10> ex:location "NEW YORK".
<http://data.example.com/department/10> ex:staff 1.

2.5 Example: Linking Two Tables

To complete the mapping document, the ex:department triples need to be generated. Their subjects come from the first triples map (<#TriplesMap1>), the objects come from the second triples map (<#TriplesMap2>).

This can be achieved by adding another rr:predicateObjectMap to <#TriplesMap1>. This one uses the other triples map, <#TriplesMap2>, as a parent triples map:

<#TriplesMap1>
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [
            rr:parentTriplesMap <#TriplesMap2>;
            rr:joinCondition [
                rr:child "DEPTNO";
                rr:parent "DEPTNO";
            ];
        ];
    ].

This performs a join between the EMP table and the R2RML view, on the DEPTNO columns. The objects will be generated from the subject map of the parent triples map, yielding the desired triple:

<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.

This completes the R2RML mapping document. An R2RML processor will generate the triples listed above from this mapping document.
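The join described in this section can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the rows are hard-coded dictionaries standing in for the EMP table and the R2RML view, the helper name `join_rows` is hypothetical, and a real processor would normally push the join down to the database as SQL.

```python
def join_rows(child_rows, parent_rows, child_column, parent_column):
    """Yield (child, parent) pairs whose join columns hold equal values."""
    for child in child_rows:
        for parent in parent_rows:
            if child[child_column] == parent[parent_column]:
                yield child, parent

# Stand-ins for the EMP table (child) and <#DeptTableView> (parent).
emp = [{"EMPNO": 7369, "ENAME": "SMITH", "JOB": "CLERK", "DEPTNO": 10}]
dept_view = [{"DEPTNO": 10, "DNAME": "APPSERVER", "LOC": "NEW YORK", "STAFF": 1}]

# Subject comes from the child's subject template, object from the parent's.
triples = [
    (
        "<http://data.example.com/employee/%d>" % child["EMPNO"],
        "ex:department",
        "<http://data.example.com/department/%d>" % parent["DEPTNO"],
    )
    for child, parent in join_rows(emp, dept_view, "DEPTNO", "DEPTNO")
]

for s, p, o in triples:
    print(s, p, o, ".")
```

The object of each triple is produced by the parent's subject template, which is why the department IRI matches the subjects generated by <#TriplesMap2>.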

2.6 Example: Many-to-Many Tables

The following example assumes that a many-to-many relationship exists between extended versions of the EMP and DEPT tables. This relationship is captured by the contents of the EMP2DEPT table. The database, consisting of the EMP, DEPT, and EMP2DEPT tables, is shown below:

EMP
EMPNO (INTEGER PRIMARY KEY) | ENAME (VARCHAR(100)) | JOB (VARCHAR(20))
7369 | SMITH | CLERK
7369 | SMITH | NIGHTGUARD
7400 | JONES | ENGINEER

DEPT
DEPTNO (INTEGER PRIMARY KEY) | DNAME (VARCHAR(30)) | LOC (VARCHAR(100))
10 | APPSERVER | NEW YORK
20 | RESEARCH | BOSTON

EMP2DEPT (PRIMARY KEY (EMPNO, DEPTNO))
EMPNO (INTEGER REFERENCES EMP (EMPNO)) | DEPTNO (INTEGER REFERENCES DEPT (DEPTNO))
7369 | 10
7369 | 20
7400 | 10

<http://data.example.com/employee=7369/department=10>
    ex:employee <http://data.example.com/employee/7369> ;
    ex:department <http://data.example.com/department/10> .

<http://data.example.com/employee=7369/department=20>
    ex:employee <http://data.example.com/employee/7369> ;
    ex:department <http://data.example.com/department/20> .

<http://data.example.com/employee=7400/department=10>
    ex:employee <http://data.example.com/employee/7400> ;
    ex:department <http://data.example.com/department/10> .

The following R2RML mapping will produce the desired triples listed above:

<#TriplesMap3>
    rr:logicalTable [ rr:tableName "EMP2DEPT" ];
    rr:subjectMap [ rr:template "http://data.example.com/employee={EMPNO}/department={DEPTNO}" ];
    rr:predicateObjectMap [
        rr:predicate ex:employee;
        rr:objectMap [ rr:template "http://data.example.com/employee/{EMPNO}" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    ].

However, if one does not require that the subjects in the desired output uniquely identify the rows in the EMP2DEPT table, the desired output may look as follows:

<http://data.example.com/employee/7369>
    ex:department <http://data.example.com/department/10> ;
    ex:department <http://data.example.com/department/20> .

<http://data.example.com/employee/7400>
    ex:department <http://data.example.com/department/10>.

The following R2RML mapping will produce the desired triples:

<#TriplesMap3>
    rr:logicalTable [ rr:tableName "EMP2DEPT" ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:department;
        rr:objectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    ].

2.7 Example: Translating database type codes to IRIs

Sometimes, database columns contain codes that need to be translated into IRIs, but a direct syntactic translation using string templates is not possible. For example, consider a JOB column in the EMP table with the following possible values, and IRIs corresponding to those database values in the RDF output:

Value | Corresponding RDF IRI
CLERK | http://data.example.com/roles/general-office
NIGHTGUARD | http://data.example.com/roles/security
ENGINEER | http://data.example.com/roles/engineering

The IRIs are not found in the original database and therefore the mapping from database codes to IRIs has to be specified in the R2RML mapping. Such translations can be achieved using an “R2RML view”. The view is defined based on a SQL query that computes the IRI based on the database value. SQL's CASE statement is convenient for this purpose. (Alternatively, one could define this view directly in the database.)

<#TriplesMap1>
    rr:logicalTable [ rr:sqlQuery """
        SELECT EMP.*, (CASE JOB
            WHEN 'CLERK' THEN 'general-office'
            WHEN 'NIGHTGUARD' THEN 'security'
            WHEN 'ENGINEER' THEN 'engineering'
        END) ROLE FROM EMP
        """ ];
    rr:subjectMap [
        rr:template "http://data.example.com/employee/{EMPNO}";
    ];
    rr:predicateObjectMap [
        rr:predicate ex:role;
        rr:objectMap [ rr:template "http://data.example.com/roles/{ROLE}" ];
    ].

With the example input database, this mapping would yield the following triple:

<http://data.example.com/employee/7369> ex:role <http://data.example.com/roles/general-office>.

3 Conformance

As well as sections marked as non-normative in the section heading, all diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words must, must not, required, should, should not, recommended, may, and optional in this specification are to be interpreted as described in RFC 2119 [RFC2119].

This specification describes conformance criteria for:

A collection of test cases for R2RML processors and R2RML data validators is available in the R2RML and Direct Mapping Test Cases [TC].

This specification defines R2RML for databases that conform to Core SQL 2008, as defined in ISO/IEC 9075-1:2008 [SQL1] and ISO/IEC 9075-2:2008 [SQL2]. Processors and mappings may have to deviate from the R2RML specification in order to support databases that do not conform to this version of SQL.

Where SQL queries are embedded into R2RML mappings, SQL version identifiers can be used to indicate the specific version of SQL that is being used.

4 R2RML Processors and Mapping Documents

An R2RML mapping defines a mapping from a relational database to RDF. It is a structure that consists of one or more triples maps.

The input to an R2RML mapping is called the input database.

An R2RML processor is a system that, given an R2RML mapping and an input database, provides access to the output dataset.

There are no constraints on the method of access to the output dataset provided by a conforming R2RML processor. An R2RML processor MAY materialize the output dataset into a file, or offer virtual access through an interface that queries the input database, or offer any other means of providing access to the output dataset.

An R2RML processor also has access to an execution environment consisting of:

The SQL connection is used by the R2RML processor to evaluate SQL queries against the input database. It MUST be established with sufficient privileges for read access to all base tables and views that are referenced in the R2RML mapping. It MUST be configured with a default catalog and default schema that will be used when tables and views are accessed without an explicit catalog or schema reference.

How the SQL connection is established, or how users are authenticated against the database, is outside of the scope of this document.

The base IRI MUST be a valid IRI. It SHOULD NOT contain question mark (“?”) or hash (“#”) characters and SHOULD end in a slash (“/”) character.

To obtain an absolute IRI from a relative IRI, the term generation rules of R2RML use simple string concatenation, rather than the more complex algorithm for resolution of relative URIs defined in Section 5.2 of [RFC3986]. This ensures that the original database value can be reconstructed from the generated absolute IRI. Both algorithms are equivalent if all of the following are true:

  1. The base IRI does not contain question marks or hashes,
  2. the base IRI ends in a slash,
  3. the relative IRI does not start with a slash, and
  4. the relative IRI does not contain any “.” or “..” path segments.
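The difference between the two algorithms can be tried out with a short Python sketch. It uses `urllib.parse.urljoin` from the standard library as a stand-in for the RFC 3986 resolution algorithm; the base IRI `http://example.com/base/` and the helper names are assumptions made for illustration only.

```python
from urllib.parse import urljoin

def concat_iri(base_iri, relative_iri):
    """R2RML-style resolution: plain string concatenation."""
    return base_iri + relative_iri

def algorithms_agree(base_iri, relative_iri):
    """True if concatenation and reference resolution give the same IRI."""
    return concat_iri(base_iri, relative_iri) == urljoin(base_iri, relative_iri)

base = "http://example.com/base/"  # hypothetical base IRI

# All four conditions hold, so both algorithms agree.
print(algorithms_agree(base, "employee/7369"))

# Each of these breaks one condition, and the results diverge.
print(algorithms_agree(base, "../employee/7369"))                      # ".." segment
print(algorithms_agree(base, "/employee/7369"))                        # leading slash
print(algorithms_agree("http://example.com/base", "employee/7369"))    # no trailing slash
print(algorithms_agree("http://example.com/base/#x", "employee/7369")) # hash in base
```

With concatenation, the relative part always appears unchanged at the end of the result, which is what allows the original database value to be recovered.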

An R2RML data validator is a system that takes as its input an R2RML mapping, a base IRI, and a SQL connection to an input database, and checks for the presence of data errors. When checking the input database, a data validator MUST report any data errors that are raised in the process of generating the output dataset.

An R2RML processor MAY include an R2RML data validator, but this is not required.

4.1 Mapping Graphs and the R2RML Vocabulary

An R2RML mapping is represented as an RDF graph. In other words, RDF is used not just as the target data model of the mapping, but also as a formalism for representing the R2RML mapping itself.

An RDF graph that represents an R2RML mapping is called an R2RML mapping graph.

The R2RML vocabulary is the set of IRIs defined in this specification that start with the rr: namespace IRI:

http://www.w3.org/ns/r2rml#

An R2RML mapping graph:

The R2RML vocabulary also includes the following R2RML classes:

The members of these classes are collectively called mapping components.

Many of these classes differ only in capitalization from properties in the R2RML vocabulary.

Explicit typing of the resources in a mapping graph with R2RML classes is OPTIONAL and has no effect on the behaviour of an R2RML processor. The mapping component represented by any given resource in a mapping graph is defined by the presence or absence of certain properties, as defined throughout this specification. A resource SHOULD NOT be typed as an R2RML class if it does not meet the definition of that class.

4.2 RDF-based Turtle Syntax; Media Type

An R2RML mapping document is any document written in the Turtle [TURTLE] RDF syntax that encodes an R2RML mapping graph.

The media type for R2RML mapping documents is the same as for Turtle documents in general: text/turtle. The content encoding of Turtle content is always UTF-8 and the charset parameter on the media type SHOULD always be used: text/turtle;charset=utf-8. The file extension .ttl SHOULD be used.

A conforming R2RML processor SHOULD accept R2RML mapping documents in Turtle syntax. It MAY accept R2RML mapping graphs encoded in other RDF syntaxes.

4.3 Data Errors

A data error is a condition of the data in the input database that would lead to the generation of an invalid RDF term. The following conditions give rise to data errors:

  1. A term map with term type rr:IRI results in the generation of an invalid IRI.
  2. A term map whose natural RDF datatype is overridden with a specified datatype produces an ill-typed literal (see datatype-override RDF literal).

When providing access to the output dataset, an R2RML processor MUST abort any operation that requires inspecting or returning an RDF term whose generation would give rise to a data error, and report an error to the agent invoking the operation. A conforming R2RML processor MAY, however, allow other operations that do not require inspecting or returning these RDF terms, and thus MAY provide partial access to an output dataset that contains data errors. Nevertheless, an R2RML processor SHOULD report data errors as early as possible.

The presence of data errors does not make an R2RML mapping non-conforming.

Data errors cannot generally be detected by analyzing the table schema of the database, but only by scanning the data in the tables. For large and rapidly changing databases, this can be impractical. Therefore, R2RML processors are allowed to answer queries that do not “touch” a data error, and the behavior of such operations is well-defined. For the same reason, the conformance of R2RML mappings is defined without regard for the presence of data errors.

R2RML data validators can be used to explicitly scan a database for data errors.

4.4 Default Mappings

An R2RML processor MAY include an R2RML default mapping generator. This is a facility that introspects the schema of the input database and generates an R2RML mapping, possibly in the form of an R2RML mapping document, intended for further customization by a mapping author. Such a mapping is known as a default mapping.

The default mapping SHOULD be such that its output is the Direct Graph [DM] corresponding to the input database.

Duplicate row preservation: For tables without a primary key, the Direct Graph requires that a fresh blank node is created for each row. This ensures that duplicate rows in such tables are preserved. This requirement is relaxed for R2RML default mappings: They MAY reuse the same blank node for multiple duplicate rows. This behaviour does not preserve duplicate rows. R2RML default mapping generators that provide default mappings based on the Direct Graph MUST document whether the generated default mapping preserves duplicate rows or not.

5 Defining Logical Tables

Figure 2: The properties of logical tables

A logical table is a tabular SQL query result that is to be mapped to RDF triples. A logical table is either

Every logical table has an effective SQL query that, if executed over the SQL connection, produces as its result the contents of the logical table.

A logical table row is a row in a logical table.

A column name is the name of a column of a logical table. A column name MUST be a valid SQL identifier. Column names do not include any qualifying table, view or schema names.

A SQL identifier is the name of a SQL object, such as a column, table, view, schema, or catalog. A SQL identifier MUST match the <identifier> production in [SQL2]. When comparing identifiers for equality, the comparison rules of [SQL2] MUST be used.

An informative summary of SQL identifier syntax rules:
  1. SQL identifiers can be delimited identifiers (with double quotes), or regular identifiers.
  2. Regular identifiers must start with a Unicode character from any of the following character classes: upper-case letter, lower-case letter, title-case letter, modifier letter, other letter, or letter number. Subsequent characters may be any of these, or a nonspacing mark, spacing combining mark, decimal number, connector punctuation, and formatting code.
  3. Regular identifiers are case-insensitive.
  4. Delimited identifiers can contain any character.
  5. A double-quote character inside a delimited identifier is escaped by appending a second double-quote character.
  6. Delimited identifiers are case-sensitive.
  7. deptno and "deptno" are not equivalent (delimited identifiers that are not all-upper-case are not equivalent to any undelimited identifiers).
  8. DEPTNO and "DEPTNO" are equivalent (all-upper-case delimited and undelimited identifiers are equivalent).
  9. Five examples of valid column names: deptno, dept_no, "dept_no", "Department Number", "Identifier ""with quotes""".
Note that Turtle string syntax requires escaping of double quotes with a backslash, so the identifiers from the list above might be written like this if occurring inside an R2RML mapping document:
[] rr:column "deptno".
[] rr:column "dept_no".
[] rr:column "\"dept_no\"".
[] rr:column "\"Department Number\"".
[] rr:column "\"Identifier \"\"with quotes\"\"\"".
These rules are for Core SQL 2008. See Section 3, Conformance regarding databases that do not conform to this version of SQL.
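The equivalence rules in the informative summary can be approximated in a few lines of Python. This is a simplified sketch, not the comparison algorithm of [SQL2]: it assumes the input is already a syntactically valid identifier, uses Python's `str.upper()` for case folding, and the function names are hypothetical.

```python
def normalize_identifier(identifier):
    """Return a comparison key for a SQL identifier (informal sketch)."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Delimited identifier: drop the outer quotes, unescape doubled
        # quotes, and keep the case exactly as written.
        return identifier[1:-1].replace('""', '"')
    # Regular identifier: case-insensitive, so fold to upper case.
    return identifier.upper()

def same_identifier(a, b):
    """True if two identifiers are equivalent under the summary rules."""
    return normalize_identifier(a) == normalize_identifier(b)

print(same_identifier("deptno", '"deptno"'))  # rule 7
print(same_identifier("DEPTNO", '"DEPTNO"'))  # rule 8
print(normalize_identifier('"Identifier ""with quotes"""'))
```

Folding regular identifiers to upper case is what makes an all-upper-case delimited identifier equivalent to its undelimited form, while a lower-case delimited identifier matches no undelimited one.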

5.1 Base Tables and SQL Views (rr:tableName)

A SQL base table or view is a logical table containing SQL data from a base table or view in the input database. A SQL base table or view is represented by a resource that has exactly one rr:tableName property.

The value of rr:tableName specifies the table or view name of the base table or view. Its value MUST be a valid schema-qualified name that names an existing base table or view in the input database.

A schema-qualified name is a sequence of one, two or three valid SQL identifiers, separated by the dot character (“.”). The three identifiers name, respectively, a catalog, a schema, and a table or view. If no catalog or schema is specified, then the default catalog and default schema of the SQL connection are assumed.

The effective SQL query of a SQL base table or view is:

SELECT * FROM {table}

with {table} replaced with the table or view name.

The following example shows a logical table specified using a schema-qualified table name.

[] rr:tableName "SCOTT.DEPT".

The following example shows a logical table specified using an unqualified table name. The SQL connection's default schema will be used.

[] rr:tableName "DEPT".
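As an illustration of how a value of rr:tableName turns into an effective SQL query, the following Python sketch splits a schema-qualified name on dots outside double quotes and then builds the query. The function names are hypothetical, and the sketch only checks the number of parts; it does not validate each part against the <identifier> production or check that the table exists.

```python
def split_qualified_name(name):
    """Split a schema-qualified name on dots that are outside double quotes."""
    parts, current, in_quotes = [], "", False
    for ch in name:
        if ch == '"':
            # Toggling also handles escaped quotes (""), which flip twice.
            in_quotes = not in_quotes
            current += ch
        elif ch == "." and not in_quotes:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    if not 1 <= len(parts) <= 3 or any(p == "" for p in parts):
        raise ValueError("not a schema-qualified name: %r" % name)
    return parts

def effective_sql_query(table_name):
    """Build the effective SQL query for a SQL base table or view."""
    split_qualified_name(table_name)  # raises ValueError on malformed names
    return "SELECT * FROM " + table_name

print(split_qualified_name("SCOTT.DEPT"))
print(effective_sql_query("DEPT"))
```

A name with fewer than three parts leaves the catalog, or the catalog and schema, to the defaults of the SQL connection.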

5.2 R2RML Views (rr:sqlQuery, rr:sqlVersion)

An R2RML view is a logical table whose contents are the result of executing a SQL query against the input database. It is represented by a resource that has exactly one rr:sqlQuery property, whose value is a literal with a lexical form that is a valid SQL query.

R2RML mappings sometimes require data transformation, computation, or filtering before generating triples from the database. This can be achieved by defining a SQL view in the input database and referring to it with rr:tableName. However, this approach may sometimes not be practical for lack of database privileges or other reasons. R2RML views achieve the same effect without requiring changes to the input database.

Note that unlike “real” SQL views, an R2RML view cannot be used as an input table in further SQL queries.

A SQL query is a SELECT query in the SQL language that can be executed over the input database. The string MUST conform to the production <direct select statement: multiple rows> in [SQL2] with an OPTIONAL trailing semicolon character and OPTIONAL surrounding white space (excluding comments) as defined in [TURTLE]. It MUST be valid to execute over the SQL connection. The result of the query execution MUST NOT have duplicate column names. Any columns in the SELECT list derived by projecting an expression SHOULD be named, because otherwise they cannot be reliably referenced in the rest of the mapping.

Database objects referenced in the SQL query MAY be qualified with a catalog or schema name. For any database objects referenced without an explicit catalog name or schema name, the default catalog and default schema of the SQL connection are assumed.

For example, the following SELECT query is not a valid R2RML SQL query because the result contains a duplicate column name DEPTNO:

SELECT EMP.DEPTNO, 1 AS DEPTNO FROM EMP;

As a further example, the following SELECT query SHOULD NOT be used, because it contains an unnamed column derived from a COUNT expression:

SELECT DEPTNO, COUNT(EMPNO) FROM EMP GROUP BY DEPTNO;
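A processor or mapping author could detect the duplicate-column problem by executing a query and inspecting the result column names. The sketch below does this with Python's built-in `sqlite3` module purely for convenience; SQLite is an assumption here and not a Core SQL 2008 database, the function name is hypothetical, and all result names are compared as if they were regular (case-insensitive) identifiers.

```python
import sqlite3

def column_name_problems(connection, query):
    """Run a query and return (column names, names that occur more than once)."""
    cursor = connection.execute(query)
    names = [description[0] for description in cursor.description]
    seen, duplicates = set(), []
    for name in names:
        key = name.upper()  # simplification: treat names as regular identifiers
        if key in seen:
            duplicates.append(name)
        seen.add(key)
    return names, duplicates

# In-memory stand-in for the example EMP table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, JOB TEXT, DEPTNO INTEGER)"
)
conn.execute("INSERT INTO EMP VALUES (7369, 'SMITH', 'CLERK', 10)")

names, duplicates = column_name_problems(
    conn, "SELECT EMP.DEPTNO, 1 AS DEPTNO FROM EMP"
)
print(names, duplicates)

# Naming the derived column makes it referenceable from the mapping.
fixed_names, fixed_duplicates = column_name_problems(
    conn, "SELECT DEPTNO, COUNT(EMPNO) AS STAFF FROM EMP GROUP BY DEPTNO"
)
print(fixed_names, fixed_duplicates)
```

The second query names the derived column with AS, so a term map can refer to it as STAFF instead of relying on whatever name the database assigns to the unnamed expression.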

An R2RML view MAY have one or more SQL version identifiers. They MUST be valid IRIs and are represented as values of the rr:sqlVersion property. The following SQL version identifier indicates that the SQL query conforms to Core SQL 2008:

http://www.w3.org/ns/r2rml#SQL2008

The absence of a SQL version identifier indicates that no claim to Core SQL 2008 conformance is made.

No further identifiers besides rr:SQL2008 are defined in this specification. The RDB2RDF Working Group intends to maintain a non-normative list of identifiers for other SQL versions [SQLIRIS].

The effective SQL query of an R2RML view is the value of its rr:sqlQuery property.

The following example shows a logical table specified as an R2RML view conforming to Core SQL 2008.

[] rr:sqlQuery """
        Select ('Department' || DEPTNO) AS DEPTID
             , DEPTNO
             , DNAME
             , LOC
          from SCOTT.DEPT
    """;
    rr:sqlVersion rr:SQL2008.

6 Mapping Logical Tables to RDF with Triples Maps

Figure 3: The properties of triples maps

A triples map specifies a rule for translating each row of a logical table to zero or more RDF triples.

The RDF triples generated from one row in the logical table all share the same subject.

A triples map is represented by a resource that references the following other resources:

The referenced columns of all term maps of a triples map (subject map, predicate maps, object maps, graph maps) MUST be column names that exist in the term map's logical table.

The following example shows a triples map including its logical table, subject map, and two predicate-object maps.

[]
    rr:logicalTable [ rr:tableName "DEPT" ];
    rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "DNAME" ];
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:column "LOC" ];
    ].
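To make the row-by-row nature of a triples map concrete, here is a minimal Python sketch of the example above. It is a simplification under stated assumptions: the DEPT rows are a hard-coded list, the function names are hypothetical, every object is rendered as a plain literal, and details such as IRI-safe escaping of template values, NULL handling and datatypes are left out.

```python
import re

def expand_template(template, row):
    """Replace each {COLUMN} placeholder with the row's value for that column."""
    return re.sub(r"\{([^{}]+)\}", lambda m: str(row[m.group(1)]), template)

def apply_triples_map(rows, subject_template, predicate_object_columns):
    """Yield (subject, predicate, object) tuples for every logical table row."""
    for row in rows:
        # All triples generated from one row share this subject.
        subject = "<" + expand_template(subject_template, row) + ">"
        for predicate, column in predicate_object_columns:
            yield (subject, predicate, '"%s"' % row[column])

# Stand-in for the DEPT logical table.
dept_rows = [{"DEPTNO": 10, "DNAME": "APPSERVER", "LOC": "NEW YORK"}]

triples = list(apply_triples_map(
    dept_rows,
    "http://data.example.com/department/{DEPTNO}",
    [("ex:name", "DNAME"), ("ex:location", "LOC")],
))

for s, p, o in triples:
    print(s, p, o, ".")
```

Each entry in the predicate-object list plays the role of one rr:predicateObjectMap with a constant predicate and a column-valued object map.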

6.1 Creating Resources with Subject Maps

A subject map is a term map. It specifies a rule for generating the subjects of the RDF triples generated by a triples map.

6.2 Typing Resources (rr:class)

A subject map MAY have one or more class IRIs. They are represented by the rr:class property. The values of the rr:class property MUST be IRIs. For each RDF term generated by the subject map, RDF triples with predicate rdf:type and the class IRI as object will be generated.

Mappings where the class IRI is not constant, but needs to be computed based on the contents of the database, can be achieved by defining a predicate-object map with predicate rdf:type and a non-constant object map.

In the following example, the generated subject will be asserted as an instance of the ex:Employee class.

[] rr:template "http://data.example.com/employee/{EMPNO}";
    rr:class ex:Employee.

Using the example EMP table, the following RDF triple will be generated:

<http://data.example.com/employee/7369> rdf:type ex:Employee.

6.3 Creating Properties and Values with Predicate-Object Maps

A predicate-object map is a function that creates one or more predicate-object pairs for each logical table row of a logical table. It is used in conjunction with a subject map to generate RDF triples in a triples map.

A predicate-object map is represented by a resource that references the following other resources:

A predicate map is a term map.

An object map is a term map.

7 Creating RDF Terms with Term Maps

Figure 4: The properties of term maps

An RDF term is either an IRI, or a blank node, or a literal.

A term map is a function that generates an RDF term from a logical table row. The result of that function is known as the term map's generated RDF term.

Term maps are used to generate the subjects, predicates and objects of the RDF triples that are generated by a triples map. Consequently, there are several kinds of term maps, depending on where in the mapping they occur: subject maps, predicate maps, object maps and graph maps.

A term map MUST be exactly one of the following: a constant-valued term map, a column-valued term map, or a template-valued term map.

The referenced columns of a term map are the set of column names referenced in the term map and depend on the type of term map.

7.1 Constant RDF Terms (rr:constant)

A constant-valued term map is a term map that ignores the logical table row and always generates the same RDF term. A constant-valued term map is represented by a resource that has exactly one rr:constant property.

The constant value of a constant-valued term map is the RDF term that is the value of its rr:constant property.

If the constant-valued term map is a subject map, predicate map or graph map, then its constant value MUST be an IRI.

If the constant-valued term map is an object map, then its constant value MUST be an IRI or literal.

The referenced columns of a constant-valued term map is the empty set.

Constant-valued term maps can be expressed more concisely using the constant shortcut properties rr:subject, rr:predicate, rr:object and rr:graph. Occurrences of these properties MUST be treated exactly as if the following triples were present in the mapping graph instead:

| Triple involving constant shortcut property | Replacement triples |
|---|---|
| ?x rr:subject ?y. | ?x rr:subjectMap [ rr:constant ?y ]. |
| ?x rr:predicate ?y. | ?x rr:predicateMap [ rr:constant ?y ]. |
| ?x rr:object ?y. | ?x rr:objectMap [ rr:constant ?y ]. |
| ?x rr:graph ?y. | ?x rr:graphMap [ rr:constant ?y ]. |
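The replacement rule in the table above can be sketched in Python. This is only an illustration, not part of R2RML: it assumes the mapping graph is held as a list of (subject, predicate, object) string tuples, and the helper name `expand_shortcuts` and the `_:tmN` blank node labels are hypothetical.

```python
# Shortcut property -> the term map property it stands for (from the table above).
SHORTCUTS = {
    "rr:subject": "rr:subjectMap",
    "rr:predicate": "rr:predicateMap",
    "rr:object": "rr:objectMap",
    "rr:graph": "rr:graphMap",
}


def expand_shortcuts(triples):
    """Replace each constant shortcut triple with its two replacement triples."""
    expanded = []
    counter = 0
    for s, p, o in triples:
        if p in SHORTCUTS:
            counter += 1
            # Fresh blank node for the implied constant-valued term map.
            # Assumption: labels of the form _:tmN are not used elsewhere.
            node = f"_:tm{counter}"
            expanded.append((s, SHORTCUTS[p], node))
            expanded.append((node, "rr:constant", o))
        else:
            expanded.append((s, p, o))
    return expanded
```

Triples that do not use a shortcut property pass through unchanged, so the function can be applied to a whole mapping graph.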

The following example shows a predicate-object map that uses a constant-valued term map both for its predicate and for its object.

[] rr:predicateMap [ rr:constant rdf:type ];
   rr:objectMap [ rr:constant ex:Employee ].

If added to a triples map, this predicate-object map would add the following triple to all resources ?x generated by the triples map:

?x rdf:type ex:Employee.

The following example uses constant shortcut properties and is equivalent to the example above:

[] rr:predicate rdf:type;
   rr:object ex:Employee.

7.2 From a Column (rr:column)

A column-valued term map is a term map that is represented by a resource that has exactly one rr:column property.

The value of the rr:column property MUST be a valid column name. The column value of the term map is the data value of that column in a given logical table row.

The referenced columns of a column-valued term map is the singleton set containing the value of the term map's rr:column property.

The following example defines an object map that generates literals from the DNAME column of some logical table.

[] rr:objectMap [ rr:column "DNAME" ].

Using the sample row from the DEPT table as a logical table row, the column value of the object map would be “APPSERVER”.

7.3 From a Template (rr:template)

A template-valued term map is a term map that is represented by a resource that has exactly one rr:template property. The value of the rr:template property MUST be a valid string template.

A string template is a format string that can be used to build strings from multiple components. It can reference column names by enclosing them in curly braces (“{” and “}”). The following syntax rules apply to valid string templates:

The template value of the term map for a given logical table row is determined as follows:

  1. Let result be the template string
  2. For each pair of unescaped curly braces in result:
    1. Let value be the data value of the column whose name is enclosed in the curly braces
    2. If value is NULL, then return NULL
    3. Let value be the natural RDF lexical form corresponding to value
    4. If the term type is rr:IRI, then replace the pair of curly braces with an IRI-safe version of value; otherwise, replace the pair of curly braces with value
  3. Return result

The IRI-safe version of a string is obtained by applying the following transformation to any character that is not in the iunreserved production in [RFC3987]:

  1. Convert the character to a sequence of one or more octets using UTF-8 [RFC3629]
  2. Percent-encode each octet [RFC3986]

The following table shows examples of strings and their IRI-safe versions:

| String | IRI-safe version |
|---|---|
| 42 | 42 |
| Hello World! | Hello%20World%21 |
| 2011-08-23T22:17:00Z | 2011-08-23T22%3A17%3A00Z |
| ~A_17.1-2 | ~A_17.1-2 |
| 葉篤正 | 葉篤正 |
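The IRI-safe transformation and the template value steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, a logical table row is modelled as a dict from column name to value with None standing for NULL, `str(value)` stands in for the natural RDF lexical form, and column names inside braces are assumed to contain no braces or backslashes.

```python
def _is_ucschar(cp):
    """True if the code point is in the ucschar production of RFC 3987."""
    if 0xA0 <= cp <= 0xD7FF or 0xF900 <= cp <= 0xFDCF or 0xFDF0 <= cp <= 0xFFEF:
        return True
    if 0x10000 <= cp <= 0xDFFFD:
        # Planes 1 to 13, excluding the last two code points of each plane.
        return (cp & 0xFFFF) <= 0xFFFD
    return 0xE1000 <= cp <= 0xEFFFD


def iri_safe(text):
    """Percent-encode every character that is not in iunreserved."""
    out = []
    for ch in text:
        ascii_unreserved = ch.isascii() and (ch.isalnum() or ch in "-._~")
        if ascii_unreserved or _is_ucschar(ord(ch)):
            out.append(ch)
        else:
            # UTF-8 encode the character, then percent-encode each octet.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


def template_value(template, row, term_type="rr:IRI"):
    """Expand a string template against one row; return None if a column is NULL."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\" and i + 1 < len(template):
            # Backslash escape: copy the escaped character literally.
            out.append(template[i + 1])
            i += 2
        elif ch == "{":
            end = template.index("}", i)
            value = row[template[i + 1:end]]
            if value is None:
                return None
            text = str(value)  # stand-in for the natural RDF lexical form
            out.append(iri_safe(text) if term_type == "rr:IRI" else text)
            i = end + 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For example, `template_value("http://data.example.com/site/{LOC}", {"LOC": "NEW YORK"})` yields `http://data.example.com/site/NEW%20YORK`. Only substituted data values are encoded; the fixed parts of the template are copied as written.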

R2RML always performs percent-encoding when IRIs are generated from string templates. If IRIs need to be generated without percent-encoding, then rr:column should be used instead of rr:template, with an R2RML view that performs the string concatenation.

In the case of string templates that generate IRIs, any single character that is legal in an IRI, but percent-encoded in the IRI-safe version of a data value, is a safe separator. This includes in particular the eleven sub-delim characters defined in [RFC3987]: !$&'()*+,;=

The referenced columns of a template-valued term map is the set of column names enclosed in unescaped curly braces in the template string.

The following example defines a subject map that generates IRIs from the DEPTNO column of a logical table.

[] rr:subjectMap [ rr:template "http://data.example.com/department/{DEPTNO}" ].

Using the sample row from the DEPT table as a logical table row, the template value of the subject map would be:

http://data.example.com/department/10

The following example shows how an IRI-safe template value is created:

[] rr:subjectMap [ rr:template "http://data.example.com/site/{LOC}" ].

Using the sample row from the DEPT table as a logical table row, the template value of the subject map would be:

http://data.example.com/site/NEW%20YORK

The space character is not in the iunreserved set, and therefore percent-encoding is applied to the character, yielding “%20”.

The following example shows the use of backslash escapes in string templates. The template will generate a fancy title such as

{{{ \o/ Hello World! \o/ }}}

from a string “Hello World!” in the TITLE column. By default, rr:template generates IRIs. Since the intention here is to create a literal instead, the term type has to be set.

[] rr:objectMap [
    rr:template "\\{\\{\\{ \\\\o/ {TITLE} \\\\o/ \\}\\}\\}";
    rr:termType rr:Literal;
].

Note that because backslashes need to be escaped by a second backslash in the Turtle syntax [TURTLE], a double backslash is needed to escape each curly brace, and to get one literal backslash in the output one needs to write four backslashes in the template.

7.4 IRIs, Literal, Blank Nodes (rr:termType)

The term type of a column-valued term map or template-valued term map determines the kind of generated RDF term (IRIs, blank nodes or literals).

If the term map has an optional rr:termType property, then its term type is the value of that property. The value MUST be an IRI and MUST be one of the following options:

If the term map does not have a rr:termType property, then its term type is:

Term maps with term type rr:IRI cause data errors if the value is not a valid IRI (see generated RDF term for details). Data values from the input database may require percent-encoding before they can be used in IRIs. Template-valued term maps are a convenient way of percent-encoding data values.

Constant-valued term maps are not considered as having a term type, and specifying rr:termType on these term maps has no effect. The type of the generated RDF term is determined directly by the value of rr:constant: If it is an IRI, then an IRI will be generated; if it is a literal, a literal will be generated.

7.5 Language Tags (rr:language)

A term map with a term type of rr:Literal MAY have a specified language tag. It is represented by the rr:language property on a term map. If present, its value MUST be a valid language tag.

A specified language tag causes generated literals to be language-tagged plain literals. In the following example, plain literals with language tag “en-us” (U.S. English) will be generated for the data values in the DNAME column.

[] rr:objectMap [ rr:column "DNAME"; rr:language "en-us" ].

7.6 Typed Literals (rr:datatype)

A datatypeable term map is a term map with a term type of rr:Literal that does not have a specified language tag.

Datatypeable term maps may generate typed literals. The datatype of these literals can be automatically determined based on the SQL datatype of the underlying logical table column (producing a natural RDF literal), or it can be explicitly overridden using rr:datatype (producing a datatype-override RDF literal).

A datatypeable term map MAY have a rr:datatype property. Its value MUST be an IRI. This IRI is the specified datatype of the term map.

A term map MUST NOT have more than one rr:datatype value.

A term map that is not a datatypeable term map MUST NOT have an rr:datatype property.

The implicit SQL datatype of a datatypeable term map is CHARACTER VARYING if the term map is a template-valued term map; otherwise, it is the SQL datatype of the respective column in the logical table row.

See generated RDF term for further details on generating literals from term maps.

One cannot explicitly state that a plain literal without language tag should be generated. They are the default for string columns. To generate one from a non-string column, a template-valued term map with a template such as "{MY_COLUMN}" and a term type of rr:Literal can be used.

The following example shows an object map that overrides the default datatype of the logical table with an explicitly specified xsd:positiveInteger type. A datatype-override RDF literal of that datatype will be generated from whatever is in the EMPNO column.

[] rr:objectMap [ rr:column "EMPNO"; rr:datatype xsd:positiveInteger ].

7.7 Inverse Expressions (rr:inverseExpression)

An inverse expression is a string template associated with a column-valued term map or template-valued term map. It is represented by the value of the rr:inverseExpression property. This property is OPTIONAL and there MUST NOT be more than one for a term map.

Inverse expressions are useful for optimizing term maps that reference derived columns in R2RML views. An inverse expression specifies an expression that allows “reversing” of a generated RDF term and the construction of a SQL query that efficiently retrieves the logical table row from which the term was generated. In particular, it allows the use of indexes on the underlying relational tables.

Every pair of unescaped curly braces in the inverse expression is a column reference in an inverse expression. The string between the braces MUST be a valid column name.

An inverse expression MUST satisfy the following condition:

For example, for the DEPTID column in the logical table used for mapping the DEPT table in this example mapping, an inverse expression could be defined as follows:

[] rr:column "DEPTID";
   rr:inverseExpression "{DEPTNO} = SUBSTRING({DEPTID}, CHARACTER_LENGTH('Department')+1)";

This facilitates the use of an existing index on the DEPTNO column of the DEPT table.

A quoted and escaped data value is any SQL string that matches the <literal> or <null specification> productions of [SQL2]. This string can be used in a SQL query to specify a SQL data value. Examples:

8 Foreign Key Relationships among Logical Tables (rr:parentTriplesMap, rr:joinCondition, rr:child and rr:parent)

Figure 5: The properties of referencing object maps

A referencing object map allows using the subjects of another triples map as the objects generated by a predicate-object map. Since both triples maps may be based on different logical tables, this may require a join between the logical tables. This is not restricted to 1:1 joins.

A referencing object map is represented by a resource that:

A join condition is represented by a resource that has exactly one value for each of the following two properties:

The child query of a referencing object map is the effective SQL query of the logical table of the triples map containing the referencing object map.

The parent query of a referencing object map is the effective SQL query of the logical table of its parent triples map.

If the child query and parent query of a referencing object map are not identical, then the referencing object map MUST have at least one join condition.

The joint SQL query of a referencing object map is:

The joint SQL query is used when generating RDF triples from referencing object maps.

The following example shows a referencing object map as part of a predicate-object map:

[] rr:predicateObjectMap [
    rr:predicate ex:department;
    rr:objectMap [
        rr:parentTriplesMap <#TriplesMap2>;
        rr:joinCondition [
            rr:child "DEPTNO";
            rr:parent "DEPTNO";
        ];
    ];
].

If the logical table of the surrounding triples map is EMP, and the logical table of <#TriplesMap2> is DEPT, this would result in a join between these two tables with the condition

EMP.DEPTNO = DEPT.DEPTNO

and the objects of the triples would be generated using the subject map of <#TriplesMap2>.
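One plausible way to assemble such a join as a single SQL string is sketched below in Python. The function name, the subquery aliases (child, parent, tmp) and the representation of join conditions as (child column, parent column) pairs are assumptions made for this illustration; column names are inserted as given, without any quoting.

```python
def joint_sql_query(child_query, parent_query, join_conditions):
    """Build a query that pairs child rows with their matching parent rows.

    join_conditions is a list of (child_column, parent_column) pairs.
    With no join conditions, only the child query is evaluated.
    """
    if not join_conditions:
        return f"SELECT * FROM ({child_query}) AS tmp"
    # One equality test per join condition, combined with AND.
    where = " AND ".join(
        f"child.{child_col}=parent.{parent_col}"
        for child_col, parent_col in join_conditions
    )
    return (
        f"SELECT * FROM ({child_query}) AS child, "
        f"({parent_query}) AS parent WHERE {where}"
    )
```

Placing the child columns first in the result makes it possible to split each result row into a child part and a parent part by column count.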

Given the two example tables, and subject maps as defined in the example mapping, this would result in a triple:

<http://data.example.com/employee/7369> ex:department <http://data.example.com/department/10>.

The following example shows a referencing object map that does not have a join condition. It creates two kinds of resources from the DEPT table: departments and sites.

<#DeptTriplesMap>
    rr:logicalTable [ rr:tableName "DEPT" ];
    rr:subjectMap [
        rr:template "department/{DEPTNO}";
        rr:class ex:Department;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:location;
        rr:objectMap [ rr:parentTriplesMap <#SiteTriplesMap> ];
    ].

<#SiteTriplesMap>
    rr:logicalTable [ rr:tableName "DEPT" ];
    rr:subjectMap [
        rr:template "site/{LOC}";
        rr:class ex:Site;
    ];
    rr:predicateObjectMap [
        rr:predicate ex:siteName;
        rr:objectMap [ rr:column "LOC" ];
    ].

An ex:Site resource is created for each distinct value in the LOC column, using the <#SiteTriplesMap>. Departments and sites are linked by ex:location triples, and the objects of these triples are specified using a referencing object map that references the sites triples map. No join condition is needed as both triples maps use the same logical table (the base table DEPT). Given the example table, this mapping would result in four triples (assuming an appropriate base IRI):

<http://data.example.com/department/10> rdf:type ex:Department.
<http://data.example.com/department/10> ex:location <http://data.example.com/site/NEW%20YORK>.
<http://data.example.com/site/NEW%20YORK> rdf:type ex:Site.
<http://data.example.com/site/NEW%20YORK> ex:siteName "NEW YORK".

9 Assigning Triples to Named Graphs

Figure 6: The properties of graph maps

Each triple generated from an R2RML mapping is placed into one or more graphs of the output dataset. Possible target graphs are the unnamed default graph, and the IRI-named named graphs.

Any subject map or predicate-object map MAY have one or more associated graph maps. They are specified in one of two ways:

  1. using the rr:graphMap property, whose value MUST be a graph map,
  2. using the constant shortcut property rr:graph.

Graph maps are themselves term maps. When RDF triples are generated, the set of target graphs is determined by taking into account any graph maps associated with the subject map or predicate-object map.

If a graph map generates the special IRI rr:defaultGraph, then the target graph is the default graph of the output dataset.

In the following subject map example, all generated RDF triples will be stored in the named graph ex:DepartmentGraph.

[] rr:subjectMap [
    rr:template "http://data.example.com/department/{DEPTNO}";
    rr:graphMap [ rr:constant ex:DepartmentGraph ];
].

This is equivalent to the following example, which uses a constant shortcut property:

[] rr:subjectMap [
    rr:template "http://data.example.com/department/{DEPTNO}";
    rr:graph ex:DepartmentGraph;
].

In the following example, RDF triples are placed into named graphs according to the job title of employees:

[] rr:subjectMap [
    rr:template "http://data.example.com/employee/{EMPNO}";
    rr:graphMap [ rr:template "http://data.example.com/jobgraph/{JOB}" ];
].

The triples generated from the EMP table would be placed in the named graph with the following IRI:

<http://data.example.com/jobgraph/CLERK>

9.1 Scope of Blank Nodes

Blank nodes in the output dataset are scoped to a single RDF graph. If the same blank node identifier occurs in multiple RDF triples that are in the same graph, then the triples will share the same blank node. If, however, the same blank node identifier occurs in multiple graphs, then a distinct blank node will be created for each graph. An R2RML-generated blank node can never be shared by two triples in two different graphs.

This implies that triples generated from a single logical table row will have different subjects if the subjects are blank nodes and the triples are placed into different graphs.
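The scoping rule can be illustrated with a small Python sketch. The function name, the dict used as a registry and the `_:bN` labels are hypothetical; the only point shown is that a blank node is looked up by the pair (graph, identifier) and not by the identifier alone.

```python
def scoped_blank_node(nodes, graph, identifier):
    """Return the blank node for an identifier within one graph.

    nodes maps (graph, identifier) pairs to blank node labels and is
    updated in place the first time a pair is seen.
    """
    key = (graph, identifier)
    if key not in nodes:
        # Assumption: a running counter is enough to keep labels distinct.
        nodes[key] = f"_:b{len(nodes)}"
    return nodes[key]
```

With this lookup, the same identifier in the same graph always resolves to one node, while the same identifier in a second graph resolves to a different node.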

10 Datatype Conversions

This section defines mappings from SQL data values to RDF literals.

10.1 Introduction (Informative)

This section defines the following mappings from SQL data values:

  1. The natural RDF literal is a mapping to literals. It is used in R2RML and in the Direct Mapping of Relational Data to RDF [DM] as the default mapping when literals are created. It maps SQL datatypes to corresponding XML Schema datatypes [XMLSCHEMA2] and loosely follows ISO/IEC 9075-14:2008 [SQL14].
  2. The natural RDF lexical form is similar, but produces only the lexical form of the typed literal and recommends that implementations perform XSD canonicalization. It is used in R2RML when non-string columns are used in a string context, for example when a TIMESTAMP is used in an IRI template.
  3. The canonical RDF lexical form is again similar, but requires XSD canonicalization. It is used in the Direct Mapping when IRIs are generated.
  4. The datatype-override RDF literal is a mapping that constructs typed literals by using the natural RDF lexical form and applying a specified datatype IRI. The mapping author is responsible for ensuring that the generated lexical form is valid for the datatype. It is used in R2RML when the target datatype of a literal-generating term map is overridden using rr:datatype.

The mappings cover all predefined Core SQL 2008 datatypes except INTERVAL. The natural mappings may be extended with custom handling for other types, such as vendor-specific SQL datatypes. In the absence of such extensions, the natural mappings fall back on a simple cast to string for all unsupported SQL datatypes.

The mappings are referenced in the R2RML term generation rules.

An informative summary of XSD lexical forms is provided to aid implementers.

10.2 Natural Mapping of SQL Values

The natural RDF literal corresponding to a SQL data value is the result of applying the following steps:

  1. Let dt be the SQL datatype of the SQL data value.
  2. If dt is a character string type (in Core SQL 2008: CHARACTER, CHARACTER VARYING, CHARACTER LARGE OBJECT, NATIONAL CHARACTER, NATIONAL CHARACTER VARYING, NATIONAL CHARACTER LARGE OBJECT), then the result is a plain literal without language tag whose lexical form is the SQL data value.
  3. Otherwise, if dt is listed in the table below: The result is a typed literal whose datatype IRI is the IRI indicated in the RDF datatype column in the same row as dt. The lexical form may be any lexical form that represents the same value as the SQL data value, according to the definition of the RDF datatype. If there are multiple lexical forms available that represent the same value (e.g., 1, +1, 1.0 and 1.0E0), then the choice is implementation-dependent. However, the choice MUST be made so that given a target RDF datatype and value, the same lexical form is chosen consistently (e.g., INTEGER 5 and BIGINT 5 must be mapped to the same lexical form, as both are mapped to the RDF datatype xsd:integer and are equal values; mapping one to 5 and the other to +5 would be an error). The canonical lexical representation [XMLSCHEMA2] MAY be chosen. (See also: Summary of XSD Lexical Forms)
  4. Otherwise, the result is a plain literal without language tag whose lexical form is the SQL data value cast to string.

| SQL datatype | RDF datatype | Lexical transformation (informative) |
|---|---|---|
| BINARY, BINARY VARYING, BINARY LARGE OBJECT | xsd:hexBinary | xsd:hexBinary lexical mapping |
| NUMERIC, DECIMAL | xsd:decimal | none required |
| SMALLINT, INTEGER, BIGINT | xsd:integer | none required |
| FLOAT, REAL, DOUBLE PRECISION | xsd:double | none required |
| BOOLEAN | xsd:boolean | ensure lowercase (true, false) |
| DATE | xsd:date | none required |
| TIME | xsd:time | none required |
| TIMESTAMP | xsd:dateTime | replace space character with “T” |
| INTERVAL | undefined | undefined |
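The four steps and the table can be sketched as a lookup in Python. This is an illustrative sketch, not a conforming implementation: the function name and the (lexical form, datatype) return shape are hypothetical, SQL values are assumed to arrive as Python str, int, bool or bytes objects, `str()` stands in for cast to string, and INTERVAL is simply sent to the fallback branch because its translation is undefined.

```python
CHARACTER_TYPES = {
    "CHARACTER", "CHARACTER VARYING", "CHARACTER LARGE OBJECT",
    "NATIONAL CHARACTER", "NATIONAL CHARACTER VARYING",
    "NATIONAL CHARACTER LARGE OBJECT",
}

NATURAL_DATATYPES = {
    "BINARY": "xsd:hexBinary", "BINARY VARYING": "xsd:hexBinary",
    "BINARY LARGE OBJECT": "xsd:hexBinary",
    "NUMERIC": "xsd:decimal", "DECIMAL": "xsd:decimal",
    "SMALLINT": "xsd:integer", "INTEGER": "xsd:integer", "BIGINT": "xsd:integer",
    "FLOAT": "xsd:double", "REAL": "xsd:double", "DOUBLE PRECISION": "xsd:double",
    "BOOLEAN": "xsd:boolean",
    "DATE": "xsd:date", "TIME": "xsd:time", "TIMESTAMP": "xsd:dateTime",
}


def natural_rdf_literal(sql_type, value):
    """Return (lexical_form, datatype); datatype is None for plain literals."""
    sql_type = sql_type.upper()
    if sql_type in CHARACTER_TYPES:
        return (value, None)                  # step 2: plain literal
    datatype = NATURAL_DATATYPES.get(sql_type)
    if datatype is None:
        return (str(value), None)             # step 4: cast-to-string fallback
    if datatype == "xsd:hexBinary":
        lexical = bytes(value).hex().upper()  # hexadecimal digits, upper case
    elif datatype == "xsd:boolean":
        lexical = str(value).lower()          # ensure lowercase true/false
    elif datatype == "xsd:dateTime":
        lexical = str(value).replace(" ", "T", 1)  # SQL space -> "T"
    else:
        lexical = str(value)                  # no transformation required
    return (lexical, datatype)
```

An extension for vendor-specific types would, in this sketch, amount to adding entries to `NATURAL_DATATYPES` together with any lexical transformation they need.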

R2RML extensions that handle vendor-specific or user-defined datatypes beyond those of Core SQL 2008 are expected to behave as if the table above contained additional rows that associate the SQL datatypes with appropriate RDF-compatible datatypes (e.g., the XML Schema built-in types [XMLSCHEMA2]), and appropriate lexical transformations where required. Note however that future versions of R2RML may also normatively add additional rows to this table.

The translation of INTERVAL is left undefined due to the complexity of the translation. [SQL14] describes a translation of INTERVAL to xdt:yearMonthDuration and xdt:dayTimeDuration.

In [SQL2], the precision of many SQL datatypes is not fixed, but left implementation-defined. Therefore, the mapping to XML Schema datatypes must rely on arbitrary-precision types such as xsd:decimal, xsd:integer and xsd:dateTime. Implementers of the mapping may wish to set upper limits for the supported precision of these XSD types. The XML Schema specification allows such partial implementations of infinite datatypes [XMLSCHEMA2], and defines specific minimum requirements.

The natural RDF datatype corresponding to a SQL datatype is the value of the RDF datatype column in the row corresponding to the SQL datatype in the table above.

The natural RDF lexical form corresponding to a SQL data value is the lexical form of its corresponding natural RDF literal, with the additional constraint that the canonical lexical representation [XMLSCHEMA2] SHOULD be chosen.

The canonical RDF lexical form corresponding to a SQL data value is the lexical form of its corresponding natural RDF literal, with the additional constraint that the canonical lexical representation [XMLSCHEMA2] MUST be chosen.

Cast to string is an implementation-dependent function that maps SQL data values to equivalent Unicode strings. It is undefined for the following kinds of SQL datatypes: collection types, row types, user-defined types without a user-defined string CAST, reference types whose referenced type does not have a user-defined string CAST, binary types.

Cast to string is a fallback that handles vendor-specific and user-defined datatypes not supported by the R2RML processor. It can be implemented in a number of ways, including explicit SQL casts (“CAST(value AS VARCHAR(n))”, where n is an arbitrarily large integer), implicit SQL casts (concatenation with the empty string), or by employing a database access API that presents return values as strings.

10.3 Datatype-override Mapping of SQL Values

The datatype-override RDF literal corresponding to a SQL data value v and a datatype IRI dt, is a typed literal whose lexical form is the natural RDF lexical form corresponding to v, and whose datatype IRI is dt. If the typed literal is ill-typed, then a data error is raised.

A typed literal is ill-typed in R2RML if its datatype IRI denotes a validatable RDF datatype and its lexical form is not in the lexical space of the RDF datatype identified by its datatype IRI. (See also: Summary of XSD Lexical Forms)

The set of validatable RDF datatypes includes all datatypes in the RDF datatype column of the table of natural datatype mappings, as defined in [XMLSCHEMA2]. This set MAY include implementation-defined additional RDF datatypes.

For example, "X"^^xsd:boolean is ill-typed because xsd:boolean is a validatable RDF datatype in R2RML, and “X” is not in the lexical space of xsd:boolean [XMLSCHEMA2].
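A check for ill-typed literals can be sketched in Python with one regular expression per validatable datatype. The sketch covers only three datatypes, and the function name and the `LEXICAL_SPACE` table are hypothetical; a full implementation would need patterns for every validatable RDF datatype.

```python
import re

# Lexical spaces of a few XSD datatypes, written as regular expressions.
LEXICAL_SPACE = {
    "xsd:boolean": re.compile(r"true|false|1|0"),
    "xsd:integer": re.compile(r"[+-]?[0-9]+"),
    "xsd:decimal": re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)"),
}


def is_ill_typed(lexical_form, datatype):
    """True if the datatype is validatable here and the lexical form is invalid."""
    pattern = LEXICAL_SPACE.get(datatype)
    if pattern is None:
        # Not validatable in this sketch, so never reported as ill-typed.
        return False
    return pattern.fullmatch(lexical_form) is None
```

A processor would raise a data error wherever `is_ill_typed` returns True for a datatype-override RDF literal.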

10.4 Non-String Columns in String Contexts

The same non-character-string SQL data value can typically be represented in multiple different string forms. For example, the DOUBLE value 1 can be represented as 1, +1, 1.0 and 1.0E0. This can cause interoperability issues when such values are used in string contexts, for example when using them to generate IRIs. Two IRIs that are character-for-character equivalent, except one contains 1 where the other contains 1.0, will not “link up” in an RDF graph – they are two different nodes.

To reduce portability issues arising from such conversions, this specification recommends that implementations convert non-string data values to a canonical form (see natural RDF lexical form). However, this is not a strict requirement. Therefore, when portability between R2RML implementations is a concern, mapping authors SHOULD NOT use non-character-string columns in contexts where strings are produced:

In these contexts, if portability is to be maximized, then mapping authors SHOULD use an R2RML view instead and explicitly convert the non-string column to a string column using an SQL expression.

Note that this is not a problem when natural RDF literals are generated from such columns, because the resulting literal has a corresponding non-string XSD datatype, and equivalences between different lexical forms within these datatypes are well-defined.

10.5 Summary of XSD Lexical Forms (Informative)

The natural mappings make reference to various XSD datatypes and require that SQL data values be converted to strings that are appropriate as lexical forms for these datatypes. This subsection gives examples of these lexical forms in order to aid implementers of the mappings. This subsection is non-normative; the normative definitions of the lexical spaces as well as the canonical lexical mappings are found in W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes [XMLSCHEMA2].

A general approach that may be used for implementing the natural mappings is as follows:

  1. Identify the SQL datatype of the input SQL data value.
  2. Look up its corresponding natural RDF datatype.
  3. Apply cast to string to the SQL data value.
  4. Ensure that the resulting string is in the lexical space of the target RDF datatype; that is, it must be in a form such as those listed in either column of the table below. This may require some transformations of the string, in particular for xsd:hexBinary, xsd:dateTime and xsd:boolean.
  5. If the goal is to obtain a canonical lexical representation, then further string transformations may be required to obtain a form such as those listed in the Canonical lexical forms column of the table below.

| RDF datatype | Non-canonical lexical forms | Canonical lexical forms | Comments |
|---|---|---|---|
| xsd:hexBinary | 5232524d4c | 5232524D4C | Convert from SQL by applying xsd:hexBinary lexical mapping. |
| xsd:decimal | .224 | 0.224 | |
| | +001 | 1 | |
| | 42.0 | 42 | |
| | -5.9000 | -5.9 | |
| xsd:integer | -05 | -5 | |
| | +333 | 333 | |
| | 00 | 0 | |
| xsd:double | -5.90 | -5.9E0 | Also supports INF, -INF, NaN and -0.0E0, but these do not appear in standard SQL. |
| | +0.00014770215000 | 1.4770215E-4 | |
| | +01E+3 | 1.0E3 | |
| | 100.0 | 1.0E2 | |
| | 0 | 0.0E0 | |
| xsd:boolean | 1 | true | Must be lowercase. |
| | 0 | false | |
| xsd:date | | 2011-08-23 | Dates in SQL don't have timezone offsets. They are optional in XSD. |
| xsd:time | 22:17:34.885+00:00 | 22:17:34.885Z | May or may not have timezone offset. |
| | 22:17:34.000 | 22:17:34 | |
| | 22:17:34.1+01:00 | 22:17:34.1+01:00 | |
| xsd:dateTime | 2011-08-23T22:17:00.000+00:00 | 2011-08-23T22:17:00Z | May or may not have timezone offset. Convert from SQL by replacing space with “T”. |
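For three of the datatypes in the table, the step from an arbitrary lexical form to the canonical one can be sketched in Python as follows. The function names are hypothetical, the inputs are assumed to be valid lexical forms already, and xsd:double, xsd:time and xsd:dateTime are left out because their canonical forms need more careful handling than fits here.

```python
from decimal import Decimal


def canonical_integer(lexical):
    """Drop a leading plus sign and leading zeros, e.g. +333 -> 333."""
    return str(int(lexical))


def canonical_decimal(lexical):
    """Drop redundant signs and zeros, e.g. -5.9000 -> -5.9 and 42.0 -> 42."""
    text = format(Decimal(lexical), "f")   # plain notation, no exponent
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    # Negative zero has no separate canonical form in this sketch.
    return "0" if text == "-0" else text


def canonical_boolean(lexical):
    """Map the numeric forms 1 and 0 to true and false."""
    return {"1": "true", "0": "false", "true": "true", "false": "false"}[lexical]
```

Using Python's `int` and `Decimal` for parsing keeps arbitrary precision, which matches the arbitrary-precision nature of xsd:integer and xsd:decimal.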

11 The Output Dataset

The output dataset of an R2RML mapping is an RDF dataset that contains the generated RDF triples for each of the triples maps of the R2RML mapping. The output dataset MUST NOT contain any other RDF triples or named graphs besides these. However, R2RML processors MAY provide access to datasets that contain additional triples or graphs beyond those in the output dataset, such as inferred triples or provenance information.

If a table or column is not explicitly referenced in a triples map, then no RDF triples will be generated for that table or column.

Conforming R2RML processors MAY rename blank nodes when providing access to the output dataset. This means that client applications may see actual blank node identifiers that differ from those produced by the R2RML mapping. Client applications SHOULD NOT rely on the specific text of the blank node identifier for any purpose.

RDF syntaxes and RDF APIs generally represent blank nodes with blank node identifiers. But the characters allowed in blank node identifiers differ between syntaxes, and not all characters occurring in the values produced by a term map may be allowed, so a bijective mapping function from values to valid blank node identifiers may be required. The details of this mapping function are implementation-dependent, and R2RML processors may have to use different functions for different output syntaxes or access interfaces. Strings matching the regular expression [a-zA-Z_][a-zA-Z_0-9-]* are valid blank node identifiers in all W3C-recommended RDF syntaxes (as of this document's publication).
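One possible mapping function of this kind is sketched below in Python. It is an assumption of this sketch, not something R2RML prescribes: the value is UTF-8 encoded, written as hexadecimal digits and prefixed with the letter b, which keeps distinct values distinct and always matches the regular expression quoted above.

```python
import re

# The portable identifier pattern quoted in the text.
SAFE_BNODE_ID = re.compile(r"[a-zA-Z_][a-zA-Z_0-9-]*")


def blank_node_id(value):
    """Map an arbitrary string to a portable blank node identifier."""
    # Hex encoding is reversible, so two different values never collide.
    return "b" + value.encode("utf-8").hex()


def original_value(identifier):
    """Invert blank_node_id, recovering the value the identifier came from."""
    return bytes.fromhex(identifier[1:]).decode("utf-8")
```

The identifiers are not human-readable, which is acceptable because client applications are not meant to rely on their text.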

RDF datasets may contain empty named graphs. R2RML cannot generate such output datasets.

11.1 The Generated RDF Triples of a Triples Map

This subsection describes the process ofgenerating RDF triples from atriples map. This process addsRDF triples to theoutput dataset. Each generated triple is placed into one or more graphs of the output dataset.

The generated RDF triples are determined by the following algorithm. R2RML processorsMAY use other means than implementing this algorithm to compute the generated RDF triples, as long as the result is the same.

  1. Letsm be thesubject map of the triples map
  2. Letrows be the result of evaluating theeffective SQL query of thetriples map'slogical table using theSQL connection
  3. Letclasses be theclass IRIs ofsm
  4. Letsgm be the set ofgraph maps ofsm
  5. For eachlogical table rowrow inrows, apply the following steps:
    1. Letsubject be thegenerated RDF term that results from applyingsm torow
    2. Letsubject_graphs be the set of thegenerated RDF terms that result from applying each term map insgm torow
    3. For eachclass inclasses,add triples to the output dataset as follows:

      Subject:subject
      Predicate:rdf:type
      Object:class
      Target graphs: Ifsgm is empty:rr:defaultgraph; otherwise:subject_graphs

    4. For eachpredicate-object map of thetriples map, apply the following steps:
      1. Letpredicates be the set ofgenerated RDF terms that result from applying each of the predicate-object map'spredicate maps torow
      2. Letobjects be the set ofgenerated RDF terms that result from applying each of the predicate-object map'sobject maps (but notreferencing object maps) torow
      3. Letpogm be the set ofgraph maps of the predicate-object map
      4. Letpredicate-object_graphs be the set ofgenerated RDF terms that result from applying eachgraph map inpogm torow
      5. For each possible combination <predicate,object> wherepredicate is a member ofpredicates andobject is a member ofobjects,add triples to the output dataset as follows:

        Subject:subject
        Predicate:predicate
        Object:object
        Target graphs: Ifsgm andpogm are empty:rr:defaultGraph; otherwise: union ofsubject_graphs andpredicate-object_graphs

  6. For each referencing object map of a predicate-object map of the triples map, apply the following steps:
    1. Let psm be the subject map of the parent triples map of the referencing object map
    2. Let pogm be the set of graph maps of the predicate-object map
    3. Let n be the number of columns in the logical table of the triples map
    4. Let rows be the result of evaluating the joint SQL query of the referencing object map
    5. For each row in rows, apply the following steps:
      1. Let child_row be the logical table row derived by taking the first n columns of row
      2. Let parent_row be the logical table row derived by taking all but the first n columns of row
      3. Let subject be the generated RDF term that results from applying sm to child_row
      4. Let predicates be the set of generated RDF terms that result from applying each of the predicate-object map's predicate maps to child_row
      5. Let object be the generated RDF term that results from applying psm to parent_row
      6. Let subject_graphs be the set of generated RDF terms that result from applying each graph map of sgm to child_row
      7. Let predicate-object_graphs be the set of generated RDF terms that result from applying each graph map in pogm to child_row
      8. For each predicate in predicates, add triples to the output dataset as follows:

        Subject: subject
        Predicate: predicate
        Object: object
        Target graphs: If neither sgm nor pogm has any graph maps: rr:defaultGraph; otherwise: union of subject_graphs and predicate-object_graphs
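
Three small rules recur in the algorithm above: selecting target graphs, combining predicates with objects, and splitting a joint-query row into its child and parent parts. The following Python sketch illustrates them under simplifying assumptions: RDF terms are plain strings, rows are tuples, and the function names are hypothetical and not part of R2RML.

```python
from itertools import product

DEFAULT_GRAPH = "rr:defaultGraph"  # stands in for the rr:defaultGraph IRI


def target_graphs(sgm, pogm, subject_graphs, predicate_object_graphs):
    """Target graphs of a predicate-object triple.

    If neither the subject map nor the predicate-object map has graph maps,
    the triple goes to the default graph; otherwise to the union of the
    graphs generated from both sets of graph maps.
    """
    if not sgm and not pogm:
        return {DEFAULT_GRAPH}
    return set(subject_graphs) | set(predicate_object_graphs)


def predicate_object_combinations(predicates, objects):
    """Every <predicate, object> pair, one triple per pair."""
    return list(product(predicates, objects))


def split_joint_row(row, n):
    """Split a joint SQL query row into (child_row, parent_row).

    n is the number of columns in the child triples map's logical table.
    """
    return row[:n], row[n:]
```

For instance, `split_joint_row((7369, "SMITH", 10, "APPSERVER"), 2)` would treat the first two values as the child row and the remaining two as the parent row.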

“Add triples to the output dataset” is a process that takes the following inputs:

Execute the following steps:

  1. If Subject, Predicate or Object is empty, then abort these steps.
  2. Otherwise, generate an RDF triple <Subject, Predicate, Object>
  3. If the set of target graphs includes rr:defaultGraph, add the triple to the default graph of the output dataset.
  4. For each IRI in the set of target graphs that is not equal to rr:defaultGraph, add the triple to a named graph of that name in the output dataset. If the output dataset does not contain a named graph with that IRI, create it first.

RDF graphs cannot contain duplicate RDF triples. Placing multiple equal triples into the same graph has the same effect as placing the triple into the graph only once. Also note the scope of blank nodes.
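
The “add triples to the output dataset” process can be sketched in Python as follows. This is an illustration only. It assumes that the output dataset is a dictionary with a "default" set and a "named" dictionary of sets, that RDF terms are plain strings, and that an empty term is represented by `None`; none of these representations are prescribed by R2RML.

```python
DEFAULT_GRAPH = "rr:defaultGraph"  # stands in for the rr:defaultGraph IRI


def add_triples(dataset, subject, predicate, obj, target_graphs):
    """Add one triple to every target graph of the output dataset."""
    # Step 1: abort if any of the three terms is empty.
    if subject is None or predicate is None or obj is None:
        return
    # Step 2: generate the triple.
    triple = (subject, predicate, obj)
    # Step 3: the default graph, if requested.
    if DEFAULT_GRAPH in target_graphs:
        dataset["default"].add(triple)
    # Step 4: named graphs, created on first use.
    for graph in target_graphs:
        if graph != DEFAULT_GRAPH:
            dataset["named"].setdefault(graph, set()).add(triple)


dataset = {"default": set(), "named": {}}
add_triples(dataset, "ex:s", "ex:p", "ex:o", {DEFAULT_GRAPH, "ex:g"})
add_triples(dataset, "ex:s", "ex:p", "ex:o", {"ex:g"})  # duplicate, no effect
add_triples(dataset, None, "ex:p", "ex:o", {"ex:g"})    # empty subject, skipped
```

Because each graph is modelled as a set, adding an equal triple a second time leaves the graph unchanged.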

11.2 The Generated RDF Term of a Term Map

A term map is a function that generates an RDF term from a logical table row. The result of that function can be:

The generated RDF term of a term map for a given logical table row is determined as follows:

The term generation rules, applied to a value, are as follows:

  1. If value is NULL, then no RDF term is generated.
  2. Otherwise, if the term map's term type is rr:IRI:
    1. Let value be the natural RDF lexical form corresponding to value.
    2. If value is a valid absolute IRI [RFC3987], then return an IRI generated from value.
    3. Otherwise, prepend value with the base IRI. If the result is a valid absolute IRI [RFC3987], then return an IRI generated from the result.
    4. Otherwise, raise a data error.
  3. Otherwise, if the term type is rr:BlankNode:
    1. Return a blank node that is unique to the natural RDF lexical form corresponding to value. (Note: On Blank Node Identifiers, Scope of Blank Nodes)
  4. Otherwise, if the term type is rr:Literal:
    1. If the term map has a specified language tag, then return a plain literal with that language tag and with the natural RDF lexical form corresponding to value.
    2. Otherwise, if the term map has a non-empty specified datatype that is different from the natural RDF datatype corresponding to the term map's implicit SQL datatype, then return the datatype-override RDF literal corresponding to value and the specified datatype.
    3. Otherwise, return the natural RDF literal corresponding to value.
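
The control flow of the term generation rules can be sketched in Python as follows. Several parts are assumptions made for illustration: `str(value)` stands in for the natural RDF lexical form, a regular expression is only a rough approximation of "valid absolute IRI [RFC3987]", terms are modelled as tuples, and the names `generate_term`, `DataError`, and the example base IRI are hypothetical.

```python
import re

# Rough stand-in for an RFC 3987 absolute-IRI check: a scheme, a colon, and
# no whitespace or characters that can never appear in an IRI.
# This is NOT a complete RFC 3987 validator.
_ABSOLUTE_IRI = re.compile(r'^[A-Za-z][A-Za-z0-9+.\-]*:[^\s<>"{}|\\^`]*$')


class DataError(Exception):
    """Raised when a value cannot be turned into a valid IRI."""


def generate_term(value, term_type, base_iri="http://example.com/base/",
                  language=None, datatype=None, natural_datatype=None):
    # Rule 1: NULL generates no term.
    if value is None:
        return None
    lexical = str(value)  # assumption: approximates the natural lexical form
    # Rule 2: IRIs, absolute or resolved by prepending the base IRI.
    if term_type == "rr:IRI":
        if _ABSOLUTE_IRI.match(lexical):
            return ("iri", lexical)
        candidate = base_iri + lexical
        if _ABSOLUTE_IRI.match(candidate):
            return ("iri", candidate)
        raise DataError(f"cannot generate an IRI from {lexical!r}")
    # Rule 3: one blank node per distinct lexical form.
    if term_type == "rr:BlankNode":
        return ("bnode", lexical)
    # Rule 4: literals as (kind, lexical form, language tag, datatype).
    if term_type == "rr:Literal":
        if language is not None:
            return ("literal", lexical, language, None)
        if datatype and datatype != natural_datatype:
            return ("literal", lexical, None, datatype)
        return ("literal", lexical, None, natural_datatype)
    raise ValueError(f"unknown term type: {term_type}")
```

A real processor would also need the SQL-to-RDF datatype conversions defined elsewhere in this specification, which the sketch leaves out.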

A. RDF Terminology (Informative)

This appendix lists some terms normatively defined in other specifications.

The following terms are defined in RDF Concepts and Abstract Syntax [RDF] and used in R2RML:

The following terms are defined in SPARQL Query Language for RDF [SPARQL] and used in R2RML:

B. Index of R2RML Vocabulary Terms (Informative)

This appendix lists all the classes, properties and other terms defined by this specification within the R2RML vocabulary.

An RDFS representation of the vocabulary is available from the namespace IRI.

B.1 Classes

The following table lists all R2RML classes.

The third column contains minimum conditions that a resource has to fulfil in order to be considered a member of the class. Where multiple conditions are listed, all must be fulfilled.

Class | Represents | Minimum conditions
rr:BaseTableOrView | SQL base table or view | Having an rr:tableName property
rr:GraphMap | graph map | Being an rr:TermMap; being value of an rr:graphMap property
rr:Join | join condition | Having an rr:parent property; having an rr:child property
rr:LogicalTable | logical table | Being one of its subclasses, rr:BaseTableOrView or rr:R2RMLView
rr:ObjectMap | object map | Being an rr:TermMap; being value of an rr:objectMap property
rr:PredicateMap | predicate map | Being an rr:TermMap; being value of an rr:predicateMap property
rr:PredicateObjectMap | predicate-object map | Having at least one of rr:predicate and rr:predicateMap; having at least one of rr:object and rr:objectMap
rr:R2RMLView | R2RML view | Having an rr:sqlQuery property
rr:RefObjectMap | referencing object map | Having an rr:parentTriplesMap property
rr:SubjectMap | subject map | Being an rr:TermMap; being value of an rr:subjectMap property
rr:TermMap | term map | Having exactly one of rr:constant, rr:column, rr:template
rr:TriplesMap | triples map | Having an rr:logicalTable property; having exactly one of rr:subject and rr:subjectMap
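
As an illustration of how the minimum conditions combine, the following Python sketch derives candidate classes for a node from the set of properties it has. It is a simplification: it covers only the conditions that depend on a node's own properties, ignores the "being value of" conditions, and the function names are hypothetical.

```python
def exactly_one(props, names):
    """True if exactly one of the given property names is present."""
    return sum(1 for name in names if name in props) == 1


def candidate_classes(props):
    """Classes whose property-based minimum conditions the node fulfils."""
    props = set(props)
    classes = set()
    if "rr:tableName" in props:
        classes.add("rr:BaseTableOrView")
    if "rr:sqlQuery" in props:
        classes.add("rr:R2RMLView")
    # rr:LogicalTable: being a member of one of its two subclasses.
    if classes & {"rr:BaseTableOrView", "rr:R2RMLView"}:
        classes.add("rr:LogicalTable")
    if {"rr:parent", "rr:child"} <= props:
        classes.add("rr:Join")
    if "rr:parentTriplesMap" in props:
        classes.add("rr:RefObjectMap")
    if exactly_one(props, ("rr:constant", "rr:column", "rr:template")):
        classes.add("rr:TermMap")
    if (props & {"rr:predicate", "rr:predicateMap"}
            and props & {"rr:object", "rr:objectMap"}):
        classes.add("rr:PredicateObjectMap")
    if "rr:logicalTable" in props and exactly_one(
            props, ("rr:subject", "rr:subjectMap")):
        classes.add("rr:TriplesMap")
    return classes
```

Under these assumptions, a node that has both rr:column and rr:template fails the "exactly one" condition and is not reported as an rr:TermMap.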

As noted earlier, a single node in an R2RML mapping graph may represent multiple mapping components and thus be typed as several of these classes. However, the following classes are disjoint:

B.2 Properties

The following table lists all properties in the R2RML vocabulary.

The cardinality column indicates how often this property occurs within its context. Note that additional constraints not stated in this table might apply. The actual cardinality of some properties may depend on the presence or absence of other properties, and their values. Properties where this applies are indicated by an exclamation mark.

Property | Represents | Context | Cardinality
rr:child | child column | join condition | 1
rr:class | class IRI | subject map | 0…∞
rr:column | column name | column-valued term map | 1
rr:datatype | specified datatype | term map | 0…1!
rr:constant | constant value | constant-valued term map | 1
rr:graph | constant shortcut property | subject map, predicate-object map | 0…∞
rr:graphMap | graph map | (shared with rr:graph) | (shared with rr:graph)
rr:inverseExpression | inverse expression | term map | 0…1!
rr:joinCondition | join condition | referencing object map | 0…∞
rr:language | specified language tag | term map | 0…1!
rr:logicalTable | logical table | triples map | 1
rr:object | constant shortcut property | predicate-object map | 1…∞
rr:objectMap | object map, referencing object map | (shared with rr:object) | (shared with rr:object)
rr:parent | parent column | join condition | 1
rr:parentTriplesMap | parent triples map | referencing object map | 1
rr:predicate | constant shortcut property | predicate-object map | 1…∞
rr:predicateMap | predicate map | (shared with rr:predicate) | (shared with rr:predicate)
rr:predicateObjectMap | predicate-object map | triples map | 0…∞
rr:sqlQuery | SQL query | R2RML view | 1
rr:sqlVersion | SQL version identifier | R2RML view | 0…∞
rr:subject | constant shortcut property | triples map | 0…1
rr:subjectMap | subject map | (shared with rr:subject) | (shared with rr:subject)
rr:tableName | table or view name | SQL base table or view | 1
rr:template | string template | template-valued term map | 1
rr:termType | term type | term map | 0…1!

B.3 Other Terms

Term | Denotes | Used with property
rr:defaultGraph | default graph | rr:graph
rr:SQL2008 | Core SQL 2008 | rr:sqlVersion
rr:IRI | IRI | rr:termType
rr:BlankNode | blank node | rr:termType
rr:Literal | literal | rr:termType

C. References

C.1 Normative References

[DM]
A Direct Mapping of Relational Data to RDF, Alexandre Bertails, Marcelo Arenas, Eric Prud'hommeaux, Juan Sequeda, Editors. World Wide Web Consortium, 27 September 2012. This version is http://www.w3.org/TR/2012/REC-rdb-direct-mapping-20120927/. The latest version is http://www.w3.org/TR/rdb-direct-mapping/.
[RDF]
Resource Description Framework (RDF): Concepts and Abstract Syntax, Graham Klyne, Jeremy J. Carroll, Editors. World Wide Web Consortium, 10 February 2004. This version is http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/. The latest version is http://www.w3.org/TR/rdf-concepts/.
[RFC2119]
Key words for use in RFCs to Indicate Requirement Levels, S. Bradner, March 1997. Internet RFC 2119, http://tools.ietf.org/html/rfc2119.
[RFC3629]
UTF-8, a transformation format of ISO 10646, F. Yergeau. November 2003. Internet RFC 3629, http://tools.ietf.org/html/rfc3629.
[RFC3986]
Uniform Resource Identifier (URI): Generic Syntax, T. Berners-Lee, R. Fielding, L. Masinter. January 2005. Internet RFC 3986, http://tools.ietf.org/html/rfc3986.
[RFC3987]
Internationalized Resource Identifiers (IRIs), M. Duerst, M. Suignard. January 2005. Internet RFC 3987, http://tools.ietf.org/html/rfc3987.
[SPARQL]
SPARQL Query Language for RDF, Eric Prud'hommeaux, Andy Seaborne, Editors. World Wide Web Consortium, 15 January 2008. This version is http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/. The latest version is http://www.w3.org/TR/rdf-sparql-query/.
[SQL1]
ISO/IEC 9075-1:2008 SQL - Part 1: Framework (SQL/Framework). International Organization for Standardization, 27 January 2009.
[SQL2]
ISO/IEC 9075-2:2008 SQL - Part 2: Foundation (SQL/Foundation). International Organization for Standardization, 27 January 2009.
[TURTLE]
Turtle - Terse RDF Triple Language, Eric Prud'hommeaux, Gavin Carothers. World Wide Web Consortium, 10 July 2012. This version is http://www.w3.org/TR/2012/WD-turtle-20120710/. The latest version is http://www.w3.org/TR/turtle/. This document is work in progress.
[XMLSCHEMA2]
W3C XML Schema Definition Language (XSD) 1.1 Part 2: Datatypes, David Peterson, Shudi Gao, Ashok Malhotra, C. M. Sperberg-McQueen, Henry S. Thompson. World Wide Web Consortium, 5 April 2012. This version is http://www.w3.org/TR/2012/REC-xmlschema11-2-20120405/. The latest version is http://www.w3.org/TR/xmlschema11-2/.

C.2 Other References

[SQL14]
ISO/IEC 9075-14:2008 SQL - Part 14: XML-Related Specifications (SQL/XML). International Organization for Standardization, 27 January 2009.
[SQLIRIS]
SQL Version IRIs, Editors of the W3C Semantic Web Standards wiki. The latest version is http://www.w3.org/2001/sw/wiki/RDB2RDF/SQL_Version_IRIs. This is a public wiki page.
[TC]
R2RML and Direct Mapping Test Cases, Boris Villazón-Terrazas, Michael Hausenblas, Editors. World Wide Web Consortium, 14 August 2012. This version is http://www.w3.org/TR/2012/NOTE-rdb2rdf-test-cases-20120814/. The latest version is http://www.w3.org/TR/rdb2rdf-test-cases/.
[UCNR]
Use Cases and Requirements for Mapping Relational Databases to RDF, Eric Prud'hommeaux, Michael Hausenblas, Editors. World Wide Web Consortium, 8 June 2010. This version is http://www.w3.org/TR/2010/WD-rdb2rdf-ucr-20100608/. The latest version is http://www.w3.org/TR/rdb2rdf-ucr/. This document is work in progress.

D. Acknowledgements (Informative)

The Editors would like to give special thanks to the following contributors: David McNeil greatly improved the quality of the specification with detailed reviews and comments. Nuno Lopes and Eric Prud'hommeaux contributed to the design of the mapping from SQL data values to RDF literals. Eric also worked on the mechanism for SQL compatibility. Boris Villazón-Terrazas drew the diagrams throughout the text, and kept them up-to-date throughout many iterations.

In addition, the Editors gratefully acknowledge contributions from: Marcelo Arenas, Sören Auer, Samir Batla, Alexander de Leon, Orri Erling, Lee Feigenbaum, Enrico Franconi, Howard Greenblatt, Wolfgang Halb, Harry Halpin, Michael Hausenblas, Patrick Hayes, Ivan Herman, Nophadol Jekjantuk, Li Ma, Nan Ma, Ashok Malhotra, Ivan Mikhailov, Percy Enrique Rivera Salas, Juan Sequeda, Ben Szekely, Ted Thibodeau, and Edward Thomas.

