SPARQL

From Wikipedia, the free encyclopedia

RDF query language

SPARQL
Paradigm	Query language
Developer	W3C
First appeared	15 January 2008; 18 years ago (2008-01-15)

Stable release	1.1 / 21 March 2013; 12 years ago (2013-03-21)

Website	www.w3.org/TR/sparql11-query/
Major implementations	Apache Jena,^[1] OpenLink Virtuoso^[1]

SPARQL (pronounced "sparkle", a recursive acronym^[2] for SPARQL Protocol and RDF Query Language) is an RDF query language, that is, a semantic query language for databases, able to retrieve and manipulate data stored in Resource Description Framework (RDF) format.^[3]^[4] It was made a standard by the RDF Data Access Working Group (DAWG) of the World Wide Web Consortium (W3C), and is recognized as one of the key technologies of the semantic web. On 15 January 2008, SPARQL 1.0 was acknowledged by the W3C as an official recommendation,^[5]^[6] followed by SPARQL 1.1 on 21 March 2013.^[7]

The Wikidata Query Service can be used to query data from Wikidata using SPARQL.^[8]^[9]

SPARQL allows for a query to consist of triple patterns, conjunctions, disjunctions, and optional patterns.^[10]

Implementations for multiple programming languages exist.^[11] Some tools, for example ViziQuer, allow one to connect to a SPARQL endpoint and semi-automatically construct a SPARQL query for it.^[12] In addition, tools exist to translate SPARQL queries to other query languages, for example to SQL^[13] and to XQuery.^[14]

Features

SPARQL allows users to write queries that follow the RDF specification of the W3C. Thus, the entire dataset is a set of "subject-predicate-object" triples. Subjects are URIs or blank nodes and predicates are always URIs, while objects can be URIs, blank nodes, or literal values. This single physical schema of 3 "columns" is hypernormalized: what would be 1 relational record with (for example) 4 columns becomes 4 triples, with the subject repeated in each, the predicate acting as the column name, and the object holding the column value. Although this seems unwieldy, the SPARQL syntax offers these features:

1. Subjects and objects can each be used to find the other, including transitively.

Below is a set of triples. It should be clear that ex:sw001 and ex:sw002 link to ex:sw003, which itself has links:

    ex:sw001 ex:linksWith ex:sw003 .
    ex:sw002 ex:linksWith ex:sw003 .
    ex:sw003 ex:linksWith ex:sw004 , ex:sw006 .
    ex:sw004 ex:linksWith ex:sw005 .

In SPARQL, the first time a variable is encountered in the expression pipeline, it is populated with results. The second and subsequent times it is seen, it is used as an input. If we assign ("bind") the URI ex:sw003 to the ?targets variable, then it drives a result into ?src; this tells us all the things that link to ex:sw003 (upstream dependency):

    SELECT *
    WHERE {
      BIND(ex:sw003 AS ?targets)
      ?src ex:linksWith ?targets .   # ?src populated with ex:sw001, ex:sw002
    }

But with a simple switch of the binding variable, the behavior is reversed. This will produce all the things upon which ex:sw003 depends (downstream dependency):

    SELECT *
    WHERE {
      BIND(ex:sw003 AS ?src)
      ?src ex:linksWith ?targets .   # NOTICE!  No syntax change! ?targets populated with ex:sw004, ex:sw006
    }

Even more attractive is that we can easily instruct SPARQL to transitively follow the path:

    SELECT *
    WHERE {
      BIND(ex:sw003 AS ?src)
      # Note the +; now SPARQL will also find ex:sw005 transitively via ex:sw004;
      # ?targets is ex:sw004, ex:sw005, ex:sw006
      ?src ex:linksWith+ ?targets .
    }
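The effect of the + path operator can be approximated outside a SPARQL engine. The following Python sketch is only an illustration, not how any particular engine works: it stores the example triples as tuples and uses a hypothetical helper, one_or_more, to collect every node reachable through one or more ex:linksWith edges.

```python
from collections import deque

# The example triples from the text, written as (subject, predicate, object).
TRIPLES = [
    ("ex:sw001", "ex:linksWith", "ex:sw003"),
    ("ex:sw002", "ex:linksWith", "ex:sw003"),
    ("ex:sw003", "ex:linksWith", "ex:sw004"),
    ("ex:sw003", "ex:linksWith", "ex:sw006"),
    ("ex:sw004", "ex:linksWith", "ex:sw005"),
]


def one_or_more(triples, start, predicate):
    """Return every node reachable from `start` via one or more `predicate` edges."""
    reached = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            # Follow each outgoing edge once; `reached` also guards against cycles.
            if s == node and p == predicate and o not in reached:
                reached.add(o)
                queue.append(o)
    return reached


targets = one_or_more(TRIPLES, "ex:sw003", "ex:linksWith")
print(sorted(targets))  # ['ex:sw004', 'ex:sw005', 'ex:sw006']
```

The breadth-first search visits each node at most once, so the sketch terminates even if the links form a cycle.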

Bound variables can therefore also hold lists of values, and these are operated upon without complicated syntax. The effect is similar to the following pseudocode:

    If ?S is bound to (ex:A, ex:B) and ?O is UNbound then
    ?S ex:linksWith ?O
    behaves like a forward chain:

        for each s in ?S:
            for each fetch(s, ex:linksWith):
                capture o
                append o to ?O

    If ?O is bound to (ex:A, ex:B) and ?S is UNbound then
    ?S ex:linksWith ?O
    behaves like a backward chain:

        for each o in ?O:
            for each fetch(ex:linksWith, o):
                capture s
                append s to ?S
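The pseudocode above can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions: the triples are held in an in-memory list, and the function names forward and backward are hypothetical, not part of SPARQL or of any library.

```python
# The example triples from the text, written as (subject, predicate, object).
TRIPLES = [
    ("ex:sw001", "ex:linksWith", "ex:sw003"),
    ("ex:sw002", "ex:linksWith", "ex:sw003"),
    ("ex:sw003", "ex:linksWith", "ex:sw004"),
    ("ex:sw003", "ex:linksWith", "ex:sw006"),
    ("ex:sw004", "ex:linksWith", "ex:sw005"),
]


def forward(triples, subjects, predicate):
    """?S is bound, ?O is unbound: collect the objects linked from each subject."""
    return [o
            for bound_s in subjects
            for s, p, o in triples
            if s == bound_s and p == predicate]


def backward(triples, objects, predicate):
    """?O is bound, ?S is unbound: collect the subjects that link to each object."""
    return [s
            for bound_o in objects
            for s, p, o in triples
            if o == bound_o and p == predicate]


print(forward(TRIPLES, ["ex:sw003"], "ex:linksWith"))   # ['ex:sw004', 'ex:sw006']
print(backward(TRIPLES, ["ex:sw003"], "ex:linksWith"))  # ['ex:sw001', 'ex:sw002']
```

Passing several bound values, for example forward(TRIPLES, ["ex:sw003", "ex:sw004"], "ex:linksWith"), simply concatenates the matches for each value, which mirrors the "for each" loops of the pseudocode.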

2. SPARQL expressions are a pipeline

SQL queries are often composed from subqueries and CTEs; a SPARQL query reads much more like a pipeline in MongoDB or Apache Spark. Conceptually, expressions are evaluated in the order they are declared, including filtering and joining of data, although a query engine may reorder them internally as long as the results are unchanged. The programming model resembles a SQL statement with multiple WHERE clauses. The combination of list-aware subjects and objects plus a pipeline approach can yield extremely expressive queries spanning many different domains of data. JOIN as used in an RDBMS, and the dynamics of the JOIN (e.g. which column in which table is suitable to join to another, inner vs. outer, etc.), are not relevant in SPARQL, which is in some ways simpler, because an object that is a URI rather than a literal can implicitly be used only to find a subject. Here is a more comprehensive example that illustrates the pipeline using some syntax shortcuts.

    #  SELECT only the terminal values we need.  If we did SELECT * (which
    #  is not necessarily bad), then "intermediate" variables ?vendor and ?owner
    #  would be part of the output.
    SELECT ?slbl ?vlbl ?lei ?lname
    WHERE {
      # ?sw is unbound.  Set predicate to rdf:type and object to ex:Software
      # and collect all software instances.  At the same time, pull the software
      # label (a terse description) and populate ?slbl and also capture the
      # vendor object into ?vendor.
      ?sw rdf:type ex:Software ;
          rdfs:label ?slbl ;
          ex:vendor ?vendor .

      # The above in "longhand" reveals the binding process:
      #  ?sw rdf:type  ex:Software .  # ?sw UNBOUND; is set here
      #  ?sw rdfs:label ?slbl . # ?sw bound; set unbound ?slbl
      #  ?sw ex:vendor ?vendor .  # ?sw still bound; set ?vendor

      # Exclude open source software.  Note ex:oss is a URI because it is
      # an UNquoted string:
      FILTER(?vendor NOT IN (ex:oss))

      # Next, dive into ?vendor object and extract legal entity identifier
      # and owner of the data -- where owner is also an object.  ?vendor is
      # bound; ?vlbl, ?lei, and ?owner are unbound and will be populated:
      ?vendor rdfs:label ?vlbl ;
              ex:LEI ?lei ;
              ex:owner ?owner .

      #  Lastly, from owner object, capture last name:
      ?owner ex:lastname ?lname .
    }

Unlike relational databases, the object column is heterogeneous: the object data type, if not a URI, is usually implied (or specified in the ontology) by the predicate value. Literal nodes carry type information consistent with the underlying XSD namespace, including signed and unsigned short and long integers, single- and double-precision floats, datetime, penny-precise decimal, Boolean, and string. Triple store implementations on traditional relational databases will typically store the value as a string, with a fourth column identifying the real type. Polymorphic databases such as MongoDB and SQLite can store the native value directly in the object field.

Thus, SPARQL provides a full set of analytic query operations such as JOIN, SORT, and AGGREGATE for data whose schema is intrinsically part of the data rather than requiring a separate schema definition. However, schema information (the ontology) is often provided externally, to allow joining of different datasets unambiguously. In addition, SPARQL provides specific graph traversal syntax for data that can be thought of as a graph.

The example below demonstrates a simple query that leverages the ontology definition foaf ("friend of a friend").

Specifically, the following query returns names and emails of every person in the dataset:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
      ?person a foaf:Person .
      ?person foaf:name ?name .
      ?person foaf:mbox ?email .
    }

This query joins all of the triples with a matching subject, where the subject's type (the predicate "a" is shorthand for rdf:type) is foaf:Person, and the person has one or more names (foaf:name) and mailboxes (foaf:mbox).

For the sake of readability, the author of this query chose to reference the subject using the variable name "?person". Since the first element of the triple is always the subject, the author could have just as easily used any variable name, such as "?subj" or "?x". Whatever name is chosen, it must be the same on each line of the query to signify that the query engine is to join triples with the same subject.

The result of the join is a set of rows: ?person, ?name, ?email. This query returns the ?name and ?email because ?person is often a complex URI rather than a human-friendly string. Note that any ?person may have multiple mailboxes, so in the returned set, a ?name row may appear multiple times, once for each mailbox, duplicating the ?name.
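The join described above can be imitated with a few lines of Python. This is a minimal sketch under stated assumptions, not a SPARQL implementation: triples and patterns are plain tuples of strings, any term starting with "?" is treated as a variable, and the sample people (ex:alice, ex:bob) and the helper names match_pattern and solve are invented for illustration.

```python
def match_pattern(pattern, triple, binding):
    """Unify one triple pattern with one triple; return the extended binding or None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable either takes the value or must agree with its earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(patterns, triples, binding=None):
    """Yield every binding that satisfies all patterns (a nested-loop join)."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        extended = match_pattern(first, triple, binding)
        if extended is not None:
            yield from solve(rest, triples, extended)


# Hypothetical sample data: Alice has two mailboxes, Bob has none.
TRIPLES = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:alice", "foaf:mbox", "mailto:alice@work.example"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
]

PATTERNS = [
    ("?person", "rdf:type", "foaf:Person"),
    ("?person", "foaf:name", "?name"),
    ("?person", "foaf:mbox", "?email"),
]

rows = [(b["?name"], b["?email"]) for b in solve(PATTERNS, TRIPLES)]
for row in rows:
    print(row)
```

With this sample data the name "Alice" appears in two rows, one per mailbox, and Bob does not appear at all because no foaf:mbox triple matches for him.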

An important consideration in SPARQL is that when lookup conditions are not met in the pipeline for terminal entities like ?email, the whole row is excluded, unlike SQL, where typically a null column is returned. The query above will return only those ?person where both at least one ?name and at least one ?email can be found. If a ?person had no email, they would be excluded. To align the output with that expected from an equivalent SQL query, the OPTIONAL keyword is required:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name ?email
    WHERE {
      ?person a foaf:Person .
      OPTIONAL {
        ?person foaf:name ?name .
        ?person foaf:mbox ?email .
      }
    }
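OPTIONAL behaves like a left join: a solution is extended when the optional group matches and kept unchanged when it does not. The Python sketch below illustrates this on invented sample data (ex:alice, ex:bob); the helpers match_pattern, solve, and optional are hypothetical names, and the tuple-based matching is a simplification rather than a real SPARQL engine.

```python
def match_pattern(pattern, triple, binding):
    """Unify one triple pattern with one triple; return the extended binding or None."""
    new = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def solve(patterns, triples, binding=None):
    """Yield every binding that satisfies all patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    for triple in triples:
        extended = match_pattern(patterns[0], triple, binding)
        if extended is not None:
            yield from solve(patterns[1:], triples, extended)


def optional(solutions, patterns, triples):
    """Left join: extend each solution if the group matches, else keep it as is."""
    for binding in solutions:
        extended = list(solve(patterns, triples, binding))
        if extended:
            yield from extended
        else:
            yield binding


# Hypothetical sample data: Bob has a name but no mailbox.
TRIPLES = [
    ("ex:alice", "rdf:type", "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("ex:bob", "rdf:type", "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
]
PERSON = [("?person", "rdf:type", "foaf:Person")]
NAME = [("?person", "foaf:name", "?name")]
MBOX = [("?person", "foaf:mbox", "?email")]


def rows(solutions):
    """Project ?name and ?email; unbound variables become None."""
    return [(b.get("?name"), b.get("?email")) for b in solutions]


# As in the query above: name and mailbox share one OPTIONAL group.
grouped = rows(optional(solve(PERSON, TRIPLES), NAME + MBOX, TRIPLES))
# Variant: the name is required and only the mailbox is optional.
name_required = rows(optional(solve(PERSON + NAME, TRIPLES), MBOX, TRIPLES))

print(grouped)
print(name_required)
```

Because the name and mailbox patterns sit in the same OPTIONAL group, the group matches as a whole or not at all: Bob is returned with neither a name nor an email in the first result. Keeping the name pattern outside the OPTIONAL group, as in the second variant, returns his name with an unbound email.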

This query can be distributed to multiple SPARQL endpoints (services that accept SPARQL queries and return results), computed, and the results gathered, a procedure known as federated query.
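As a rough client-side illustration of "distribute, compute, gather", the Python sketch below builds (without sending) an HTTP GET request of the kind defined by the SPARQL 1.1 Protocol and merges result documents in the SPARQL JSON results format. The endpoint URLs are placeholders, the function names are hypothetical, and the canned JSON stands in for real responses; within SPARQL 1.1 itself, federation is expressed with the SERVICE keyword rather than in client code.

```python
import json
import urllib.parse
import urllib.request

QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person a foaf:Person ; foaf:name ?name . }
"""

# Placeholder endpoints; substitute real SPARQL endpoint URLs before sending.
ENDPOINTS = [
    "https://endpoint-a.example/sparql",
    "https://endpoint-b.example/sparql",
]


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request for `query`."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(results_json):
    """Flatten a SPARQL JSON results document into {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def gather(result_documents):
    """Concatenate the rows returned by several endpoints."""
    merged = []
    for text in result_documents:
        merged.extend(parse_bindings(text))
    return merged


requests = [build_request(endpoint, QUERY) for endpoint in ENDPOINTS]

# Canned responses standing in for what two endpoints might return.
RESPONSE_A = '{"head": {"vars": ["name"]}, "results": {"bindings": [{"name": {"type": "literal", "value": "Alice"}}]}}'
RESPONSE_B = '{"head": {"vars": ["name"]}, "results": {"bindings": [{"name": {"type": "literal", "value": "Bob"}}]}}'

print(gather([RESPONSE_A, RESPONSE_B]))  # [{'name': 'Alice'}, {'name': 'Bob'}]
```

To actually run the query, each request object could be passed to urllib.request.urlopen and the response body handed to parse_bindings; that step is left out here so the sketch runs without network access.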

Whether in a federated manner or locally, additional triple definitions in the query could allow joins to different subject types, such as automobiles, to allow simple queries, for example, to return a list of names and emails for people who drive automobiles with a high fuel efficiency.

Query forms

In the case of queries that read data from the database, the SPARQL language specifies four different query variations for different purposes.

SELECT query: Used to extract raw values from a SPARQL endpoint; the results are returned in a table format.
CONSTRUCT query: Used to extract information from the SPARQL endpoint and transform the results into valid RDF.
ASK query: Used to provide a simple True/False result for a query on a SPARQL endpoint.
DESCRIBE query: Used to extract an RDF graph from the SPARQL endpoint; the content is left to the endpoint to decide, based on what the maintainer deems useful information.

Each of these query forms takes a WHERE block to restrict the query, although, in the case of the DESCRIBE query, the WHERE is optional.
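The difference between the forms lies in what is done with the solutions of the WHERE block. The Python sketch below illustrates this for SELECT, ASK, and CONSTRUCT, assuming the solutions are already available as a list of dictionaries; the data, the ex: names, and the function names are invented for illustration. DESCRIBE is omitted because its output is decided by the endpoint.

```python
# Hypothetical solutions of a WHERE block: one dict of variable bindings per match.
SOLUTIONS = [
    {"?x": "ex:cairo", "?capital": "Cairo", "?country": "Egypt"},
    {"?x": "ex:nairobi", "?capital": "Nairobi", "?country": "Kenya"},
]


def select(solutions, variables):
    """SELECT: project the chosen variables into a table (a list of rows)."""
    return [tuple(row.get(v) for v in variables) for row in solutions]


def ask(solutions):
    """ASK: report only whether at least one solution exists."""
    return len(solutions) > 0


def construct(solutions, template):
    """CONSTRUCT: instantiate a triple template once per solution."""
    graph = set()
    for row in solutions:
        for pattern in template:
            triple = tuple(row.get(term, term) for term in pattern)
            # Skip triples that still contain an unbound variable.
            if not any(term.startswith("?") for term in triple):
                graph.add(triple)
    return graph


table = select(SOLUTIONS, ["?capital", "?country"])
exists = ask(SOLUTIONS)
graph = construct(SOLUTIONS, [("?x", "ex:label", "?capital")])

print(table)
print(exists)
print(sorted(graph))
```

SELECT yields rows, ASK yields a single Boolean, and CONSTRUCT yields a new set of triples, which is why only the CONSTRUCT (and DESCRIBE) results are themselves RDF.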

SPARQL 1.1 specifies a language for updating the database with several new query forms.^[15]

Example

Another SPARQL query example that models the question "What are all the country capitals in Africa?":

    PREFIX ex: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
      ?x ex:cityname ?capital ;
         ex:isCapitalOf ?y .
      ?y ex:countryname ?country ;
         ex:isInContinent ex:Africa .
    }

Variables are indicated by a ? or $ prefix. Bindings for ?capital and ?country will be returned. When a triple ends with a semicolon, the subject from this triple will implicitly complete the following pair to an entire triple. So, for example, ex:isCapitalOf ?y is short for ?x ex:isCapitalOf ?y.

The SPARQL query processor will search for sets of triples that match these four triple patterns, binding the variables in the query to the corresponding parts of each triple. Important to note here is the "property orientation" (class matches can be conducted solely through class attributes or properties; see Duck typing).

To make queries concise, SPARQL allows the definition of prefixes and base URIs in a fashion similar to Turtle. In this query, the prefix "ex" stands for "http://example.com/exampleOntology#".

SPARQL has native dateTime operations as well. Here is a query that will return all pieces of software where the EOL date is at least 1000 days after the release date and the release year is 2020 or later:

    SELECT ?lbl ?version ?released ?eol ?duration
    WHERE {
      ?software a ex:Software ;
                rdfs:label ?lbl ;
                ex:EOL ?eol ;             # is xsd:dateTime
                ex:version ?version ;     # string
                ex:released ?released .   # is xsd:dateTime

      # After this stage, ?duration is bound as xsd:duration type
      # (in Java implementations, org.apache.jena.datatypes.xsd.XSDDuration)
      # and is available in the pipeline, in the SELECT, and in
      # GROUP or ORDER operators, etc.:
      BIND(?eol - ?released AS ?duration)

      # toString representation of Duration is of format PnYnMnDTnHnMnS.
      # We must use ^^ casting to tell the engine this is to be treated as a duration.
      # SPARQL (and RDF) literal syntax has built-in numeric shortcuts to simplify
      # expressions without casts:
      #   16         xsd:integer   java.lang.Integer
      #   16.7       xsd:decimal   java.math.BigDecimal   preserves precision
      #   16.700     xsd:decimal   java.math.BigDecimal   preserves precision
      #   1.0632e6   xsd:double    java.lang.Double   true double float; be careful
      #   2147483649 xsd:integer   java.lang.Long  >32 bit int automatically detected
      #
      # Most castings work as expected e.g. "16.700"^^xsd:double.
      # Note in the FILTER below we use the shortcut for integer 2020:
      FILTER(?duration >= "P1000D"^^xsd:duration && YEAR(?released) >= 2020)
    }
    ORDER BY DESC(?duration)
    LIMIT 5
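For readers more familiar with general-purpose languages, the same filter, sort, and limit steps can be sketched in Python with the standard datetime module. The software records below are invented for illustration, and the sketch only mirrors the logic of the query above; it does not involve a SPARQL engine.

```python
from datetime import datetime, timedelta

# Hypothetical rows: (label, version, released, eol).
SOFTWARE = [
    ("Alpha", "1.0", datetime(2020, 1, 1), datetime(2023, 1, 1)),
    ("Beta", "2.3", datetime(2021, 6, 1), datetime(2022, 6, 1)),
    ("Gamma", "0.9", datetime(2019, 3, 1), datetime(2024, 3, 1)),
    ("Delta", "4.1", datetime(2020, 5, 1), datetime(2025, 5, 1)),
]

MIN_SUPPORT = timedelta(days=1000)  # counterpart of "P1000D"^^xsd:duration


def long_lived(rows, limit=5):
    """Keep rows released in 2020 or later with at least 1000 days of support."""
    selected = []
    for label, version, released, eol in rows:
        duration = eol - released  # counterpart of BIND(?eol - ?released AS ?duration)
        if duration >= MIN_SUPPORT and released.year >= 2020:
            selected.append((label, version, released, eol, duration))
    # Counterpart of ORDER BY DESC(?duration) LIMIT 5.
    selected.sort(key=lambda row: row[4], reverse=True)
    return selected[:limit]


result = long_lived(SOFTWARE)
for label, version, released, eol, duration in result:
    print(label, version, duration.days)
```

With this sample data, Beta is dropped because its support period is only 365 days, and Gamma is dropped because it was released in 2019.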

Extensions

GeoSPARQL defines filter functions for geographic information system (GIS) queries using well-understood OGC standards (GML, WKT, etc.).

SPARUL is another extension to SPARQL. It enables the RDF store to be updated with this declarative query language, by adding INSERT and DELETE methods.

XSPARQL is an integrated query language combining XQuery with SPARQL to query both XML and RDF data sources at once.^[16]

Implementations

Main article: List of SPARQL implementations

Open source, reference SPARQL implementations

See List of SPARQL implementations for more comprehensive coverage, including triplestores, APIs, and other storages that have implemented the SPARQL standard.

References

1. Hebeler, John; Fisher, Matthew; Blace, Ryan; Perez-Lopez, Andrew (2009). Semantic Web Programming. Indianapolis: John Wiley & Sons, Inc. p. 406. ISBN 978-0-470-41801-7.
2. Beckett, Dave (6 October 2011). "What does SPARQL stand for?". semantic-web@w3.org.
3. Jim Rapoza (2 May 2006). "SPARQL Will Make the Web Shine". eWeek. Retrieved 17 January 2007.
4. Segaran, Toby; Evans, Colin; Taylor, Jamie (2009). Programming the Semantic Web. O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. p. 84. ISBN 978-0-596-15381-6.
5. "W3C Semantic Web Activity News – SPARQL is a Recommendation". W3.org. 15 January 2008. Archived from the original on 20 January 2008. Retrieved 1 October 2009.
6. "XML and Semantic Web W3C Standards Timeline" (PDF). 4 February 2012. Retrieved 27 November 2013.
7. Herman, Ivan (21 March 2013). "Eleven SPARQL 1.1 Specifications are W3C Recommendations". W3C blog. Retrieved 4 October 2025.
8. "SPINACH: SPARQL-Based Information Navigation for Challenging Real-World Questions". Retrieved 15 April 2025.
9. "Wikidata Query Service - Wikitech". wikitech.wikimedia.org. Retrieved 15 April 2025.
10. "XML and Web Services in the News". xml.org. 6 October 2006. Retrieved 17 January 2007.
11. "SparqlImplementations – ESW Wiki". Esw.w3.org. Retrieved 1 October 2009.
12. "ViziQuer a tool to construct SPARQL queries automatically". lumii.lv. Retrieved 25 February 2011.
13. "D2R Server". Retrieved 4 February 2012.
14. "SPARQL2XQuery Framework". Retrieved 4 February 2012.
15. Yu, Liyang (2014). A Developer's Guide to the Semantic Web. Springer. p. 308. ISBN 978-3-662-43796-4.
16. "XSPARQL published as a W3C Submission". W3.org. 23 June 2009. Retrieved 22 May 2022.

External links

Wikidata Query Service, with example SPARQL queries
Wikidata Query Service Tutorial
DBpedia
W3C Data Activity Blog
W3C SPARQL 1.1 Working Group - closed - mailing lists and archives, was RDF Data Access Working Group
SPARQL 1.1 Recommendation
SPARQL 1.0 Query language (legacy)
SPARQL 1.0 Protocol (legacy)
SPARQL 1.0 Query XML Results Format (legacy)
SPARQL2XQuery Mappings between OWL-RDF/S & XML Schemas, and XML Schema to OWL Transformation.
SPARQL Syntax Expressions in the ARQ query engine
James (8 September 2011). "DAWG Test Suite for SPOCQ". Dydra. Archived from the original on 7 June 2015. Retrieved 2 December 2014.
James (8 September 2011). "RSpec Code Examples / Results: 425 examples, 1 failure / Finished in 287.385157145 seconds". Dydra. Archived from the original on 11 December 2011. Retrieved 2 December 2014.

Query languages
In current use	.QL ALPHA CQL Cypher DAX DMX Datalog GraphQL Graph Query Language Gremlin ISBL LDAP LINQ MQL MDX OQL OCL QUEL RDF SMARTS SPARQL SQL XQuery XPath YQL
Proprietary	YQL LINQ
Superseded	CODASYL
