Please refer to the errata for this document, which may include some normative corrections.
The English version of this specification is the only normative version. Non-normative translations may also be available.
Copyright © 2012-2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.
Provenance is information about entities, activities, and people involved in producing a piece of data or thing, which can be used to form assessments about its quality, reliability or trustworthiness. PROV-DM is the conceptual data model that forms a basis for the W3C provenance (PROV) family of specifications. PROV-DM distinguishes core structures, forming the essence of provenance information, from extended structures catering for more specific uses of provenance. PROV-DM is organized in six components, respectively dealing with: (1) entities and activities, and the time at which they were created, used, or ended; (2) derivations of entities from entities; (3) agents bearing responsibility for entities that were generated and activities that happened; (4) a notion of bundle, a mechanism to support provenance of provenance; (5) properties to link entities that refer to the same thing; and (6) collections forming a logical structure for its members.
To provide examples of the PROV data model, the PROV notation (PROV-N) is introduced: aimed at human consumption, PROV-N allows serializations of PROV instances to be created in a compact manner. PROV-N facilitates the mapping of the PROV data model to concrete syntax, and is used as the basis for a formal semantics of PROV. The purpose of this document is to define the PROV-N notation.
The PROV Document Overview describes the overall state of PROV, and should be read before other PROV documents.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This document has been reviewed by W3C Members, by software developers, and by other W3C groups and interested parties, and is endorsed by the Director as a W3C Recommendation. It is a stable document and may be used as reference material or cited from another document. W3C's role in making the Recommendation is to draw attention to the specification and to promote its widespread deployment. This enhances the functionality and interoperability of the Web.
This document was published by the Provenance Working Group as a Recommendation. If you wish to make comments regarding this document, please send them to public-prov-comments@w3.org (subscribe, archives). All comments are welcome.
This document was produced by a group operating under the 5 February 2004 W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
Provenance is a record that describes the people, institutions, entities, and activities, involved in producing, influencing, or delivering a piece of data or a thing in the world. Two companion specifications respectively define PROV-DM, a data model for provenance, allowing provenance descriptions to be expressed [PROV-DM] and a set of constraints that provenance descriptions are expected to satisfy [PROV-CONSTRAINTS].
This document introduces the PROV-N grammar along with examples of its usage.
Its target audience is twofold:
The document and expression nonterminals are a useful entry point into the grammar.
For the purpose of compliance, all sections of this document are normative, except Appendix A, Appendix B, and Appendix C.2.
This document is structured as follows.
Section 2 provides general considerations about the PROV-N grammar.
Section 3 presents the grammar of all expressions of the language grouped according to the PROV data model components.
Section 4 defines the grammar of document, a house-keeping construct of PROV-N capable of packaging up PROV-N expressions and namespace declarations.
Section 5 defines the extensibility mechanism for the PROV-N notation.
Section 6 defines the media type for the PROV-N notation.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].
The following namespace prefixes are used throughout this document.
prefix | namespace uri | definition |
prov | http://www.w3.org/ns/prov# | The PROV namespace (see Section 3.7.4) |
xsd | http://www.w3.org/2000/10/XMLSchema# | XML Schema Namespace [XMLSCHEMA11-2] |
(others) | (various) | All other namespace prefixes are used in examples only. In particular, URIs starting with "http://example.com" represent some application-dependent URI [RFC3986] |
For convenience, all productions presented in this document have been grouped in a separate file.
PROV-N adopts a functional-style syntax consisting of a predicate name and an ordered list of terms.
All PROV data model types have an identifier. Furthermore, some expressions also admit additional elements that further characterize them.
The following expressions should be read as "entity e1" and "activity a2, which occurred between 2011-11-16T16:00:00 and 2011-11-16T16:00:01", respectively.
entity(e1)
activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
All PROV data model relations involve two primary elements, the subject and the object, in this order. Furthermore, some expressions also admit additional elements that further characterize them.
The following expression should be read as "e2 was derived from e1". Here e2 is the subject, and e1 is the object.
wasDerivedFrom(e2, e1)
The following expression expands the above derivation relation by providing additional elements: the optional activity a, the generation g2, and the usage u1:
wasDerivedFrom(e2, e1, a, g2, u1)
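The functional-style syntax described above (a predicate name followed by an ordered list of terms) can be illustrated with a minimal Python sketch. The function name split_expression is hypothetical and not part of PROV-N. The sketch assumes the simplest case only: no optional identifier, no attribute list, and no commas inside terms. It is not a conformant parser.

```python
def split_expression(text):
    """Split 'name(t1, t2, ...)' into (name, [terms]).

    Simplified sketch: assumes no attribute list, no optional
    identifier, and no commas inside individual terms.
    """
    text = text.strip()
    open_paren = text.index("(")  # raises ValueError if '(' is missing
    if not text.endswith(")"):
        raise ValueError("expression must end with ')'")
    name = text[:open_paren].strip()
    body = text[open_paren + 1:-1]
    # Position matters in PROV-N, so the order of terms is preserved.
    terms = [term.strip() for term in body.split(",")] if body.strip() else []
    return name, terms


name, terms = split_expression("wasDerivedFrom(e2, e1, a, g2, u1)")
subject, obj = terms[0], terms[1]  # the two primary elements, in order
```

Because the subject and the object always come first and in this order, they can be read off by position, as the last line does.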
The grammar is specified using a subset of the Extended Backus-Naur Form (EBNF) notation, as defined in Extensible Markup Language (XML) 1.1 [XML11] section 6 Notation.
The text below provides an introduction to the EBNF notation used in this document.
EBNF specifies a series of production rules (production). A production rule in the grammar defines a symbol expr (nonterminal symbol) using the following form:
expr ::= term
Symbols are written with an initial capital letter if they are the start symbol of a regular language, otherwise with an initial lowercase letter. A production rule in the grammar defines a symbol <TERMINAL> (terminal symbol) using the following form:
<TERMINAL> ::= term
Within the term on the right-hand side of a rule, the following terms are used to match strings of one or more characters:
expr: matches production for nonterminal symbol expr.
TERMINAL: matches production for terminal symbol TERMINAL.
"abc": matches the literal string inside the double quotes.
(term)?: optional, matches term or nothing.
(term)+: matches one or more occurrences of term.
(term)*: matches zero or more occurrences of term.
(term | term): matches one of the two terms.
Where suitable, the PROV-N grammar reuses production and terminal names of the SPARQL grammar [RDF-SPARQL-QUERY].
Two productions are entry points to the grammar.
The production expression provides the structure for the core expressions of PROV-N.
Each of the symbols included in expression above, i.e., entityExpression, activityExpression, etc., corresponds to one concept (e.g., Entity, Activity, etc.) of the PROV data model.
Alternatively, the production rule document provides the overall structure of PROV-N descriptions. It is a wrapper for a set of expressions, such that the text for an element matches the corresponding expression production, and some namespace declarations.
wasDerivedFrom(e2, e1, a, g2, u1)
wasDerivedFrom(e2, e1)
In a derivation expression, the activity, generation, and usage are optional terms. They are specified in the first derivation, but not in the second.
activity(a2, 2011-11-16T16:00:00, 2011-11-16T16:00:01)
activity(a1)
The start and end times for an activity are optional. They are specified in the first expression, but not in the second.
The general rule for optionals is that, if none of the optionals are used in the expression, then they are simply omitted, resulting in a simpler expression as in the examples above.
However, it may be the case that only some of the optional terms are omitted. Because the position of the terms in the expression matters, an additional marker must be used to indicate that a particular term is not available. The symbol '-' is used for this purpose.
In the first expression below, all optionals are specified. However, in the second and third, only one optional is specified, forcing the use of the marker for the missing terms.
wasDerivedFrom(e2, e1, a, g2, u1)
wasDerivedFrom(e2, e1, -, -, u1)
wasDerivedFrom(e2, e1, a, -, -)
activity(a1)
activity(a1, -, -)
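The rule for optional positional terms can be sketched in Python as follows. The helper names positional_terms and derivation are hypothetical, chosen for illustration only. The sketch covers just the positional terms of a derivation and leaves out the optional identifier and attributes.

```python
def positional_terms(required, optionals):
    """Apply the optional-term rule described above.

    If no optional is given, all of them are omitted; otherwise each
    missing optional is replaced by the '-' marker to keep positions.
    """
    if all(value is None for value in optionals):
        return list(required)
    return list(required) + ["-" if value is None else value for value in optionals]


def derivation(generated, used, activity=None, generation=None, usage=None):
    """Serialize a derivation without identifier or attributes."""
    terms = positional_terms([generated, used], [activity, generation, usage])
    return "wasDerivedFrom(" + ", ".join(terms) + ")"


print(derivation("e2", "e1"))              # all optionals omitted
print(derivation("e2", "e1", usage="u1"))  # markers fill the gaps
```

The first call yields the short form and the second yields wasDerivedFrom(e2, e1, -, -, u1), matching the expressions listed above.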
Almost all expressions defined in the grammar include an identifier (see Section 3.7.1 for the full syntax of identifiers). Most expressions can also include a set of attribute-value pairs, delimited by square brackets. Identifiers are optional except for Entities, Activities, and Agents. Identifiers are always the first term in any expression. Optional identifiers MUST be separated using a semi-colon ';', but where the identifiers are required, a regular comma ',' MUST be used. This makes it possible to completely omit an optional identifier with no ambiguity arising. Also, if the set of attribute-value pairs is present, it is always the last term in any expression.
Derivation has an optional identifier. In the first expression, the identifier is not available, while it is explicit in the second. The third example shows that one can optionally indicate the missing identifier using the - marker. This is equivalent to the first expression.
wasDerivedFrom(e2, e1)
wasDerivedFrom(d; e2, e1)
wasDerivedFrom(-; e2, e1)
The first and second activity expressions do not specify any attributes, and are equivalent. The third activity expression specifies two attributes.
activity(ex:a1)
activity(ex:a1, [])
activity(ex:a1, [ex:param1="a", ex:param2="b"])
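The placement rules for identifiers and attributes can be illustrated with a short Python sketch. The function render is hypothetical. It assumes that all attribute values are plain strings without embedded double quotes. A required identifier (as in entity, activity, agent) is passed as the first positional term, since it is separated by a comma.

```python
def render(name, terms, identifier=None, attributes=None):
    """Serialize an expression with optional identifier and attributes.

    Assumption: attribute values are simple strings needing no escaping.
    """
    body = ", ".join(terms)
    if identifier is not None:
        # An optional identifier comes first and is followed by ';'.
        body = identifier + "; " + body
    if attributes:
        pairs = ", ".join(f'{key}="{value}"' for key, value in attributes.items())
        # The attribute-value pairs are always the last term.
        body += ", [" + pairs + "]"
    return f"{name}({body})"


print(render("wasDerivedFrom", ["e2", "e1"], identifier="d"))
print(render("activity", ["ex:a1"], attributes={"ex:param1": "a", "ex:param2": "b"}))
```

Omitting the identifier argument produces the same text as the first derivation above, which the grammar treats as equivalent to writing the - marker before the semi-colon.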
Comments in PROV-N take two forms:
'//' outside an IRI_REF or STRING_LITERAL; such comments continue to the end of line (marked by characters U+000D or U+000A) or end of file if there is no end of line after the comment marker.
'/* ... */' outside an IRI_REF or STRING_LITERAL.
Comments are treated as white space.
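As an illustration of treating comments as white space, the following Python sketch replaces end-of-line comments introduced by '//' (the form used in this document's examples) with a single space. The function name is hypothetical. The sketch assumes that an IRI_REF is delimited by '<' and '>' and that string literals are short, double-quoted strings. Other comment forms and long string literals are not handled.

```python
def strip_line_comments(text):
    """Replace '//' comments by a space, except inside <...> or "..."."""
    out = []
    i = 0
    closer = None  # closing delimiter of the IRI or string we are inside
    while i < len(text):
        ch = text[i]
        if closer is not None:
            out.append(ch)
            if ch == "\\" and closer == '"' and i + 1 < len(text):
                # Keep an escaped character inside a string literal as-is.
                out.append(text[i + 1])
                i += 2
                continue
            if ch == closer:
                closer = None
            i += 1
        elif ch == '"' or ch == "<":
            closer = '"' if ch == '"' else ">"
            out.append(ch)
            i += 1
        elif text.startswith("//", i):
            # Skip up to (not including) U+000D or U+000A, or end of input.
            while i < len(text) and text[i] not in "\r\n":
                i += 1
            out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


cleaned = strip_line_comments("entity(ex:e1) // an entity\nentity(ex:e2)")
```

Tracking the delimiter state matters because namespace IRIs such as <http://example.org/> contain '//' themselves.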
This section introduces grammar productions for each expression, followed by small examples of expressions illustrating the grammar. Strings conforming to the grammar are valid expressions in the PROV-N language.
[3] | entityExpression | ::= | "entity" "(" identifier optionalAttributeValuePairs ")" |
[4] | optionalAttributeValuePairs | ::= | ( "," "[" attributeValuePairs "]" )? |
[5] | attributeValuePairs | ::= | ( | attributeValuePair ( "," attributeValuePair )* ) |
[6] | attributeValuePair | ::= | attribute "=" literal |
The following table summarizes how each constituent of a PROV-DM Entity maps to a PROV-N syntax element.
entity(tr:WD-prov-dm-20111215, [ prov:type="document" ])
Here tr:WD-prov-dm-20111215 is the entity identifier, and [ prov:type="document" ] groups the optional attributes, only one in this example, with their values.
entity(tr:WD-prov-dm-20111215)
Here, the optional attributes are absent.
[7] | activityExpression | ::= | "activity" "(" identifier ( "," timeOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
[8] | timeOrMarker | ::= | (time | "-" ) |
The following table summarizes how each constituent of a PROV-DM Activity maps to a PROV-N syntax element.
Activity | Non-Terminal |
id | identifier |
startTime | timeOrMarker |
endTime | timeOrMarker |
attributes | optionalAttributeValuePairs |
activity(ex:a10, 2011-11-16T16:00:00, 2011-11-16T16:00:01, [prov:type="createFile"])
Here ex:a10 is the activity identifier, 2011-11-16T16:00:00 and 2011-11-16T16:00:01 are the optional start and end times for the activity, and [prov:type="createFile"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
activity(ex:a10)
activity(ex:a10, -, -)
activity(ex:a10, -, -, [prov:type="edit"])
activity(ex:a10, -, 2011-11-16T16:00:00)
activity(ex:a10, 2011-11-16T16:00:00, -)
activity(ex:a10, 2011-11-16T16:00:00, -, [prov:type="createFile"])
activity(ex:a10, [prov:type="edit"])
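The activity forms above can be read back with a small Python sketch based on a regular expression. This is an approximation of activityExpression, not a conformant parser. The names TERM, ACTIVITY, parse_time_or_marker and parse_activity are hypothetical. Times are assumed to be in the plain form shown in the examples (no time zone), and the attribute list is returned as raw text rather than being parsed.

```python
import re
from datetime import datetime

# A term is approximated as a run of characters without separators.
TERM = r"[^,\[\]()\s]+"
ACTIVITY = re.compile(
    r"^activity\(\s*(" + TERM + r")\s*"
    r"(?:,\s*(" + TERM + r")\s*,\s*(" + TERM + r")\s*)?"  # both times or none
    r"(?:,\s*\[(.*)\]\s*)?\)$"                            # optional attributes
)


def parse_time_or_marker(term):
    """Return None for an omitted term or the '-' marker, else a datetime."""
    if term is None or term == "-":
        return None
    return datetime.fromisoformat(term)


def parse_activity(text):
    match = ACTIVITY.match(text.strip())
    if match is None:
        raise ValueError("not an activity expression: " + text)
    identifier, start, end, attributes = match.groups()
    return {
        "id": identifier,
        "startTime": parse_time_or_marker(start),
        "endTime": parse_time_or_marker(end),
        "attributes": attributes,  # raw text between '[' and ']', or None
    }


parsed = parse_activity("activity(ex:a10, -, 2011-11-16T16:00:00)")
```

Both the short form activity(ex:a10) and the marked form activity(ex:a10, -, -) produce None for the two times, reflecting that they carry the same information.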
[9] | generationExpression | ::= | "wasGeneratedBy" "(" optionalIdentifier eIdentifier ( "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
[10] | optionalIdentifier | ::= | (identifierOrMarker ";" )? |
[11] | identifierOrMarker | ::= | (identifier | "-" ) |
The following table summarizes how each constituent of a PROV-DM Generation maps to a PROV-N syntax element.
Generation | Non-Terminal |
id | optionalIdentifier |
entity | eIdentifier |
activity | aIdentifierOrMarker |
time | timeOrMarker |
attributes | optionalAttributeValuePairs |
wasGeneratedBy(ex:g1; tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00, [ex:fct="save"])
Here ex:g1 is the optional generation identifier, tr:WD-prov-dm-20111215 is the identifier of the entity being generated, ex:edit1 is the optional identifier of the generating activity, 2011-11-16T16:00:00 is the optional generation time, and [ex:fct="save"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasGeneratedBy(e2, a1, -)
wasGeneratedBy(e2, a1, 2011-11-16T16:00:00)
wasGeneratedBy(e2, a1, -, [ex:fct="save"])
wasGeneratedBy(e2, [ex:fct="save"])
wasGeneratedBy(ex:g1; e)
wasGeneratedBy(ex:g1; e, a, 2011-11-16T16:00:00)
Additional semantic rules (Section 3.7.5) apply to generationExpression.
[12] | usageExpression | ::= | "used" "(" optionalIdentifier aIdentifier ( "," eIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Usage maps to a PROV-N syntax element.
Usage | Non-Terminal |
id | optionalIdentifier |
activity | aIdentifier |
entity | eIdentifierOrMarker |
time | timeOrMarker |
attributes | optionalAttributeValuePairs |
used(ex:u1; ex:act2, ar3:0111, 2011-11-16T16:00:00, [ex:fct="load"])
Here ex:u1 is the optional usage identifier, ex:act2 is the identifier of the using activity, ar3:0111 is the identifier of the entity being used, 2011-11-16T16:00:00 is the optional usage time, and [ex:fct="load"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
used(ex:act2)
used(ex:act2, ar3:0111, 2011-11-16T16:00:00)
used(a1, e1, -, [ex:fct="load"])
used(ex:u1; ex:act2, ar3:0111, -)
Additional semantic rules (Section 3.7.5) apply to usageExpression.
[13] | communicationExpression | ::= | "wasInformedBy" "(" optionalIdentifier aIdentifier "," aIdentifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Communication maps to a PROV-N syntax element.
Communication | Non-Terminal |
id | optionalIdentifier |
informed | aIdentifier |
informant | aIdentifier |
attributes | optionalAttributeValuePairs |
wasInformedBy(ex:inf1; ex:a1, ex:a2, [ex:param1="a", ex:param2="b"])
Here ex:inf1 is the optional communication identifier, ex:a1 is the identifier of the informed activity, ex:a2 is the identifier of the informant activity, and [ex:param1="a", ex:param2="b"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasInformedBy(ex:a1, ex:a2)
wasInformedBy(ex:a1, ex:a2, [ex:param1="a", ex:param2="b"])
wasInformedBy(ex:i; ex:a1, ex:a2)
wasInformedBy(ex:i; ex:a1, ex:a2, [ex:param1="a", ex:param2="b"])
[14] | startExpression | ::= | "wasStartedBy" "(" optionalIdentifier aIdentifier ( "," eIdentifierOrMarker "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Start maps to a PROV-N syntax element.
Start | Non-Terminal |
id | optionalIdentifier |
activity | aIdentifier |
trigger | eIdentifierOrMarker |
starter | aIdentifierOrMarker |
time | timeOrMarker |
attributes | optionalAttributeValuePairs |
wasStartedBy(ex:start; ex:act2, ex:trigger, ex:act1, 2011-11-16T16:00:00, [ex:param="a"])
Here ex:start is the optional start identifier, ex:act2 is the identifier of the started activity, ex:trigger is the optional identifier for the entity that triggered the activity start, ex:act1 is the optional identifier for the activity that generated the (possibly unspecified) entity ex:trigger, 2011-11-16T16:00:00 is the optional start time, and [ex:param="a"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasStartedBy(ex:act2, -, ex:act1, -)
wasStartedBy(ex:act2, -, ex:act1, 2011-11-16T16:00:00)
wasStartedBy(ex:act2, -, -, 2011-11-16T16:00:00)
wasStartedBy(ex:act2, [ex:param="a"])
wasStartedBy(ex:start; ex:act2, e, ex:act1, 2011-11-16T16:00:00)
Additional semantic rules (Section 3.7.5) apply to startExpression.
[15] | endExpression | ::= | "wasEndedBy" "(" optionalIdentifier aIdentifier ( "," eIdentifierOrMarker "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM End maps to a PROV-N syntax element.
End | Non-Terminal |
id | optionalIdentifier |
activity | aIdentifier |
trigger | eIdentifierOrMarker |
ender | aIdentifierOrMarker |
time | timeOrMarker |
attributes | optionalAttributeValuePairs |
wasEndedBy(ex:end; ex:act2, ex:trigger, ex:act3, 2011-11-16T16:00:00, [ex:param="a"])
Here ex:end is the optional end identifier, ex:act2 is the identifier of the ended activity, ex:trigger is the optional identifier of the entity that triggered the activity end, ex:act3 is the optional identifier for the activity that generated the (possibly unspecified) entity ex:trigger, 2011-11-16T16:00:00 is the optional end time, and [ex:param="a"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasEndedBy(ex:act2, ex:trigger, -, -)
wasEndedBy(ex:act2, ex:trigger, -, 2011-11-16T16:00:00)
wasEndedBy(ex:act2, -, -, 2011-11-16T16:00:00)
wasEndedBy(ex:act2, -, -, 2011-11-16T16:00:00, [ex:param="a"])
wasEndedBy(ex:end; ex:act2)
wasEndedBy(ex:end; ex:act2, ex:trigger, -, 2011-11-16T16:00:00)
Additional semantic rules (Section 3.7.5) apply to endExpression.
[16] | invalidationExpression | ::= | "wasInvalidatedBy" "(" optionalIdentifier eIdentifier ( "," aIdentifierOrMarker "," timeOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Invalidation maps to a PROV-N syntax element.
Invalidation | Non-Terminal |
id | optionalIdentifier |
entity | eIdentifier |
activity | aIdentifierOrMarker |
time | timeOrMarker |
attributes | optionalAttributeValuePairs |
wasInvalidatedBy(ex:inv; tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00, [ex:fct="save"])
Here ex:inv is the optional invalidation identifier, tr:WD-prov-dm-20111215 is the identifier of the entity being invalidated, ex:edit1 is the optional identifier of the invalidating activity, 2011-11-16T16:00:00 is the optional invalidation time, and [ex:fct="save"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasInvalidatedBy(tr:WD-prov-dm-20111215, ex:edit1, -)
wasInvalidatedBy(tr:WD-prov-dm-20111215, ex:edit1, 2011-11-16T16:00:00)
wasInvalidatedBy(e2, a1, -, [ex:fct="save"])
wasInvalidatedBy(e2, -, -, [ex:fct="save"])
wasInvalidatedBy(ex:inv; tr:WD-prov-dm-20111215, ex:edit1, -)
wasInvalidatedBy(tr:WD-prov-dm-20111215, ex:edit1, -)
Additional semantic rules (Section 3.7.5) apply to invalidationExpression.
[17] | derivationExpression | ::= | "wasDerivedFrom" "(" optionalIdentifier eIdentifier "," eIdentifier ( "," aIdentifierOrMarker "," gIdentifierOrMarker "," uIdentifierOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Derivation maps to a PROV-N syntax element.
wasDerivedFrom(ex:d; e2, e1, a, g2, u1, [ex:comment="a righteous derivation"])
Here ex:d is the optional derivation identifier, e2 is the identifier for the entity being derived, e1 is the identifier of the entity from which e2 is derived, a is the optional identifier of the activity which used/generated the entities, g2 is the optional identifier of the generation, u1 is the optional identifier of the usage, and [ex:comment="a righteous derivation"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasDerivedFrom(e2, e1)
wasDerivedFrom(e2, e1, a, g2, u1)
wasDerivedFrom(e2, e1, -, g2, u1)
wasDerivedFrom(e2, e1, a, -, u1)
wasDerivedFrom(e2, e1, a, g2, -)
wasDerivedFrom(e2, e1, a, -, -)
wasDerivedFrom(e2, e1, -, -, u1)
wasDerivedFrom(e2, e1, -, -, -)
wasDerivedFrom(ex:d; e2, e1, a, g2, u1)
wasDerivedFrom(-; e2, e1, a, g2, u1)
PROV-N provides no dedicated syntax for Revision. Instead, a Revision MUST be expressed as a derivationExpression with attribute prov:type='prov:Revision'.
wasDerivedFrom(ex:d; e2, e1, a, g2, u1, [prov:type='prov:Revision', ex:comment="a righteous derivation"])
Here, the derivation from Example 19 is extended with a prov:type attribute and value prov:Revision. The expression 'prov:Revision' is convenienceNotation to denote a QUALIFIED_NAME literal (see Section 3.7.3. Literal).
PROV-N provides no dedicated syntax for Quotation. Instead, a Quotation MUST be expressed as a derivationExpression with attribute prov:type='prov:Quotation'.
wasDerivedFrom(ex:quoteId1; ex:blockQuote, ex:blog, ex:act1, ex:g, ex:u, [ prov:type='prov:Quotation' ])
Here, the derivation is provided with a prov:type attribute and value prov:Quotation.
PROV-N provides no dedicated syntax for PrimarySource. Instead, a PrimarySource MUST be expressed as a derivationExpression with attribute prov:type='prov:PrimarySource'.
wasDerivedFrom(ex:sourceId1; ex:e1, ex:e2, ex:act, ex:g, ex:u, [ prov:type='prov:PrimarySource' ])
Here, the derivation is provided with a prov:type attribute and value prov:PrimarySource.
[18] | agentExpression | ::= | "agent" "(" identifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Agent maps to a PROV-N syntax element.
PROV-N provides no dedicated syntax for Person, Organization, SoftwareAgent. Instead, a Person, an Organization, or a SoftwareAgent MUST be expressed as an agentExpression with attribute prov:type='prov:Person', prov:type='prov:Organization', or prov:type='prov:SoftwareAgent', respectively.
agent(ex:ag4, [ prov:type='prov:Person', ex:name="David" ])
Here ex:ag4 is the agent identifier, and [ prov:type='prov:Person', ex:name="David" ] is a list of optional attributes.
In the next example, the optional attributes are omitted.
agent(ex:ag4)
[19] | attributionExpression | ::= | "wasAttributedTo" "(" optionalIdentifier eIdentifier "," agIdentifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Attribution maps to a PROV-N syntax element.
Attribution | Non-Terminal |
id | optionalIdentifier |
entity | eIdentifier |
agent | agIdentifier |
attributes | optionalAttributeValuePairs |
wasAttributedTo(ex:attr; e, ag, [ex:license='cc:attributionURL' ])
Here ex:attr is the optional attribution identifier, e is an entity identifier, ag is the identifier of the agent to whom the entity is ascribed, and [ex:license='cc:attributionURL' ] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasAttributedTo(e, ag)
wasAttributedTo(e, ag, [ex:license='cc:attributionURL' ])
[20] | associationExpression | ::= | "wasAssociatedWith" "(" optionalIdentifier aIdentifier ( "," agIdentifierOrMarker "," eIdentifierOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Association maps to a PROV-N syntax element.
Association | Non-Terminal |
id | optionalIdentifier |
activity | aIdentifier |
agent | agIdentifierOrMarker |
plan | eIdentifierOrMarker |
attributes | optionalAttributeValuePairs |
PROV-N provides no dedicated syntax for Plan. Instead, a Plan MUST be expressed as an entityExpression with attribute prov:type='prov:Plan'.
wasAssociatedWith(ex:assoc; ex:a1, ex:ag1, ex:e1, [ex:param1="a", ex:param2="b"])
Here ex:assoc is the optional association identifier, ex:a1 is an activity identifier, ex:ag1 is the optional identifier of the agent associated to the activity, ex:e1 is the optional identifier of the plan used by the agent in the context of the activity, and [ex:param1="a", ex:param2="b"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
wasAssociatedWith(ex:a1, -, ex:e1)
wasAssociatedWith(ex:a1, ex:ag1)
wasAssociatedWith(ex:a1, ex:ag1, ex:e1)
wasAssociatedWith(ex:a1, ex:ag1, ex:e1, [ex:param1="a", ex:param2="b"])
wasAssociatedWith(ex:assoc; ex:a1, -, ex:e1)
Additional semantic rules (Section 3.7.5) apply to associationExpression.
[21] | delegationExpression | ::= | "actedOnBehalfOf" "(" optionalIdentifier agIdentifier "," agIdentifier ( "," aIdentifierOrMarker )? optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Delegation maps to a PROV-N syntax element.
Delegation | Non-Terminal |
id | optionalIdentifier |
delegate | agIdentifier |
responsible | agIdentifier |
activity | aIdentifierOrMarker |
attributes | optionalAttributeValuePairs |
actedOnBehalfOf(ex:del1; ex:ag2, ex:ag1, ex:a, [prov:type="contract"])
Here ex:del1 is the optional delegation identifier, ex:ag2 is the identifier for the delegate agent, ex:ag1 is the identifier of the responsible agent, ex:a is the optional identifier of the activity for which the delegation link holds, and [prov:type="contract"] is a list of optional attributes.
The remaining examples show cases where some of the optionals are omitted.
actedOnBehalfOf(ex:ag1, ex:ag2)
actedOnBehalfOf(ex:ag1, ex:ag2, ex:a)
actedOnBehalfOf(ex:ag1, ex:ag2, -, [prov:type="delegation"])
actedOnBehalfOf(ex:ag2, ex:ag3, ex:a, [prov:type="contract"])
actedOnBehalfOf(ex:del1; ex:ag2, ex:ag3, ex:a, [prov:type="contract"])
[22] | influenceExpression | ::= | "wasInfluencedBy" "(" optionalIdentifier eIdentifier "," eIdentifier optionalAttributeValuePairs ")" |
The following table summarizes how each constituent of a PROV-DM Influence maps to a PROV-N syntax element.
Influence | Non-Terminal |
id | optionalIdentifier |
influencee | eIdentifier |
influencer | eIdentifier |
attributes | optionalAttributeValuePairs |
wasInfluencedBy(ex:infl1; ex:e2, ex:e1, [ex:param="a"])
Here, ex:infl1 is the optional influence identifier, ex:e2 is an entity identifier, ex:e1 is the identifier for an ancestor entity that ex:e2 is influenced by, and [ex:param="a"] is the optional set of attributes.
The remaining examples show cases where some of the optionals are omitted.
wasInfluencedBy(ex:e2, ex:e1)
wasInfluencedBy(ex:e2, ex:e1, [ex:param="a"])
wasInfluencedBy(ex:infl1; ex:e2, ex:e1)
[23] | bundle | ::= | "bundle" identifier (namespaceDeclarations)? (expression)* "endBundle" |
Bundles cannot be nested. It is for this reason that a bundle is not defined as an expression, to prevent the occurrence of a bundle inside another bundle.
Each identifier occurring in a bundle, including the bundle identifier itself, MUST be interpreted with respect to the namespace declarations of that bundle, or if the identifier's prefix is not declared in the bundle, with respect to the namespace declarations in the document.
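The lookup order described above can be sketched in Python as follows. The function resolve_namespace and the dictionary representation of namespace declarations are assumptions made for illustration, as are the example IRIs. Raising an error for an undeclared prefix is one possible design choice for the sketch.

```python
def resolve_namespace(prefix, bundle_namespaces, document_namespaces):
    """Return the namespace IRI for a prefix used inside a bundle.

    Declarations of the bundle take precedence; the declarations of the
    enclosing document are consulted only if the bundle has none.
    """
    if prefix in bundle_namespaces:
        return bundle_namespaces[prefix]
    if prefix in document_namespaces:
        return document_namespaces[prefix]
    raise KeyError("undeclared prefix: " + prefix)


# Hypothetical declarations: the bundle redeclares 'ex' but not 'tr'.
document_ns = {"ex": "http://example.org/doc/", "tr": "http://example.org/tr/"}
bundle_ns = {"ex": "http://example.org/bundle/"}

inside_ex = resolve_namespace("ex", bundle_ns, document_ns)
inside_tr = resolve_namespace("tr", bundle_ns, document_ns)
```

With these declarations, a name with prefix ex inside the bundle resolves against the bundle's own declaration, while tr falls back to the document.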
The following table summarizes how each constituent of a PROV-DM bundle maps to a PROV-N syntax element.
bundle ex:author-view
  prefix ex <http://example.org/>
  agent(ex:Paolo, [ prov:type='prov:Person' ])
  agent(ex:Simon, [ prov:type='prov:Person' ])
  //...
endBundle
Here ex:author-view is the name of the bundle.
When described, a Bundle MUST be expressed as an entityExpression with attribute prov:type='prov:Bundle'.
The bundle of Example 29 can be referred to as an entity, and its provenance described.
entity(ex:author-view, [ prov:type='prov:Bundle' ])
[24] | alternateExpression | ::= | "alternateOf" "(" eIdentifier "," eIdentifier ")" |
The following table summarizes how each constituent of a PROV-DM Alternate maps to a PROV-N syntax element.
alternateOf(tr:WD-prov-dm-20111215, ex:alternate-20111215)
Here tr:WD-prov-dm-20111215 is alternate for ex:alternate-20111215.
[25] | specializationExpression | ::= | "specializationOf" "(" eIdentifier "," eIdentifier ")" |
The following table summarizes how each constituent of a PROV-DM Specialization maps to a PROV-N syntax element.
specializationOf(tr:WD-prov-dm-20111215, tr:prov-dm)
Here tr:WD-prov-dm-20111215 is a specialization of tr:prov-dm.
PROV-N provides no dedicated syntax for Collection and EmptyCollection. Instead, a Collection or an EmptyCollection MUST be expressed as an entityExpression with attribute prov:type='prov:Collection', or prov:type='prov:EmptyCollection', respectively.
The following two expressions are about a collection and an empty collection, respectively.
entity(ex:col1, [ prov:type='prov:Collection' ])
entity(ex:col2, [ prov:type='prov:EmptyCollection' ])
[26] | membershipExpression | ::= | "hadMember" "(" cIdentifier "," eIdentifier ")" |
The following table summarizes how each constituent of a PROV-DM Membership maps to a PROV-N syntax element.
Membership | Non-Terminal |
collection | cIdentifier |
entity | eIdentifier |
hadMember(ex:c, ex:e1) // ex:c contained ex:e1
hadMember(ex:c, ex:e2) // ex:c contained ex:e2
Here ex:c is the identifier for the collection whose membership is stated, and ex:e1 and ex:e2 are the entities that are members of collection ex:c.
Various kinds of identifiers are used in productions.
[27] | eIdentifier | ::= | identifier |
[28] | aIdentifier | ::= | identifier |
[29] | agIdentifier | ::= | identifier |
[30] | gIdentifier | ::= | identifier |
[31] | uIdentifier | ::= | identifier |
[32] | cIdentifier | ::= | identifier |
[33] | eIdentifierOrMarker | ::= | (eIdentifier | "-" ) |
[34] | aIdentifierOrMarker | ::= | (aIdentifier | "-" ) |
[35] | agIdentifierOrMarker | ::= | (agIdentifier | "-" ) |
[36] | gIdentifierOrMarker | ::= | (gIdentifier | "-" ) |
[37] | uIdentifierOrMarker | ::= | (uIdentifier | "-" ) |
[38] | identifier | ::= | QUALIFIED_NAME |
Aqualified name is a name subject to namespace interpretation. It consists of a namespace, denoted by an optional prefix, and a local name.The PROV data model stipulates that a qualified name can be mapped to an IRI by concatenating the IRI associated with the prefix and the local part. This section provides the exact details of this procedure for qualified names defined by PROV-N.
A qualified name's prefix isOPTIONAL. If a prefix occurs in a qualified name, the prefixMUST refer to a namespace declared in a namespace declaration. In the absence of prefix, the qualified name belongs to the default namespace.
A PROV-N qualified name (productionQUALIFIED_NAME
) has a more permissive syntax than XML'sQName
[XML-NAMES]and SPARQLPrefixedName
[RDF-SPARQL-QUERY]. AQUALIFIED_NAME
consists of a prefix and a local part. Prefixes follow the productionPN_PREFIX
defined by SPARQL [RDF-SPARQL-QUERY]. Local parts have to be conformant withPN_LOCAL
, which extends the original SPARQLPN_LOCAL
definition by allowing further characters (seePN_CHARS_OTHERS
):
- %-escaped characters (see PERCENT), to be interpreted as per Section 3.1, Mapping of IRIs to URIs, in [RFC3987];
- \-escaped characters (see PN_CHARS_ESC).
Given that '=' (equal), ''' (single quote), '(' (left bracket), ')' (right bracket), ',' (comma), ':' (colon), ';' (semi-colon), '"' (double quote), '[' (left square bracket), ']' (right square bracket) are used by the PROV notation as delimiters, they are not allowed in local parts. Instead, among those characters, those that are permitted in SPARQL IRI_REF are also allowed in PN_LOCAL if they are escaped by the '\' (backslash) character as per production PN_CHARS_ESC. Furthermore, '.' (dot), ':' (colon), '-' (hyphen) can also be \-escaped.
A PROV-N qualified name QUALIFIED_NAME can be mapped to a valid IRI [RFC3987] by concatenating the namespace denoted by its prefix PN_PREFIX to the local name PN_LOCAL, whose \-escaped characters have been unescaped by dropping the character '\' (backslash).
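The mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not a conforming parser: the function names `split_qualified_name` and `to_iri` are hypothetical, the input is assumed to be a syntactically valid qualified name already, and the prefix table is assumed to be a plain dictionary built from the namespace declarations.

```python
import re
from typing import Dict, Optional, Tuple

# Characters that production PN_CHARS_ESC permits after a backslash.
ESCAPABLE = "='(),-:;[]."
_ESCAPED = re.compile(r"\\([" + re.escape(ESCAPABLE) + r"])")


def split_qualified_name(qname: str) -> Tuple[Optional[str], str]:
    """Split at the first unescaped colon; the prefix is None when absent."""
    i = 0
    while i < len(qname):
        if qname[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
        elif qname[i] == ":":
            return qname[:i], qname[i + 1:]
        else:
            i += 1
    return None, qname


def to_iri(qname: str, prefixes: Dict[str, str],
           default_ns: Optional[str] = None) -> str:
    """Concatenate the namespace IRI with the unescaped local part."""
    prefix, local = split_qualified_name(qname)
    if prefix is None:
        if default_ns is None:
            raise ValueError("no default namespace declared")
        namespace = default_ns
    else:
        if prefix not in prefixes:
            raise ValueError("undeclared prefix: " + prefix)
        namespace = prefixes[prefix]
    # Unescape by dropping the backslash; %-escapes are left untouched.
    return namespace + _ESCAPED.sub(r"\1", local)
```

For instance, with `ex` bound to `http://example.org/`, `to_iri("ex:foo?a\\=1", {"ex": "http://example.org/"})` yields `http://example.org/foo?a=1`, matching the \-escaping example given in this section.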
[52] | <QUALIFIED_NAME> | ::= | ( PN_PREFIX ":" )? PN_LOCAL |
[53] | <PN_LOCAL> | ::= | ( PN_CHARS_U | [0-9] | PN_CHARS_OTHERS ) ( ( PN_CHARS | "." | PN_CHARS_OTHERS )* ( PN_CHARS | PN_CHARS_OTHERS ) )? |
[54] | <PN_CHARS_OTHERS > | ::= | "/" |
[55] | <PN_CHARS_ESC > | ::= | "\" ( "=" | "'" | "(" | ")" | "," | "-" | ":" | ";" | "[" | "]" | "." ) |
[56] | <PERCENT> | ::= | "%" HEX HEX |
[57] | <HEX > | ::= | [0-9] |
Examples of articles on the BBC Web site seen as entities.
document
  prefix bbc <http://www.bbc.co.uk/>
  prefix bbcNews <http://www.bbc.co.uk/news/>

  entity(bbc:)                          // bbc site itself
  entity(bbc:news/)                     // bbc news
  entity(bbc:news/world-asia-17507976)  // a given news article
  entity(bbcNews:)                      // an alternative way of referring to the bbc news site
endDocument
Examples of entities with declared and default namespace.
document
  default <http://example.org/2/>
  prefix ex <http://example.org/1/>

  entity(ex:a)     // corresponds to IRI http://example.org/1/a
  entity(ex:a/)    // corresponds to IRI http://example.org/1/a/
  entity(ex:a/b)   // corresponds to IRI http://example.org/1/a/b
  entity(b)        // corresponds to IRI http://example.org/2/b
  entity(ex:1234)  // corresponds to IRI http://example.org/1/1234
  entity(4567)     // corresponds to IRI http://example.org/2/4567
  entity(c/)       // corresponds to IRI http://example.org/2/c/
  entity(ex:/)     // corresponds to IRI http://example.org/1//
endDocument
Examples of \-escaped characters.
document
  prefix ex <http://example.org/>
  default <http://example.org/default>

  entity(ex:foo?a\=1)             // corresponds to IRI http://example.org/foo?a=1
  entity(ex:\-)                   // corresponds to IRI http://example.org/-
  entity(ex:?fred\=fish%20soup)   // corresponds to IRI http://example.org/?fred=fish%20soup

  used(-;a1,e1,-)                 // identifier not specified for usage
  used(\-;a1,e1,-)                // usage identifier corresponds to http://example.org/default-
endDocument
Note: The productions for the terminals QUALIFIED_NAME and PN_PREFIX are conflicting. Indeed, for a tokenizer operating independently of the parse tree, abc matches both QUALIFIED_NAME and PN_PREFIX. In the context of a namespaceDeclaration, a tokenizer should give preference to the production PN_PREFIX.
[39] | attribute | ::= | QUALIFIED_NAME |
The reserved attributes in the PROV namespace are the following. Their meaning is explained by [PROV-DM] (see Section 5.7.2: Attribute).
[40] | literal | ::= | typedLiteral |
[41] | typedLiteral | ::= | STRING_LITERAL "%%" datatype |
[42] | datatype | ::= | QUALIFIED_NAME |
[43] | convenienceNotation | ::= | STRING_LITERAL (LANGTAG)? |
[58] | <STRING_LITERAL > | ::= | STRING_LITERAL2 |
[60] | <INT_LITERAL > | ::= | ("-")? (DIGIT)+ |
[62] | <DIGIT > | ::= | [0-9] |
[61] | <QUALIFIED_NAME_LITERAL> | ::= | "'" QUALIFIED_NAME "'" |
In production datatype, the QUALIFIED_NAME is used to denote a PROV data type [PROV-DM].
The non-terminals STRING_LITERAL, INT_LITERAL, and QUALIFIED_NAME_LITERAL are syntactic sugar for quoted strings with datatype xsd:string, xsd:int, and prov:QUALIFIED_NAME respectively.
In particular, a Literal may be an IRI-typed string (with datatype xsd:anyURI); such an IRI has no specific interpretation in the context of PROV.
Note: The productions for terminals QUALIFIED_NAME and INT_LITERAL are conflicting. Indeed, for a tokenizer operating independently of the parse tree, 1234 matches both INT_LITERAL and QUALIFIED_NAME (local name without prefix). In the context of a convenienceNotation, a tokenizer should give preference to the production INT_LITERAL.
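The syntactic sugar and the tokenizer preference described above can be sketched as a small desugaring function. This is an illustrative sketch only: the name `desugar` is hypothetical, string literals are assumed to contain no embedded double quotes, and the language-tag pattern is a simplification. The integer datatype is kept in a constant because it is an assumption taken from the prose (xsd:int).

```python
import re
from typing import Optional, Tuple

# The prose names xsd:int for INT_LITERAL; adjust if another integer type is wanted.
INT_DATATYPE = "xsd:int"

_TYPED = re.compile(r'"([^"]*)"\s*%%\s*(\S+)')
_STRING = re.compile(r'"([^"]*)"(?:@([A-Za-z]+(?:-[A-Za-z0-9]+)*))?')
_INT = re.compile(r"-?[0-9]+")
_QNAME = re.compile(r"'([^']+)'")


def desugar(token: str) -> Tuple[str, str, Optional[str]]:
    """Return (lexical form, datatype, language tag or None) for a literal."""
    token = token.strip()
    match = _TYPED.fullmatch(token)
    if match:  # typedLiteral: STRING_LITERAL "%%" datatype
        return match.group(1), match.group(2), None
    match = _STRING.fullmatch(token)
    if match:  # plain or language-tagged string defaults to xsd:string
        return match.group(1), "xsd:string", match.group(2)
    if _INT.fullmatch(token):  # INT_LITERAL is tried before any qualified name
        return token, INT_DATATYPE, None
    match = _QNAME.fullmatch(token)
    if match:  # single-quoted qualified name
        return match.group(1), "prov:QUALIFIED_NAME", None
    raise ValueError("not a recognised literal: " + token)
```

Trying the integer pattern before anything that could match a qualified name mirrors the preference for INT_LITERAL that the note recommends.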
The following examples illustrate convenience notations.
The two following expressions are strings; if datatype is not specified, it is xsd:string.
  "abc" %% xsd:string
  "abc"
The two following expressions are integers. For convenience, numbers, expressed as digits optionally preceded by a minus sign, can occur without quotes.
  "1234" %% xsd:integer
  1234
  "-1234" %% xsd:integer
  -1234
The two following expressions are qualified names. Values of type qualified name can be conveniently expressed within single quotes.
  "ex:value" %% prov:QUALIFIED_NAME
  'ex:value'
The following examples respectively are the string "abc", the string (in French) "bonjour", the integer number 1, and the IRI "http://example.org/foo".
  "abc"
  "bonjour"@fr
  "1" %% xsd:integer
  "http://example.org/foo" %% xsd:anyURI
The following examples respectively are the floating point number 1.01 and the boolean true.
  "1.01" %% xsd:float
  "true" %% xsd:boolean
The reserved type values in the PROV namespace are the following. Their meaning is defined in [PROV-DM] (see Section 5.7.2.4: prov:type).
The agent ag is a person (type: prov:Person), whereas the entity pl is a plan (type: prov:Plan).
  agent(ag, [ prov:type='prov:Person' ])
  entity(pl, [ prov:type='prov:Plan' ])
Time instants are defined according to xsd:dateTime [XMLSCHEMA11-2].
[44] | time | ::= | DATETIME |
The third argument in the following usage expression is a time instant, namely 4pm on 2011-11-16.
used(ex:act2, ar3:0111, 2011-11-16T16:00:00)
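As a rough illustration, the time argument shown above can be read with Python's standard library. This is a sketch under assumptions: the helper name `parse_time` is hypothetical, and `datetime.fromisoformat` accepts only a subset of the xsd:dateTime lexical space (for example, a trailing `Z` is rejected before Python 3.11), so a full implementation would need a dedicated xsd:dateTime parser.

```python
from datetime import datetime


def parse_time(text: str) -> datetime:
    """Parse a PROV-N time token written in ISO 8601 form.

    Only the plain YYYY-MM-DDThh:mm:ss shape used in the example is
    guaranteed to work here; other xsd:dateTime forms may be rejected.
    """
    return datetime.fromisoformat(text)


instant = parse_time("2011-11-16T16:00:00")
```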
[45] | namespaceDeclarations | ::= | ( defaultNamespaceDeclaration | namespaceDeclaration ) ( namespaceDeclaration )* |
[46] | namespaceDeclaration | ::= | "prefix" PN_PREFIX namespace |
[47] | defaultNamespaceDeclaration | ::= | "default" IRI_REF |
[48] | namespace | ::= | IRI_REF |
A namespaceDeclaration consists of a binding between a prefix and a namespace. Every qualified name with this prefix in the scope of this declaration belongs to this namespace. A defaultNamespaceDeclaration consists of a namespace. Every qualified name without prefix in the scope of this declaration belongs to this namespace. The scope of a prefix-namespace declaration is specified as follows:
- The scope of a prefix-namespace declaration directly occurring in a bundle is the bundle itself.
- The scope of a prefix-namespace declaration directly occurring in a document is the document including the bundles it contains but excluding those bundles that re-declare this prefix.
A set of namespace declarations namespaceDeclarations MUST NOT re-declare the same prefix.
A set of namespace declarations namespaceDeclarations occurring in a bundle MAY re-declare a prefix declared in a surrounding document.
A namespace declaration namespaceDeclaration MUST NOT declare prefixes prov and xsd (see Table 1 for their IRI).
The following example declares three namespaces, one default, and two with explicit prefixes ex1 and ex2.
document
  default <http://example.org/0/>
  prefix ex1 <http://example.org/1/>
  prefix ex2 <http://example.org/2/>
  ...
endDocument
In the following example, a document declares a default namespace and the occurrence of e001 directly occurring in the document refers to that namespace. A nested bundle also declares a default namespace, but with a different IRI. In that bundle, the occurrences of e001, including the bundle name, refer to the latest default namespace.
document
  default <http://example.org/1/>
  entity(e001)                            // IRI: http://example.org/1/e001
  bundle e001                             // IRI: http://example.org/2/e001
    default <http://example.org/2/>
    entity(e001)                          // IRI: http://example.org/2/e001
  endBundle
endDocument
In the following example, a document declares a namespace with prefix ex and the occurrence of ex:e001 directly occurring in the document refers to that namespace. In a nested bundle, the occurrence of ex:e001 also refers to the same namespace.
document
  prefix ex <http://example.org/1/>
  entity(ex:e001)                         // IRI: http://example.org/1/e001
  bundle b
    entity(ex:e001)                       // IRI: http://example.org/1/e001
  endBundle
endDocument
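The scoping rules and the two examples above can be modelled with a small chain of scopes. This is a sketch under assumptions, not part of PROV-N: the class name `Scope`, the use of the empty string as the key for the default namespace, and the list-of-pairs input are all hypothetical choices made for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

DEFAULT = ""                # hypothetical key standing for the default namespace
RESERVED = ("prov", "xsd")  # prefixes a declaration must not bind


class Scope:
    """Namespace declarations of one document or bundle."""

    def __init__(self, declarations: Iterable[Tuple[str, str]],
                 parent: Optional["Scope"] = None) -> None:
        self.bindings: Dict[str, str] = {}
        self.parent = parent  # the surrounding document, for a bundle
        for prefix, iri in declarations:
            if prefix in RESERVED:
                raise ValueError("prefix %r is reserved" % prefix)
            if prefix in self.bindings:
                raise ValueError("prefix %r declared twice" % prefix)
            self.bindings[prefix] = iri

    def namespace(self, prefix: str = DEFAULT) -> str:
        """Innermost declaration wins; otherwise fall back to the document."""
        scope = self
        while scope is not None:
            if prefix in scope.bindings:
                return scope.bindings[prefix]
            scope = scope.parent
        raise KeyError(prefix)
```

A bundle that re-declares a prefix shadows the document's binding, while a bundle that declares nothing inherits it, which is the behaviour shown in the two examples.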
Some of the grammar productions allow for expressions that are syntactically correct, and yet according to [PROV-DM] they are not acceptable, because additional semantic rules are defined for those expressions. The following table provides a summary of such expressions along with examples of syntactically correct but unacceptable forms, and the additional semantic rules.
Production | Examples of syntactically correct expressions | Additional semantic rule |
Generation expression | wasGeneratedBy(e2, -, -) wasGeneratedBy(-; e2, -, -) | At least one of id, activity, time, and attributes MUST be present. |
Usage expression | used(a2, -, -) used(-; a2, -, -) | At least one of id, entity, time, and attributes MUST be present. |
Start expression | wasStartedBy(e2, -, -, -) wasStartedBy(-; e2, -, -, -) | At least one of id, trigger, starter, time, and attributes MUST be present. |
End expression | wasEndedBy(e2, -, -, -) wasEndedBy(-; e2, -, -, -) | At least one of id, trigger, ender, time, and attributes MUST be present. |
Invalidation expression | wasInvalidatedBy(e2, -, -) wasInvalidatedBy(-; e2, -, -) | At least one of id, activity, time, and attributes MUST be present. |
Association expression | wasAssociatedWith(a, -, -) wasAssociatedWith(-; a, -, -) | At least one of id, agent, plan, and attributes MUST be present. |
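The additional semantic rules in the table lend themselves to a check applied after parsing. The sketch below assumes a hypothetical parsed representation in which each expression is a mapping from field name to value, with `None` (or an empty attribute list) standing for the `-` marker or an omitted field; the names `OPTIONAL_FIELDS` and `satisfies_rule` are illustrative and not defined by PROV-N.

```python
from typing import Any, Dict, Mapping, Tuple

# Fields of which at least one must be present, per the table above.
OPTIONAL_FIELDS: Dict[str, Tuple[str, ...]] = {
    "wasGeneratedBy": ("id", "activity", "time", "attributes"),
    "used": ("id", "entity", "time", "attributes"),
    "wasStartedBy": ("id", "trigger", "starter", "time", "attributes"),
    "wasEndedBy": ("id", "trigger", "ender", "time", "attributes"),
    "wasInvalidatedBy": ("id", "activity", "time", "attributes"),
    "wasAssociatedWith": ("id", "agent", "plan", "attributes"),
}


def satisfies_rule(relation: str, fields: Mapping[str, Any]) -> bool:
    """True when at least one of the listed fields carries a value."""
    return any(fields.get(name) for name in OPTIONAL_FIELDS[relation])
```

Under this representation, `wasGeneratedBy(e2, -, -)` parses to a mapping holding only the entity, so the check returns `False` and the expression is rejected even though it is syntactically correct.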
A document is a house-keeping construct of PROV-N capable of packaging up PROV-N expressions and namespace declarations. A document forms a self-contained package of provenance descriptions for the purpose of exchanging them. A document may be used to package up PROV-N expressions in response to a request for the provenance of something ([PROV-AQ]).
Given its status of house-keeping construct for the purpose of exchanging provenance expressions, a document is not defined as a PROV-N expression (production expression).
A document's text matches the document production.
[1] | document | ::= | "document" (namespaceDeclarations)? (expression)* (bundle)* "endDocument" |
A document contains:
- an optional set of namespace declarations namespaceDeclarations, declaring namespaces and associated prefixes, which can be used in attributes and identifiers occurring inside expressions or bundles;
- zero or more expressions, each matching expression;
- zero or more bundles, each matching bundle.
Thus, bundles MAY occur inside a document, but do not appear inside other bundles.
The following document contains expressions related to the provenance of entity e2.
document
  default <http://anotherexample.org/>
  prefix ex <http://example.org/>

  entity(e2, [ prov:type="File", ex:path="/shared/crime.txt", ex:creator="Alice",
               ex:content="There was a lot of crime in London last month."])
  activity(a1, 2011-11-16T16:05:00, -, [prov:type="edit"])
  wasGeneratedBy(e2, a1, -, [ex:fct="save"])
  wasAssociatedWith(a1, ag2, -, [prov:role="author"])
  agent(ag2, [ prov:type='prov:Person', ex:name="Bob" ])
endDocument
This container could, for instance, be returned as the result of a query to a provenance store for the provenance of entity e2 [PROV-AQ].
The PROV data model is extensible by means of attributes prov:type and prov:role allowing subtyping of expressions. For some applications, novel syntax may also be convenient. Hence, the normative requirements are as follows.
extensibilityExpression production defined below.
[49] | extensibilityExpression | ::= | QUALIFIED_NAME "(" optionalIdentifier extensibilityArgument ( "," extensibilityArgument )* optionalAttributeValuePairs ")" |
[50] | extensibilityArgument | ::= | ( identifierOrMarker | literal | time | extensibilityExpression | extensibilityTuple ) |
[51] | extensibilityTuple | ::= | "{" extensibilityArgument ( "," extensibilityArgument )* "}" |
Expressions compatible with the extensibilityExpression production follow a general form of functional syntax, in which the predicate MUST be a qualifiedName with a non-empty prefix.
Collections are sets of entities, whose membership can be expressed using the hadMember relation. The following example shows how one can express membership for dictionaries, an illustrative extension of Collections consisting of sets of key-entity pairs, where a key is a literal. The notation is a variation of that used for Collections membership, allowing multiple member elements to be declared, and in which the elements are pairs. The name of the relation is qualified with the extension-specific namespace http://example.org/dictionaries.
  prefix dictExt <http://example.org/dictionaries#>
  dictExt:hadMembers(mId; d, {("k1",e1), ("k2",e2), ("k3",e3)}, [])
Note that the generic extensibilityExpression production above allows for alternative notations to be used for expressing membership, if the designers of the extensions so desire. Here is an alternate syntax that is consistent with the productions:
  prefix dictExt <http://example.org/dictionaries#>
  dictExt:hadMembers(mid; d, dictExt:set(dictExt:pair("k1",e1), dictExt:pair("k2",e2), dictExt:pair("k3",e3)), [dictExt:uniqueKeys="true"])
The media type of PROV-N is text/provenance-notation. The content encoding of PROV-N content is UTF-8.
The Internet Media Type / MIME Type for PROV-N is "text/provenance-notation".
It is recommended that PROV-N files have the extension ".provn" (all lowercase) on all platforms.
It is recommended that PROV-N files stored on Macintosh HFS file systems be given a file type of "TEXT".
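As an illustration of the media type and file extension recommendations above, an application written in Python could register the association with the standard `mimetypes` module. This is only a sketch: the helper name `provn_media_type` is hypothetical, and the registration affects the current process only.

```python
import mimetypes

# Associate the recommended ".provn" extension with the PROV-N media type.
mimetypes.add_type("text/provenance-notation", ".provn")


def provn_media_type(filename: str):
    """Return the media type guessed from the file name, or None."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type
```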
The information that follows has been registered with IANA.
This section is non-normative.
Please see the Responses to Public Comments on the Last Call Working Draft for more details about the justification of these changes.
This document has been produced by the Provenance Working Group, and its contents reflect extensive discussion within the Working Group as a whole. The editors extend special thanks to Sandro Hawke (W3C/MIT) and Ivan Herman (W3C/ERCIM), W3C contacts for the Provenance Working Group.
The editors acknowledge valuable contributions from the following: Tom Baker, David Booth, Robert Freimuth, Satrajit Ghosh, Ralph Hodgson, Renato Iannella, Jacek Kopecky, James Leigh, Jacco van Ossenbruggen, Alan Ruttenberg, Reza Samavi, and Antoine Zimmermann.
Members of the Provenance Working Group at the time of publication of this document were: Ilkay Altintas (Invited expert), Reza B'Far (Oracle Corporation), Khalid Belhajjame (University of Manchester), James Cheney (University of Edinburgh, School of Informatics), Sam Coppens (iMinds - Ghent University), David Corsar (University of Aberdeen, Computing Science), Stephen Cresswell (The National Archives), Tom De Nies (iMinds - Ghent University), Helena Deus (DERI Galway at the National University of Ireland, Galway, Ireland), Simon Dobson (Invited expert), Martin Doerr (Foundation for Research and Technology - Hellas (FORTH)), Kai Eckert (Invited expert), Jean-Pierre EVAIN (European Broadcasting Union, EBU-UER), James Frew (Invited expert), Irini Fundulaki (Foundation for Research and Technology - Hellas (FORTH)), Daniel Garijo (Universidad Politécnica de Madrid), Yolanda Gil (Invited expert), Ryan Golden (Oracle Corporation), Paul Groth (Vrije Universiteit), Olaf Hartig (Invited expert), David Hau (National Cancer Institute, NCI), Sandro Hawke (W3C/MIT), Jörn Hees (German Research Center for Artificial Intelligence (DFKI) Gmbh), Ivan Herman (W3C/ERCIM), Ralph Hodgson (TopQuadrant), Hook Hua (Invited expert), Trung Dong Huynh (University of Southampton), Graham Klyne (University of Oxford), Michael Lang (Revelytix, Inc.), Timothy Lebo (Rensselaer Polytechnic Institute), James McCusker (Rensselaer Polytechnic Institute), Deborah McGuinness (Rensselaer Polytechnic Institute), Simon Miles (Invited expert), Paolo Missier (School of Computing Science, Newcastle university), Luc Moreau (University of Southampton), James Myers (Rensselaer Polytechnic Institute), Vinh Nguyen (Wright State University), Edoardo Pignotti (University of Aberdeen, Computing Science), Paulo da Silva Pinheiro (Rensselaer Polytechnic Institute), Carl Reed (Open Geospatial Consortium), Adam Retter (Invited Expert), Christine Runnegar (Invited expert), Satya Sahoo (Invited expert), David Schaengold (Revelytix, Inc.), Daniel Schutzer (FSTC, Financial Services Technology Consortium), Yogesh Simmhan (Invited expert), Stian Soiland-Reyes (University of Manchester), Eric Stephan (Pacific Northwest National Laboratory), Linda Stewart (The National Archives), Ed Summers (Library of Congress), Maria Theodoridou (Foundation for Research and Technology - Hellas (FORTH)), Ted Thibodeau (OpenLink Software Inc.), Curt Tilmes (National Aeronautics and Space Administration), Craig Trim (IBM Corporation), Stephan Zednik (Rensselaer Polytechnic Institute), Jun Zhao (University of Oxford), Yuting Zhao (University of Aberdeen, Computing Science).