Copyright © 2023 OGC & World Wide Web Consortium. W3C® liability, trademark, W3C and OGC document use rules apply.
This document advises on best practices related to the publication of spatial data on the Web; the use of Web technologies as they may be applied to location. The best practices presented here are intended for practitioners, including Web developers and geospatial experts, and are compiled based on evidence of real-world application. These best practices suggest a significant change of emphasis from traditional Spatial Data Infrastructures by adopting an approach based on general Web standards. As location is often the common factor across multiple datasets, spatial data is an especially useful addition to the Web of data.
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index.
This document is considered to be complete and is expected to be the final release by the Spatial Data on the Web Working Group. The editors would like to thank everyone for their feedback. Comments received during final review triggered a couple of updates since the previous release on 11 May 2017 (see G. Changes since previous versions for details). This document is published as a W3C Working Group Note and as an OGC Best Practice in accordance with W3C Policy section 6.8 Publishing a Working Group or Interest Group Note and OGC Policies and Procedures section 8.6 Best Practices Documents.
For OGC: This document defines an OGC Best Practice on a particular technology or approach related to an OGC standard. This document is not an OGC Standard and may not be referred to as an OGC Standard. However, this document is an official position of the OGC membership on this particular technology topic. This document was prepared by the Spatial Data on the Web Working Group (SDWWG), a joint W3C-OGC project (see charter), following W3C conventions.
This document was published by the Spatial Data on the Web Working Group as a Group Draft Note using the Note track.
Group Draft Notes are not endorsed by W3C nor its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
The W3C Patent Policy does not carry any licensing requirements or commitments on this document.
This document is governed by the 12 June 2023 W3C Process Document.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The key words MAY, MUST, MUST NOT, OPTIONAL, RECOMMENDED, REQUIRED, SHALL, SHALL NOT, SHOULD, and SHOULD NOT in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.
This section is non-normative.
Increasing numbers of Web applications provide a means of accessing data. From simple visualizations to sophisticated interactive tools, there is a growing reliance on data. The open data movement has led to many national, regional and local governments publishing their data through portals. Scientific and cultural heritage data is increasingly published on the Web for reuse by others. Crowd-sourced and social media data are abundant on the Web. Sensors, connected devices and services from domains such as energy, transport, manufacturing and healthcare are becoming commonly integrated using the Web as a common data sharing platform.
The Data on the Web Best Practices [DWBP] provide a set of recommendations that are applicable to the publication of all types of data on the Web. Those best practices cover aspects including data formats, data access, data identifiers, metadata, licensing and provenance.
Within this document, we are concerned with spatial data: data that describes anything with spatial extent (i.e. size, shape or position). Spatial data is also known as location information.
As with the challenges identified in [DWBP] for publishing data on the Web in general, there is a lack of consistency in how people publish spatial data, which means that the full potential of the Web as a data sharing platform is not being used.
It is not that there is a lack of spatial data on the Web; the maps, satellite and street level images offered by search engines are familiar and there are many more examples of spatial data being used in Web applications.
SFpark is a Web site where users can look at a map of San Francisco to see where parking is available and at what price. Parking prices are incrementally raised or lowered in SFpark pilot areas based on demand. In this application, static data on existing parking spaces is combined with changing data about where people park the most, which is measured by parking sensors.
However, the data that has been published is difficult to find and often problematic to access for non-specialist users. The key problems we are trying to solve in this document are discoverability, accessibility, and interoperability. Our overarching goal is to enable spatial data to be integrated within the wider Web of data by providing standard patterns and solutions that help solve these problems.
Following these guidelines should bring your data more closely into line with the FAIR Principles.
This section is non-normative.
Our goal in writing this best practice document is to support the practitioners who are responsible for publishing their spatial data on the Web or developing tools to make it easy for others to work with spatial data.
We expect readers to be familiar both with the fundamental concepts of the architecture of the Web [WEBARCH] and the generalized best practices related to the publication and usage of data on the Web [DWBP].
We aim to provide two primary pathways into these best practices:
In each case, we aim to help them provide incremental value to their data through application of these best practices.
This document provides a wide range of examples that illustrate how these best practices may be applied using specific technologies. We do not expect readers to be familiar with all the technologies used herein; rather that readers can identify with the activities being undertaken in the various examples and, in doing so, find relevant technologies that they are already aware of or discover technologies that are new to them.
This section is non-normative.
All the best practices described in [DWBP] are relevant to the publication of spatial data on the Web. Some, such as [DWBP] Best Practice 4: Provide data license information, need no further elaboration in the context of spatial data. However, other best practices from [DWBP] are further refined in this document to provide more specific guidance for spatial data.
The best practices described below are intended to meet requirements derived from the scenarios in [SDW-UCR] that describe how spatial data is commonly published and used on the Web. However, working with spatial data can rapidly become complex, especially for critical decision-making where misuse of data can present risks. These best practices are intended to make it easier to work with spatial data on the Web but do not attempt to cover all aspects of spatial data usage.
Should a reference to [RESPONSIBLE-USE-SPATIAL] be included here, or elsewhere in the document?
In line with the charter, this document provides advice on:
As stated in the charter, discussion of activities relating to rendering spatial data as maps is explicitly out of scope.
The original intent of these best practices was to cover aspects relating to all types of spatial data, for example: the arrangement of cells on a microscope slide; the position of things on the surface of the Earth, the Moon, Mars or other celestial bodies; the position of planets in the solar system etc. However, due to resource limitations these best practices deal almost exclusively with geospatial data; data about things that are implicitly or explicitly located relative to the Earth. That said, many of the best practices are applicable to wider spatial data concerns. In the remainder of the document, we simply refer to spatial data for brevity.
We extend [DWBP] to cover aspects specifically relating to spatial data, introducing new best practices only where necessary. In particular, we consider the individual resources, or Spatial Things, that are described within a dataset.
In this document, we focus on the needs of data publishers and the developers that provide tools for them. That said, we recognize that value can only be gained from publishing the spatial data when people use it! Although we do not directly address the needs of those users, we ask that data publishers and developers reading this document do not forget about them; moreover, that they always consider the needs of users when publishing spatial data or developing the supporting tools. All our best practices are intended to provide guidance about publishing spatial data to improve ease of use.
Neither the wider topic of spatial data management nor Spatial Data Infrastructures are covered. We assume that your spatial data already exists and will be available from one of the following places:
If your spatial data is managed within a software system it is likely that you will be able to access that data through one or more of the methods identified above; as structured data from a bulk extract (e.g. a “data dump”), via direct access to the underpinning data repository or through a bespoke or standards-compliant API provided by the system.
Each of the four starting points outlined above has its own challenges, but working with plain text documents can be particularly tricky as you will need to parse the natural language to identify the Spatial Things and their properties before you can proceed any further. Natural Language Processing (NLP) is a complex topic in its own right and is beyond the scope of this best practice document. We will assume that you’ve already completed this step and have parsed any plain documents into structured data records of some kind.
The FAIR Principles are described at FAIR Principles - GO FAIR; they are widely adopted (or at least aimed for) when publishing scientific data including environmental and earth observation data. Although the FAIR principles concentrate on machine-readable data, whilst these best practices also cover “data for humans”, there is a lot of overlap between the FAIR Principles and the best practices described in this paper.
Similarly, although not currently expressed in terms of the FAIR Principles, the Data on the Web Best Practices are also designed to make it easier for "data consumers to find, use and link to the data".
The OGC is developing a new generation of resource-oriented HTTP APIs (e.g. OGC API - Features [OAF1]) for spatial data that align closely with FAIR principles.
There have also been some suggestions for improvement on the FAIR principles, and these are also discussed in D. FAIR Principles.
The best practices described in this document are compiled based on evidence from real-world application in production environments. By ‘production environment’ we mean a case where spatial data has been delivered on the Web with the intention of being used by end users and with a quality level expected from such data. Where the Working Group has identified issues that inhibit the use or interoperability of spatial data on the Web, yet no evidence of real-world application is available, the editors present these issues to the reader for consideration, along with any approaches recommended by the Working Group. Please see 15. Gaps in current practice for further details. Such recommendations are clearly distinguished as such to ensure that they are not confused with evidence-based best practices.
The normative element of each best practice is the intended outcome. Possible implementations are suggested and, where appropriate, these recommend the use of a particular technology.
We intend this best practice to be durable; that is, the best practices should remain relevant for many years to come as specific technologies change. However, to provide actionable guidance, i.e. to provide readers with the technical information they need to get their spatial data on the Web, we try to balance between durable advice (that is necessarily general) and examples using currently available technologies that illustrate how these best practices can be implemented. We expect that readers will continue to be able to derive insight from the examples even when those specifically mentioned technologies are no longer in common usage, understanding that technology ‘y’ has replaced technology ‘x’.
There are many situations where the location of a person is very useful; from using a taxi-hailing service to geocoding a selfie. Technology makes this location information easy to collect and share. However, spatial data has particular characteristics which make its use potentially more complex. For example, a single location of an anonymous tracked mobile phone may cause few privacy concerns; however, the same phone tracked over a few days could provide enough information to make the identification of its user possible. Like all personally identifiable information, great care must be taken as the collection, management and security of such information is the subject of legal frameworks. We do not attempt to provide guidance as to legal aspects of storing potentially personally identifiable spatial information; expert legal advice should be obtained. In summary: legal and privacy considerations relating to spatial data are out of scope.
Data ethics and the responsible use of geospatial data is a topic which has become ever more relevant with growing amounts of spatial data readily available over the internet. To ensure that data is shared in a responsible way, the data practitioner must provide spatial data and tools that access spatial data in ways that anticipate the impact of their publication on potential stakeholders and which have been created using ethical principles.
This document contains a variety of best practices related to the publication and usage of spatial data on the Web. First, it continues with several more in-depth introductions on Spatial Things and geometry, coverages, spatial relations, coordinate reference systems, linked data, and Spatial Data Infrastructures. After that, the best practices themselves are described.
The following best practices can be found in this document:
This section is non-normative.
This document uses a unique abbreviation ("prefix") for each RDF namespace and XML namespace listed in this section. The namespace IRI can always be determined from the declaration of the namespace abbreviation.
The following RDF namespace prefixes are used within this document. Use of a namespace does not imply endorsement of the associated data platform or vocabulary.
Prefix | Namespace IRI | Source |
admingeo | | Ordnance Survey's Administrative geography and civil voting area ontology |
adms | | Asset Description Metadata Schema (ADMS) [VOCAB-ADMS] |
bag | | Dutch Government Base Registry Adressen en Gebouwen (BAG) |
dcat | | Data Catalog Vocabulary (DCAT) [VOCAB-DCAT-2] |
dcterms | | Dublin Core Metadata Initiative (DCMI) Metadata Terms [DCTERMS] |
dqv | | DWBP Data Quality Vocabulary (DQV) [VOCAB-DQV] |
foaf | | FOAF Vocabulary Specification |
geodcatap | | GeoDCAT-AP: A geospatial extension for the DCAT application profile for data portals in Europe [GeoDCAT-AP] |
geom | | Ordnance Survey's Geometry Ontology |
geonames | | GeoNames Ontology |
georss | | GeoRSS :: Geographically Encoded Objects for RSS feeds [GeoRSS], Geo OWL encoding |
geosparql | | GeoSPARQL — A Geographic Query Language for RDF Data [GeoSPARQL] |
gml-ont | | GeoSPARQL — A Geographic Query Language for RDF Data: GML Concept Hierarchy |
ldqd | | DWBP Data Quality Vocabulary (DQV) [VOCAB-DQV]: Data quality categories and dimensions |
locn | | ISA Location Core Vocabulary [LOCN] |
osuk | | Ordnance Survey Linked Data Platform |
ov | | |
owl | | Web Ontology Language (OWL) [OWL2-OVERVIEW] |
qudt | | Quantities, Units, Dimensions and Data Types Ontologies (QUDT) |
rdf | | Resource Description Framework (RDF) [RDF11-PRIMER] |
rdfs | | RDF Schema vocabulary (RDFS) [RDF-SCHEMA] |
schema | | [SCHEMA-ORG] |
scotgov-stat | | STATISTICS.GOV.SCOT Geography Linked Data |
sdmx-attribute | | The RDF Data Cube Vocabulary [VOCAB-DATA-CUBE]: Attribute properties |
sf | | GeoSPARQL — A Geographic Query Language for RDF Data [GeoSPARQL] |
skos | | Simple Knowledge Organization System (SKOS) [SKOS-PRIMER] |
ukgov-stat | | Office for National Statistics Geography Linked Data |
vcard | | vCard Ontology — for describing People and Organizations [VCARD-RDF] |
void | | Describing Linked Datasets with the VoID Vocabulary [VoID] |
w3cgeo | | Basic Geo (WGS 84 lat/long) Vocabulary [W3C-BASIC-GEO] |
The following XML namespace prefixes are used within this document. Use of a namespace does not imply endorsement of the associated XML schema.
Prefix | Namespace IRI | Source |
gml | | Geography Markup Language (GML) Encoding Standard [GML] |
sam | | Observations and Measurements — XML Implementation [OM-XML] |
sams | | Observations and Measurements — XML Implementation [OM-XML] |
wml2 | | WaterML 2.0 Encoding Standard [WaterML] |
xlink | | XML Linking Language (XLink) Version 1.1 [XLINK11] |
The feature is the primary entity as described in the spatial data standards from the Open Geospatial Consortium (OGC) and the 19100 series of ISO/TC 211 Geographic information/Geomatics. [ISO-19101-1-2014] defines a feature as an “abstraction of real-world phenomena”.
This terse definition is a little confusing, so let’s unpack it.
Firstly, it talks about “real world phenomena”; that’s everything from highways to helicopters, parking meters to postcode areas, water bodies to weather fronts, and more. These can be physical things that you can touch (e.g. a phone box) or an abstract concept that has spatial extent (e.g. a postcode area). Features can even be fictional (e.g. “Dickensian London”) and may even lack any concrete location information such as the mythical Atlantis.
The key point is that these “features” are things that one talks about in the universe of discourse, which is defined in [ISO-19101-1-2014] as the “view of the real or hypothetical world that includes everything of interest”.
Secondly, the definition of feature talks about “abstraction”. Take the example of Eddystone Lighthouse. A helicopter pilot might see it as a “vertical obstruction” and be interested in attributes such as its height and precise location, whereas a sailor may see it as a “maritime navigation aid” and need information about its light characteristics and general location. Depending on one’s set of concerns, only a subset of the attributes of a given “real-world phenomenon” are relevant. In the case of Eddystone Lighthouse, we defined two separate “abstractions”. As is common practice in many information modelling activities, the common sets of attributes for a given “abstraction” are used to define classes. In the parlance of [ISO-19101-1-2014], such a class is known as “feature type”.
Although the exact semantics differ a little, there is a good correlation between the concept of “feature type” as defined in spatial data standards and the concept of “class” defined in [RDF-SCHEMA]. The former is an information modelling construct that binds a fixed set of attributes to an identified resource, whereas the latter defines the set of all resources that share the same group of attributes.
When combined with the open-world assumption embraced by [RDF-SCHEMA] and the Web Ontology Language (OWL) [OWL2-OVERVIEW], the set-based approach to classes provides more flexibility when combining information from multiple sources. For example, the “Eddystone Lighthouse” resource can be seen as both a “vertical obstruction” and a “maritime navigation aid” as it meets the criteria for membership of both sets. Conversely, this flexibility makes it much more difficult to build software applications as there is no guarantee that an information resource will specify a given attribute. Web standards such as the Shapes Constraint Language [SHACL] are being defined to remedy this issue.
However, the term “feature” is also commonly used to mean a capability of a system, application or component. Also, in some domains and/or applications no distinction is made between "feature" and the corresponding real-world phenomena.
To avoid confusion, we adopt the term “Spatial Thing” throughout the remainder of this best practice document. “Spatial thing” is defined in [W3C-BASIC-GEO] as “Anything with spatial extent, i.e. size, shape, or position. e.g. people, places, bowling balls, as well as abstract areas like cubes”.
The concept of “Spatial Thing” is considered to include both "real-world phenomena" and their abstractions (e.g. “feature” as defined in [ISO-19101-1-2014]). Furthermore, we treat it as inclusive of other commonly used definitions; e.g. Feature from [NeoGeo], described as “A geographical feature, capable of holding spatial relations”.
A Spatial Thing may move. We must take care not to oversimplify our concept of Spatial Thing by assuming that it is equivalent to definitions such as Location (from [DCTERMS]) or Place (from [SCHEMA-ORG]), which are respectively described as “A spatial region or named place” and "Entities that have a somewhat fixed, physical extension".
Looking more closely, it is important to note that geometry is typically a property of a Spatial Thing.
{ "geometry": {"type":"Point","coordinates": [-4.268,50.184] }}
In fact, this is only one geometry that may be used to describe Eddystone Lighthouse. Other geometries might include a 2D polygon that defines the footprint of the lighthouse in a horizontal plane and a 3D solid describing the volumetric shape of the lighthouse.
Furthermore, these geometries may be subject to change due to, say, a resurvey of the lighthouse. In such a situation, the geometry object would be updated, but the Spatial Thing that we are talking about is still Eddystone Lighthouse. Following the best practices presented below, we use an HTTP URI to unambiguously identify Eddystone Lighthouse:
In fact, there are many URIs in use for Eddystone Lighthouse. The one above is provided by the owners/operators of the lighthouse; others include
from DBPedia and
from Deutsche Nationalbibliothek.
We say that the Spatial Thing is disjoint from the geometry object. The Spatial Thing, Eddystone Lighthouse (
), is the “real world phenomenon” about which we want to state facts (such as the height of its light is 41 meters above sea level) and link to other real world phenomena (for example, that it is located at Eddystone Rocks, Cornwall; another Spatial Thing identified as
Sometimes Spatial Things, such as The Sahara, have imprecisely defined locations. These are still considered to be Spatial Things as they have spatial extent; it's just that we can't define a crisp vector boundary for them because there's no consensus about where the edges are. In such cases, often a single point is given that provides the notional center-point of the Spatial Thing.
Although we have borrowed the description of Spatial Thing from [W3C-BASIC-GEO], the formal [RDF-SCHEMA] definition of w3cgeo:SpatialThing doesn't quite suit our purpose as there is the potential for confusion about whether it is disjoint from geometry. The definition of geosparql:Feature, which is derived from the [ISO-19109] definition of Feature, is a better semantic fit for Spatial Thing as it is explicitly specified as being disjoint from geosparql:Geometry.
First, the domain of the w3cgeo:lat properties is w3cgeo:SpatialThing. While one could interpret these properties as mapping to a geometry, as GeoRSS Simple does, there isn't conclusive evidence that this is what was intended. Second, w3cgeo:Point is defined as a sub-class of w3cgeo:SpatialThing. As a result, we have inconsistency in how w3cgeo:SpatialThing may be interpreted. For example:
has lat/lon, some people might equate w3cgeo:SpatialThing
is defined as a sub-class of w3cgeo:SpatialThing, some other people find it natural to equate w3cgeo:SpatialThing
So in summary, it's safer to say that our Spatial Thing equates to geosparql:Feature, and that it is not the same as w3cgeo:SpatialThing.
Many aspects of Spatial Things can be described with single-valued, static properties, e.g. the location of buildings, roads, administrative districts, and so on. However, in some applications it is more useful to describe how values of a property vary over space and time, mathematically described by “fields”. Such descriptions are formalized as "coverages".
So what is a coverage? As defined by [ISO-19123-1:2022], a coverage contains a set of values, each associated with one of the elements in an array of points or cells. It acts as a function to return values from its range for any direct position within its spatial, temporal or spatiotemporal domain. That is, the domain is the space and/or time over which the values of a property vary, with a range of those values. Examples include raster data, point clouds, meshes such as triangulated irregular networks, and polygon sets. Coverages are multi-dimensional, including examples like 1D sensor timeseries, 2D satellite images, 3D x/y/t image timeseries and x/y/z geophysical voxel data, and 4D x/y/z/t climate and ocean data. Coordinate axes of such coverages can have spatial, temporal, or any other meaning, and they can be combined freely for n-dimensional coverages.
Some examples of coverages include:
An example is a postal region divided into many postal code zones. Any position in the region will have a postcode, but interpolating between zones is nonsensical. The range of values of the coverage is discrete, rather than continuous. Another coverage of the same postal region could be, for each zone, the number of houses, or all their addresses, or even all their geometrical footprints as polygons.
Another example is a satellite image or aerial photograph, using visible light, of an area of the earth. The boundary of the image encompasses the domain of the coverage, and the colour values in the image form the range of that coverage. Then for any position within the image or photograph, a colour can be assigned, possibly by interpolation between the adjacent pixels of the image.
A satellite image of Amsterdam Central railway station
Another example is a point cloud: a set of points in three dimensional space with an optional set of supporting attributes such as intensity, color information, or time.
The digital elevation map for the Netherlands contains detailed and precise height data for the entire country, with on average eight elevation measurements per square meter.
A 3-dimensional grid of "voxels" is another example. These are sometimes used, for example, for geophysical data.
A voxel model using colour to show classes of stone in the near-surface underground.
An example that is not just spatial is a weather forecast that maps points in space and time to values of temperature, wind speed, humidity and so forth. An even less spatial and more purely temporal example is a time series of observations from a river gauge which maps points/instants in time to water-flow values.
So a coverage is a data structure with a method to map points in space and time to values of a property. One way to think of a coverage is as a mathematical function, where data values are a function of coordinates in space and time, but the values may not be just numbers but could be categories or even complex structures such as geometry objects.
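The idea of a coverage as a function from domain to range can be sketched in Python. This is only an illustrative sketch: the class names, the grid layout and the choice of linear interpolation are assumptions made for this example and are not taken from [ISO-19123-1:2022]. The first class models a river-gauge time series (continuous range, interpolation is meaningful); the second models postcode-like zones on a grid (discrete range, no interpolation).

```python
from bisect import bisect_right


class TimeSeriesCoverage:
    """1D coverage: domain = instants (seconds), range = water-flow values."""

    def __init__(self, times, values):
        if len(times) != len(values) or len(times) < 2:
            raise ValueError("need at least two matching domain/range entries")
        if any(b <= a for a, b in zip(times, times[1:])):
            raise ValueError("times must be strictly increasing")
        self.times = list(times)
        self.values = list(values)

    def evaluate(self, t):
        """Return the value at instant t, interpolating linearly between samples."""
        if not self.times[0] <= t <= self.times[-1]:
            raise ValueError("t lies outside the coverage domain")
        # Index of the sample that closes the interval containing t.
        i = min(bisect_right(self.times, t), len(self.times) - 1)
        t0, t1 = self.times[i - 1], self.times[i]
        v0, v1 = self.values[i - 1], self.values[i]
        return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


class ZoneCoverage:
    """2D discrete coverage: each square grid cell carries a category label."""

    def __init__(self, origin_x, origin_y, cell_size, rows):
        self.origin_x = origin_x
        self.origin_y = origin_y
        self.cell_size = cell_size
        self.rows = rows  # rows[0] is the row of cells with the lowest y

    def evaluate(self, x, y):
        """Return the label of the cell containing (x, y); never interpolates."""
        col = int((x - self.origin_x) // self.cell_size)
        row = int((y - self.origin_y) // self.cell_size)
        if not (0 <= row < len(self.rows) and 0 <= col < len(self.rows[row])):
            raise ValueError("position lies outside the coverage domain")
        return self.rows[row][col]


# Hypothetical sample data, invented for this sketch.
gauge = TimeSeriesCoverage([0, 60, 120], [10.0, 20.0, 14.0])
zones = ZoneCoverage(0.0, 0.0, 10.0, [["A1", "A2"], ["B1", "B2"]])
print(gauge.evaluate(90))      # a value between the samples at 60 s and 120 s
print(zones.evaluate(15, 5))   # the label of a single zone
```

Both classes expose the same `evaluate` operation, which is the sense in which quite different data structures can all be treated as coverages.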
Sometimes you will hear the word “coverage” used synonymously with “gridded data” or “raster data”, as in a satellite image, but this is not really accurate. You can see from the above paragraphs that non-gridded data (like a river gauge measurement, or post codes) can also be presented as coverages. Nevertheless, you will often find a bias toward gridded data in discussions (and software) that concern coverages.
Although the definition above presents a coverage as a data structure with a method, conceptually it still has spatial extent like any feature. For example, the distribution of rainfall measured by a weather radar can be thought of as a coverage; the spatial extent is defined by the limit of the weather radar's beam. Similarly, we might say in the hydrology example, where a river gauge measures water-flow values at regular sampling times, the spatial extent would be the monitoring point where the river gauge is positioned.
We say that a coverage is really just a special type of Spatial Thing with some particular properties. Often, a coverage can be a property of another Spatial Thing; referring back to hydrology, a "river segment" may have a property “flow rate” that is expressed as a coverage.
Spatial Things and coverages may be related in several ways:
A coverage can be defined using three main pieces of information:
Usually, the most complex piece of information in the coverage is the definition of the domain. This can vary quite widely as the examples above show. For this reason, coverages are often defined by the spatiotemporal geometry of their domain. You will hear people talking about “multidimensional grid coverages” or “time-series coverages” or “vertical profile coverages” for example.
A spatial relation specifies how an object is located in space in relation to a reference object. Commonly used types of spatial relations are: topological, directional and distance relations.
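As a rough illustration of the three relation types, the following Python sketch evaluates one relation of each kind for points given as latitude and longitude. The function names, the bounding-box tuple layout, the sample bounding box and the spherical Earth with a mean radius of 6371008.8 m are assumptions made for this example; a production system would normally rely on an established geometry library and would also handle boxes that cross the antimeridian.

```python
import math

EARTH_RADIUS_M = 6371008.8  # assumed mean Earth radius (spherical approximation)


def distance_m(lat1, lon1, lat2, lon2):
    """Distance relation: great-circle distance in meters (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def is_within(lat, lon, bbox):
    """Topological relation: is the point inside the box (south, west, north, east)?"""
    south, west, north, east = bbox
    return south <= lat <= north and west <= lon <= east


def is_north_of(lat_a, lat_b):
    """Directional relation: does A lie further north than the reference B?"""
    return lat_a > lat_b


# Point taken from the Eddystone Lighthouse geometry; the box is hypothetical.
lighthouse = (50.184, -4.268)
sample_box = (50.0, -5.0, 51.0, -4.0)
print(is_within(*lighthouse, sample_box))
print(is_north_of(lighthouse[0], 49.0))
print(round(distance_m(0.0, 0.0, 0.0, 1.0)))  # one degree along the equator
```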
One of the most fundamental aspects of publishing spatial data, data about location, is how to express and share the location in a consistent way. In many cases where you are publishing data for use by the wider Web community the use of latitude and longitude coordinates (Lat and Long) is most appropriate. As latitude and longitude coordinates are global they are well suited to many applications: perfect for locating your favorite coffee shop, geocoding a photograph or capturing an augmented reality Pokemon hiding in your local park.
Users of spatial data are often interested in the third dimension too: vertical elevation (or altitude). For most situations, we can consider elevation to be the vertical distance above (or below) mean sea level. The elevation is most often expressed in meters (but this can vary between CRS definitions) and is provided as a third value in a coordinate position.
As with everything to do with spatial data, things can get more complicated. One of the most common problems occurs because not all Coordinate Reference Systems (CRS) agree on how to express latitude and longitude coordinates. Some CRS order the coordinates Lat/Long while others use Long/Lat; some use decimal degrees while others use degrees, minutes and seconds (dms). Axis order mistakes can mean the difference between, say, a position in the Netherlands or somewhere in Somalia, while encoding coordinates in decimal degrees when dms is expected can lead to positional errors on the kilometer scale.
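To make the dms issue concrete, here is a minimal Python sketch of the conversion from degrees, minutes and seconds to decimal degrees, followed by a rough estimate of the error introduced when a dms value is misread as a decimal number. The function name, the sample coordinate and the figure of about 111.32 km per degree of latitude are assumptions chosen for illustration.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees, minutes, seconds plus N/S/E/W to signed decimal degrees."""
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    if degrees < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("degrees must be >= 0; minutes and seconds in [0, 60)")
    value = degrees + minutes / 60 + seconds / 3600
    # South and west are negative by the usual convention.
    return -value if hemisphere in ("S", "W") else value


# Hypothetical position written as 52°22'12"N, 4°53'24"E.
latitude = dms_to_decimal(52, 22, 12, "N")
longitude = dms_to_decimal(4, 53, 24, "E")

# Misreading the same latitude digits as the decimal number 52.2212.
misread_latitude = 52.2212
KM_PER_DEGREE_LATITUDE = 111.32  # approximate value, assumed for this sketch
error_km = abs(latitude - misread_latitude) * KM_PER_DEGREE_LATITUDE
print(latitude, longitude, error_km)
```

In this example the misreading shifts the position by more than ten kilometers, which is the kind of error that explicit encoding information helps to avoid.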
Therefore, it is very important to provide explicit information to your users about how coordinates are encoded. For example, this snippet of results from the Google Geocoding API makes explicit which is the latitude and which is the longitude coordinate.
"formatted_address" :"1600 Amphitheatre Parkway, Mountain View, CA 94043, USA","geometry" : {"location" : {"lat" :37.42248,"lng" : -122.08425 }, "location_type" :"ROOFTOP","viewport" : {"northeast" : {"lat" :37.42382,"lng" : -122.082901 }, "southwest" : {"lat" :37.42113,"lng" : -122.08560 } } },
Other mechanisms include using a data format that specifies how the coordinates are included (such as GeoJSON [RFC7946] where section 4. Coordinate Reference System specifies coordinate order of longitude and latitude using units of decimal degrees) or by having your data explicitly reference the CRS definition you're using. See Best Practice 8: State how coordinate values are encoded for more information.
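One way to reduce axis order mistakes in your own code is to accept latitude and longitude as named values and emit them in the order the target format requires. The Python sketch below builds a GeoJSON Point in longitude, latitude order with an optional elevation as the third value; the function name and the range checks are assumptions made for this example.

```python
import json


def geojson_point(latitude, longitude, elevation=None):
    """Build a GeoJSON Point geometry; GeoJSON positions are [lon, lat(, elevation)]."""
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90 decimal degrees")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180 decimal degrees")
    coordinates = [longitude, latitude]
    if elevation is not None:
        coordinates.append(elevation)
    return {"type": "Point", "coordinates": coordinates}


# Keyword arguments make the meaning of each number explicit at the call site.
point = geojson_point(latitude=50.184, longitude=-4.268)
print(json.dumps(point))
```

Calling the function with keyword arguments means that a swapped pair is visible in the source code, and the range check catches a latitude outside the valid interval.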
Now let's get a little more technical and discuss coordinate reference systems themselves.
Latitude, longitude and elevation measurements express a position on the surface of the Earth. But to define this position we need to state where we are making the measurements from (e.g. the equator, the prime meridian and the approximated surface of the Earth, or geoid) and consider the shape of the Earth (a flattened sphere with lumps and bumps, but for convenient mathematical operations, usually approximated to an ellipsoid). This information is used to define the geodetic datum which provides the basis of every coordinate reference system.
Where your geospatial data has geometries defined as points, lines, and polygons (i.e. vector data), publishing in the World Geodetic System 1984 (WGS 84) Coordinate Reference System will help people to integrate data with mass-market Web applications, tools and libraries, thereby increasing the usefulness of that data for a large community of potential users. Since WGS 84 is also used by GPS, it's handy for all those mobile apps!
Most people can stop reading now, but of course, there are cases where WGS 84 is not appropriate — for example, when working with geo-referenced imagery.
In many parts of the world location data has been collected using local coordinate systems that are specific to particular countries or regions. These local coordinate systems may use projected measurements defined on a flat, two-dimensional surface (which are easier to use for calculating distances than angular measurements and are essential when making topographic maps).
Users of spatial data should be aware that projected coordinate systems distort distances and angular measurements and accordingly affect how the true size of countries and other large-scale entities is perceived. CNN explores some of the challenges relating to map projections in their article What's the real size of Africa?.
So, it may be that you have information in a projected CRS, rather than global latitude and longitude. What should you do? You can publish data 'as is' in one of these many projected CRS, but you need to tell users which particular CRS is being used. A good directory of Coordinate Reference Systems is maintained by the International Association of Oil and Gas Producers: the EPSG Geodetic Parameter Dataset.
It is common for a CRS to be described by its EPSG code. For example, 2-dimensional WGS 84 (Lat/Long) is EPSG:4326, 3-dimensional WGS 84 (Lat/Long/Elevation) is EPSG:4979 and OSGB 1936 / British National Grid (a national projected CRS, based on the OSGB 1936 datum) is EPSG:27700.
The authoritative source of CRS definitions is the EPSG registry. Those definitions are also available from the Open Geospatial Consortium CRS Register. Other Web services like Spatial Reference also provide definitions as used in popular software, but those definitions, especially the ones published in Well Known Text (WKT) version 1 format, sometimes differ from official EPSG definitions in axis order and units of measurement.
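As a rough illustration of referencing a CRS explicitly, the following Python sketch parses an EPSG code and builds a URI for it. The labels in `KNOWN_CRS` are taken from the text above. The URI pattern is an assumption based on the form commonly used by the OGC definitions server, so check the register before relying on it. All function names are hypothetical.

```python
import re

# Codes and labels quoted in the surrounding text; not a complete registry.
KNOWN_CRS = {
    4326: "WGS 84 (2D, Lat/Long)",
    4979: "WGS 84 (3D, Lat/Long/Elevation)",
    27700: "OSGB 1936 / British National Grid",
}


def parse_epsg(identifier):
    """Return the integer code from a string such as 'EPSG:4326'."""
    match = re.fullmatch(r"EPSG:(\d+)", identifier.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"not an EPSG identifier: {identifier!r}")
    return int(match.group(1))


def crs_uri(identifier):
    """Build a CRS URI (assumed pattern, see the note above the code)."""
    return f"http://www.opengis.net/def/crs/EPSG/0/{parse_epsg(identifier)}"


def describe(identifier):
    """Return a human-readable label, or a hint when the code is not in the table."""
    code = parse_epsg(identifier)
    return KNOWN_CRS.get(code, "unknown CRS, look it up in the EPSG registry")
```

Publishing the URI alongside the coordinates lets a consumer look up the axis order and units instead of guessing them.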
You can re-project your coordinates to WGS 84 using many available tools online. So, for example, the location at 516076, 170953 in British National Grid (EPSG:27700) coordinates is -0.331841, 51.425708 in WGS 84 Long/Lat. This conversion is a useful step as it makes your data more accessible to global users. So, if you can do so, it is helpful to publish data in both local (projected) and global coordinates.
However, given that satellite imagery consists of data pixels projected onto a flat surface (i.e. raster data), it is commonplace for raster-type spatial data to be expressed in a projected coordinate reference system to avoid the unnecessary (and potentially costly) conversion of pixel positions to angular measurements. Web-Mercator (EPSG:3857), a global projected CRS, is used in the majority of Web-mapping applications and has therefore become the de facto Web-standard CRS for publishing raster data.
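To give a feel for what a projection does, here is a minimal Python sketch of the spherical Mercator formulas that underlie Web-Mercator. It is an illustration under stated assumptions, not a replacement for a geodetic library: it treats WGS 84 longitude and latitude as positions on a sphere whose radius is the WGS 84 semi-major axis, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, used as the sphere radius

# Conventional cut-off latitude that makes the projected world a square.
MAX_LATITUDE = math.degrees(math.atan(math.sinh(math.pi)))


def to_web_mercator(longitude, latitude):
    """Project decimal-degree longitude/latitude to x/y in meters."""
    if abs(latitude) > MAX_LATITUDE:
        raise ValueError(f"latitude outside the Web-Mercator range: {latitude}")
    x = EARTH_RADIUS_M * math.radians(longitude)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(latitude) / 2))
    return x, y


def from_web_mercator(x, y):
    """Invert the projection, returning decimal-degree longitude/latitude."""
    longitude = math.degrees(x / EARTH_RADIUS_M)
    latitude = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return longitude, latitude
```

Note that the y value grows ever faster towards the poles, which is the distortion of sizes that the discussion of map projections warns about.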
Re-projecting to a better-known CRS is often a necessary step if you are publishing data in the form of engineering or Computer Aided Design (CAD) drawings of a new building or road layout, for example. Usually these drawings are made using a very local coordinate reference system for the site itself, so the data will need to be re-projected to “fit” with existing data.
So, we are now at the point where almost everyone publishing spatial data on the Web can stop reading. But for those with specific requirements concerning high-precision locations, there are a few more topics that need to be mentioned.
If you need to be able to measure in terms of a few centimeters or less then things are more complicated. With this level of precision required, you need to consider a more sophisticated model of the shape of the Earth and consider plate tectonics.
For these more complex use cases other reference systems with alternative geodetic datums are used. The geodetic datum can be thought of as the model of the Earth's surface over which the coordinate reference system is applied. Different datums use different models for the precise shape and size of the Earth to provide more accurate horizontal or vertical measurements at different positions on the globe (because depending on your location, different ellipsoids will provide a better approximation of the local Earth's surface, but this is at the expense of a poorer match elsewhere).
While WGS 84 provides a reasonable fit at all points on the Earth's surface, many other datums are defined for improved fit within a regional or national area. For example, in Europe a system called ETRS89 (EPSG:4258) can be used instead of WGS 84, while in North America a similar system called NAD-83 (EPSG:4269) is used. So, it might be that you have measurements made using these reference systems. Here the best practice is once more to be explicit in describing the CRS used, but also to be careful when re-projecting to different systems, as the required accuracy may be lost.
Finally, another issue is that points on the surface of the Earth are actually moving relative to the coordinate system, due to geologic processes. You may think this is of interest only to geologists, but when I tell you that Australia has moved around 1.5 m since the framework was last reset 20 years ago, and remind you that we are entering the age of self-driving cars, then you will probably think again. Re-calculating the datum from time to time, or maybe continuously such as in the case of the dynamic New Zealand Geodetic Datum (NZGD2000), really does matter for some applications. See Best Practice 7: Choose coordinate reference systems to suit your user's applications for more information.
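The arithmetic behind the Australian example can be made explicit. The sketch below is illustrative only: the 1.5 m and 20 year figures come from the paragraph above, while the 0.3 m tolerance is a hypothetical application requirement chosen for this example, and the function names are invented.

```python
def mean_drift_rate(total_drift_m, elapsed_years):
    """Average movement per year relative to a static datum, in meters."""
    return total_drift_m / elapsed_years


def years_until_drift(tolerance_m, rate_m_per_year):
    """Time after a datum reset before the drift exceeds a given tolerance."""
    return tolerance_m / rate_m_per_year


# Figures quoted in the text: around 1.5 m over 20 years.
rate = mean_drift_rate(1.5, 20)

# Hypothetical tolerance of 0.3 m for an application that needs lane-level accuracy.
years = years_until_drift(0.3, rate)
```

Under these assumptions the drift is about 7.5 cm per year, so a 0.3 m tolerance would be exhausted roughly four years after the datum was last reset.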
The term ‘Linked Data’ refers to an approach to publishing data that puts linking at the heart of the notion of data, and uses the linking technologies provided by the Web to enable the weaving of a globally distributed database. By identifying real-world entities — be they Web resources, physical objects such as the Eiffel Tower, or even more abstract things such as relations or concepts — with URLs, data can be published and linked in the same way Web pages can [LDP-PRIMER].
The 5-star scheme at 5 Star Data states:
★ make your stuff available on the Web (whatever format) under an open license
★★ make it available as structured data (e.g., Excel instead of an image scan of a table)
★★★ make it available in a non-proprietary open format (e.g., CSV as well as Excel)
★★★★ use URIs to denote things, so that people can point at your stuff
★★★★★ link your data to other data to provide context
We think that the concept of Linked Data is fundamental to the publishing of spatial data on the Web: it is the links that connect data together that are foundational to the Web of data.
These best practices promote a Linked Data approach.
Sources such as the Best Practices for Publishing Linked Data [LD-BP] assert a strong association between Linked Data and the Resource Description Framework (RDF) [RDF11-PRIMER]. Yet we believe that Linked Data requires only that the formats used to publish data support Web linking (see [WEBARCH] section 4.4 Hypertext). 5 Star Data (based on [5STAR-LOD]) asserts only that data formats are open and non-proprietary (★★★), and implies the need for data formats to support the use of URIs as identifiers (★★★★) and Web linking (★★★★★).
In fact, our approach to linked data is well described by an alternative 5-star scheme from [WEB-DATA]:
★ Linkable: use stable and discoverable global identifiers
★★ Parseable: use standardized data metamodels such as CSV [RFC4180], XML [XML11], RDF [RDF11-PRIMER], or JSON [RFC7159].
★★★ Understandable: use well-known or at least well-documented vocabularies/schemas
★★★★ Linked: link to other resources whenever possible
★★★★★ Usable: label your document with a license
Within this document, we include examples that use RDF and related technologies such as triple stores and SPARQL [SPARQL11-OVERVIEW] because we see evidence of their use in real-world applications that support Linked Data. However, we must make clear to readers that there is no requirement for all publishers of spatial data on the Web to embrace the wider suite of technologies associated with the Semantic Web; we recognize that in many cases, a Web developer has little or no interest in the toolchains associated with the Semantic Web because they add complexity to any Web-centric solution.
Although we think that Linked Data need not necessarily require the use of RDF, it is probably the most common representation. We note that [JSON-LD] provides a bridge between those worlds by defining a data format that is compatible with RDF but relies on standard JSON tooling.
Furthermore, as the examples in this document illustrate, we often see a ‘hybrid’ approach being used in real-world applications: using RDF to work with graphs of information that interlink resources, while relying for performance reasons on other technologies to query and process the spatial aspects of that information.
Finding, accessing and using data disseminated through spatial data infrastructures (SDI) based on OGC Web services is difficult for non-expert users. There are several reasons, including:
However, spatial data infrastructures are a key component of the broader spatial data ecosystem. Such infrastructures typically include policies, workflows and tools related to the management and curation of spatial datasets, and provide mechanisms to support the rich set of capabilities required by the expert community. Our goal is to help spatial data publishers build on these foundations to enable the spatial data from SDIs to be fully integrated with the Web of data.
When your starting point is a spatial data infrastructure, you should at least read the following best practices. These provide the most important extra steps that should be taken to bring spatial data from spatial data infrastructures to the Web:
The rest of the best practices provide more detail on specific aspects of publishing spatial data on the Web, such as metadata, geometries, CRS information, versioned data, and so on.
Raised during discussion of WebVMT at OGC Toulouse TC with @rjksmith
WebVMT files provide map presentation and annotation synchronised to video content, including animation support, and more generally any form of geolocation data that is time-aligned with audio or video content.
This creates a potential issue: facial recognition combined with geospatial information makes it possible to know exactly who and where someone is. Therefore, should the best practices talk about how geo should be integrated into the Web in an ethical way?
Spatial data, like any other data, should be published on the Web. By this we mean more than providing spatial data file downloads or services; for data to be on the Web, the resources it describes need to be identified using HTTP URIs, be published in such a way that they are indexable by search engines, and be connected, or linked, to other resources. This makes the data easy to find and easy to access for non-specialist users: the spatial data becomes integrated within the wider Web of data.
As a first step in publishing your spatial data on the Web, you should assign a URI to each of your datasets (see [DWBP] Best Practice 9: Use persistent URIs as identifiers of datasets).
Deciding whether your spatial data is a single dataset or not is somewhat arbitrary. To decide this, it is often useful to consider attributes such as the license under which the data will be made available, the refresh or publication schedules, the quality of the data and the governance regime applied in managing the data. Typically, all of these attributes should be consistent within a single dataset.
[VOCAB-DCAT-2] provides a useful definition of the dataset that supports this approach: “A collection of data, published or curated by a single agent, and available for access or download in one or more representations.”
However, we need to look inside the datasets at the resources described within your data. If you want these resources to be visible within the Web’s information space, by which we mean that others can refer to or talk about those resources, then they must also be assigned URIs (see [DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets). These URIs are like 'Web-scale foreign keys' that enable information from different sources to be stitched together.
The primary topics of any spatial dataset are Spatial Things: anything from physical things like people, places, and post boxes to abstractions such as administrative areas. Each Spatial Thing will be described by a set of attributes and usually at least one geometry. How your spatial data is structured will depend on the vocabulary or data model you use (see 13.2.1 Spatial data encoding for further details on vocabulary choice). This will determine the types of entities that, along with the Spatial Things themselves, are important enough to be given identifiers so that statements can be made about them. Geometry objects are an example of an entity that is often assigned a unique identifier so that they can be referenced or reused.
Given the widespread use of the Hyper Text Transfer Protocol (HTTP) on the Web, we SHOULD use HTTP URIs to identify resources in spatial data.
This is a fundamentally different approach to that of typical data publication today, where the dataset is (often) globally identified, but individual Spatial Things ("features" in SDI parlance) are assigned local identifiers which may, or may not, be persistent.
We consider identifiers in the Web’s information space to be unaffected by the choice to serve HTTP content securely or not. For example,
both identify the same Spatial Thing, in this case the South American country of Suriname.
Use stable HTTP URIs to identify Spatial Things, re-using commonly used URIs where they exist and it is appropriate to do so.
To publish spatial data on the Web, we need to stitch the Spatial Things and their corresponding entities into the Web’s information space, contributing to the Web of data. First: [WEBARCH] Good Practice: Identify with URIs states that "agents should provide URIs as identifiers for resources". Second: the 5 Star Data scheme states: "★★★★ use URIs to denote things, so that people can point at your stuff".
Resources identified with HTTP URIs can be specified as the target of links within the Web’s global information space, enabling information to be related, combined and referred to. This is the fundamental basis of 5★ Linked Data: "★★★★★ link your data to other data to provide context".
The HTTP URIs used to identify Spatial Things need to be stable or persistent so that relationships that link them to other resources don’t break.
Spatial Things become part of the Web’s global information space, enabling them to be linked with other Spatial Things and other resources and for those links to be durable. In other words, spatial data becomes part of the Web of Data.
Your data is more findable.
[DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets provides directly applicable guidance when identifying resources. It advises:
However, we need to look a little more closely at how and where to apply that guidance.
The Web of data is made up of subjects and objects; the things we talk about and the things we refer to. For example, we could say that Anne Frank's House (the subject) is within the Municipality of Amsterdam (the object). In RDF [RDF11-PRIMER], this looks like:
When considering HTTP URIs for objects (e.g. the target of our hyperlinks) it makes sense to reuse existing identifiers. After all, you are trying to stitch your spatial data into the Web so that we can "link your data to other data" and achieve a ★★★★★ rating! Organizations such as DBPedia, GeoNames and government mapping and cadastral authorities (that publish national registers of addresses, buildings, etc.) are good sources of stable, authoritative URIs. The steps described for discovering existing vocabularies [LD-BP] can be readily adapted to find more. For more details about how you might link to these authoritative identifiers, see 13.1.3 Linking data.
However, HTTP URIs for subjects (e.g. the resource that we want to make statements about) can be trickier. If you are working purely with data then you can reuse existing URIs minted by other authorities for your subject URIs. But publishing spatial data on the Web means that the URIs for each Spatial Thing should dereference to Web pages or data resources that provide useful information (see Best Practice 2: Make your spatial data indexable by search engines). An HTTP request will be directed to a host Web server, identified by the internet domain name (or IP address) in the requested URI. If you use a URI with an internet domain name where you have no control over how the Web server behaves, then there is no way for your statements to be included in the Web server's response.
To take control of how information about Spatial Things is presented, data publishers need to assign their subject Spatial Things HTTP URIs from an internet domain name where they have authority over how the Web server responds. Typically, this means minting new HTTP URIs. It's also worth considering that the use of a particular internet domain may reinforce the authority of the information served. For example, a URI for Anne Frank's House is:
. The use of the internet domain registered to the Cultural Heritage Agency of the Netherlands gives the definition authenticity.
The need to control what information is provided about a given Spatial Thing means that it is not uncommon for a Spatial Thing to be identified by multiple HTTP URIs. The equality between two URIs that refer to the same resource can be stated using a property such as owl:sameAs. Care must always be taken when using owl:sameAs to determine that the two URIs actually refer to the same resource, rather than two resources that are similar. Warning: don't say it if you're not sure it's true!
For more information about the types of properties that can be used to link between Spatial Things, and between Spatial Things and other resources, see 13.1.3 Linking data.
When minting your own URIs, [DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets cites the advice from GS1's SmartSearch Implementation Guideline [GS1] which suggests that your URIs should include the type of resource that is being identified to help human readability. Also, given the need for the HTTP URIs for Spatial Things to be used throughout their lifetime (and perhaps beyond) you should give some thought to designing a URI that is persistent.
This URI identifies the Amsterdam Central train station:
This URI was minted using the recommendations in the Dutch URI strategy. Although minted by the Kadaster, they chose to use the domain ‘’ (which translates to ‘base’) because this is expected to be a more persistent name than ‘’. Even though the Kadaster is over 100 years old, organization names are not considered persistent in general as organizations may merge or their names may change. ‘top10nl’ is the name of the dataset, and ‘gebouw’ means ‘building’, giving the human reader of this URI a clue of what is being identified. The last part of the URI is the building number from the dataset.
[DWBP] Best Practice 9: Use persistent URIs as identifiers of datasets cites the European Commission's Study on Persistent URIs [PURI] as a good source from which to gain insights about designing persistent URIs.
When an HTTP URI is dereferenced, the server will respond with a sequence of bytes: by its nature, HTTP can only serve information resources such as Web pages or JSON [RFC7159] documents. Yet a Spatial Thing is actually a real or conceptual phenomenon: a lake is made from water, not information! Using a single URI to refer to both the Spatial Thing and the page/document that describes the Spatial Thing introduces a URI collision. This can impose a cost in communication due to the effort required to resolve ambiguities. [URLs-in-data] has more to say on this subject, including recommending URI design patterns that enable differentiation between the Spatial Thing and the page/document that describes it.
However, in most cases using a single URI for both the Spatial Thing and the page/document is simpler to implement and meets the expectations of most end-users. As stated in [WEBARCH] section 2.2.3 Indirect Identification, identifiers are commonly used in this way. There is no obligation to distinguish between the Spatial Thing and the page/document unless your application requires this.
While there is a cost to this conflation, problems can be mitigated by avoiding statements that confuse the Spatial Thing and the page/document, such as “Uluru is available in KML format”; e.g. <
> dcterms:hasFormat <
> .
This statement is clearly not true; an ancient monolith covering more than 3 km² cannot be provided in XML [XML11]!
HTTP URIs for Spatial Things should not include any indication of the data format used to encode the page/document as this may change as your systems evolve. That said, you may wish to provide a set of complementary resources that specify a particular format as part of your content negotiation strategy. For example, the URI
dereferences to provide an RDF/XML encoding of the information about Uluru in the Northern Territory of Australia (
[DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets notes that URIs can be long. You may need to define identifiers that are locally unique within your spatial dataset and provide a mechanism to programmatically convert each local identifier to a URI. For example, the Metadata Vocabulary for Tabular Data [TABULAR-METADATA] achieves this using URI Templates as described in [RFC6570].
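A minimal Python sketch of converting a local identifier to a URI is given below. It covers only the simplest form of [RFC6570] expansion (Level 1, a plain `{name}` expression). The template, its domain and the `expand` helper are hypothetical and chosen purely for illustration.

```python
import re
from urllib.parse import quote

# Hypothetical template: the domain and path segments are placeholders.
BUILDING_TEMPLATE = "https://data.example.org/id/building/{local_id}"


def expand(template, **variables):
    """Expand {name} expressions using simple string expansion (RFC 6570 Level 1)."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            # RFC 6570 would expand an undefined variable to an empty string;
            # this sketch raises instead, so that no incomplete URI is minted.
            raise KeyError(f"no value supplied for {name!r}")
        # Percent-encode everything except unreserved characters.
        return quote(str(variables[name]), safe="")

    return re.sub(r"\{([A-Za-z0-9_]+)\}", substitute, template)


uri = expand(BUILDING_TEMPLATE, local_id="102625209")
```

Keeping the template in one place means that the short local identifiers stay in the data, while the rule for turning them into URIs can be published once alongside the dataset.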
It is also good practice to use a redirection service to hide complex and potentially changing service end-point URLs, such as those of an OGC API Features [OAF1] service, behind well-designed URIs. This means that users don’t need to be aware of the complexities of the API or changes in endpoint URIs or API versions to request information about a particular Spatial Thing. For example, the URI
could be used as proxy for the OGC API Features request
Finally, while it is simple to use a query-pattern URL to serve information about a resource identified with a URI from a third-party internet domain, e.g.
, these URLs are unsuitable as persistent identifiers. More often than not, your intended users will dereference the "official" URI, e.g.
. That said, this kind of search operation does provide a useful mechanism to find particular Spatial Things. See Best Practice 13: Expose spatial data through 'convenience APIs' for further details.
Check that within the data Spatial Things, such as countries, regions and people, are referred to by HTTP URIs or by short identifiers that can be converted to HTTP URIs. Ideally, dereferencing the URIs should return the Spatial Thing; however, they have value as globally scoped variables whether they dereference or not.
Relevant requirements: R-Linkability, R-GeoReferencedData, R-IndependenceOnReferenceSystems.
Search engines are the common starting point for people looking for content on the Web. However, as far as search engines are concerned, something is only 'on the Web' if it has an HTTP URI and when this URI is dereferenced, information is returned (usually in the form of a Web page).
Search engines should be able to crawl spatial data on the Web and index Spatial Things for direct discovery by users.
In both the GNAF Best Practice implementation report and NRW Best Practice implementation report, an issue has been encountered in relation to Best Practice 2: Make your spatial data indexable by search engines, namely that there are millions of Spatial Things in each dataset and, whilst the implementations create machine-readable and indexable data for each Spatial Thing, the use of pagination to make the Spatial Things navigable for humans seems to impact the indexability.
In discussions in Lyon there were also some questions about demonstrating the value of having each individual Spatial Thing indexable. However, I believe there was reasonable consensus that allowing users to find a specific Spatial Thing, and allowing machines to create links between datasets containing information about specific Spatial Things, has clear value, although the description of the value of linking between Spatial Things or data about the same Spatial Thing could be improved in Best Practice 2.
I believe @cportele is going to look into the sitemaps approach discussed in Best Practice 2. We may also benefit from discussion on this issue from Ed Parsons.
In SDIs, information about spatial datasets is published as authoritative metadata records and collated in Web-based catalogues. This approach causes several problems:
It is widely understood that search engines are the common starting point for people looking for content on the Web. By publishing spatial data in a way that enables their crawlers to index spatial datasets including each Spatial Thing, the fidelity of search results should improve. Users will be able to directly search for specific entities rather than having to look for a dataset and then parse through it; e.g. to search for "Anne Frank’s House" (
) rather than looking for a dataset about "Cultural Heritage in Amsterdam" and hoping that it contains a reference to what you’re interested in.
At present, spatial information is not widely exploited by search engines. However, by increasing the volume of spatial information presented to search engines, and the consistency with which it is provided, we expect search engines to begin offering spatial search functions. We already see evidence of this in the form of contextual search, such as prioritization of search results from nearby entities. In addition, search engines are beginning to offer more structured, custom searches that return only results that include certain [SCHEMA-ORG] types, like Dataset, Place or City.
Information about spatial datasets and things is indexed by search engines.
Users can find Spatial Things using common search engines.
That is, your data is more findable.
In general, you need to:
The Web-page for the dataset is an entry point for humans to browse and for the search engines to crawl your data. This landing page should provide descriptive metadata that helps users evaluate whether the dataset meets their needs (see Best Practice 15: Include spatial metadata in dataset metadata and [DWBP] Best Practice 2: Provide descriptive metadata), and may provide links to other service end-points, APIs or tools that will help a user work with the dataset. When metadata for datasets has already been created, e.g. to create a record in a metadata catalogue or to describe the data available from a service end-point, this information should be re-used by publishing it in a Web-friendly way that humans and Web-crawlers can consume. The landing page should be indexable by the search engines so that it can be discovered too!
To enable humans and Web-crawlers to find HTML pages for the Spatial Things, the "landing page" needs to include hyperlinks that can be followed. Where you have a larger collection of Spatial Things, you should support paging through the collection.
You may also consider using Sitemaps to direct the Web-crawler. For larger datasets, multiple sitemaps can be provided and grouped by a sitemap index file. If a dataset contains millions of Spatial Things (e.g. a building dataset with national extent), generating and maintaining the sitemaps may require a custom implementation to keep the sitemaps synchronized with the set of Spatial Things.
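The following Python sketch shows one way such a custom implementation might split a long list of Spatial Thing URIs into several sitemap files plus a sitemap index. The URIs, file names and function names are hypothetical. The namespace and the limit of 50,000 URLs per file come from the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50000  # upper limit per file in the Sitemaps protocol


def chunk(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_sitemap(uris):
    """Serialize one sitemap file listing the given URIs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for uri in uris:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = uri
    return ET.tostring(urlset, encoding="unicode")


def build_sitemap_index(sitemap_urls):
    """Serialize the index file that groups the individual sitemaps."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


def build_all(uris, sitemap_base_url, per_file=MAX_URLS_PER_SITEMAP):
    """Return the index document and a dict of file name to sitemap document."""
    files = {}
    for number, part in enumerate(chunk(uris, per_file), start=1):
        files[f"sitemap-{number}.xml"] = build_sitemap(part)
    index = build_sitemap_index([sitemap_base_url + name for name in files])
    return index, files
```

Regenerating the files from the current list of identifiers on each data release is the simplest way to keep them synchronized; an incremental approach would need to track additions and removals.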
For very large datasets paging through thousands of pages is not useful for a human either. Consider supporting filtering and/or organizing the Spatial Things into subsets, as described in 13.4 Spatial data access.
In the case of an address dataset, you could organize the Spatial Things (the addresses) by municipality, postcode and street name to help a human user reach a building with a few clicks.
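As a rough sketch of the address example, the Python function below arranges addresses into a municipality, postcode and street hierarchy that a set of browse pages could be generated from. The record keys and the URI are hypothetical; the sample address values are those of Anne Frank's House as used elsewhere in this document.

```python
def build_browse_tree(addresses):
    """Group address URIs by municipality, then postcode, then street name."""
    tree = {}
    for address in addresses:
        postcodes = tree.setdefault(address["municipality"], {})
        streets = postcodes.setdefault(address["postcode"], {})
        streets.setdefault(address["street"], []).append(address["uri"])
    return tree


# Hypothetical records; the URI is a placeholder.
sample = [
    {
        "municipality": "Amsterdam",
        "postcode": "1016GV",
        "street": "Prinsengracht",
        "uri": "https://data.example.org/id/address/1",
    },
]
tree = build_browse_tree(sample)
```

Each level of the resulting dictionary corresponds to one click for a human user and one set of followable hyperlinks for a crawler.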
A pre-condition for this best practice is Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things as persistent identifiers are essential to support reliable indexing and linking. Traditionally spatial datasets have not been maintained with stable identifiers for Spatial Things, but to share spatial data on the Web stable identifiers are a must. Sharing spatial data is more than "just" making the dataset available on the Web.
Each Web-page, and the hyperlinks used to relate the Spatial Things to the dataset landing page, can likely be generated programmatically from the data you hold about the Spatial Thing, either directly from the data or by using an API that makes the data available on the Web.
Possible implementation approaches for addressing this best practice in the context of an existing SDI are discussed in more detail in Best Practice 13: Expose spatial data through 'convenience APIs'. Examples include using a proxy tool like ldproxy, or mapping the data in the SDI dynamically to crawlable resources on the Web using the [R2RML] standard and Linked Data Publication tools. Both approaches generate crawlable data from Spatial Things in your spatial datasets at query time and allow the data on the Web to be enriched with additional information and links.
It is important to keep in mind that the HTML representations should not mainly be designed for search engines, but they should present the data in a clear and understandable way to human users. The page about the Spatial Thing should be useful to a user and encourage others to link to the page when they share other information about the Spatial Thing. This typically will also improve the ranking of these pages in search results.
The Nanaimo Parks Search, Canada provides a landing page and one page per park. The landing page offers a search capability and the option to select from a map. This data is indexed; a search for, for example, "Planta Park, Nanaimo" in a popular search engine returns the Nanaimo data for this Spatial Thing as one of the first results.
The Bathing Water Quality Explorer for England provides a landing page and one page per site. Sites can be searched, selected from a list or in a map.
In both cases, the pages of the Spatial Things are generated from the underlying data at request time.
The property Web-pages in Nanaimo also use [MICRODATA] annotations using [SCHEMA-ORG], which is discussed below.
In addition to exposing the spatial data as linked HTML Web-pages, indexing by search engines can be further enhanced by incorporating a description of the Spatial Thing as structured markup (in particular [MICRODATA] or [JSON-LD] annotations using [SCHEMA-ORG]), as this enables the search engines to make more detailed assumptions about your resource. This is helpful not only to search engines, but also to other tools that want to understand more about the semantics of the resource, for example, its location.
In [SCHEMA-ORG], a spatial dataset is a Dataset and a Spatial Thing is in general a Place or an Event. For some types of Spatial Things, more specific sub-types exist, for example City or Mountain.
Location information about a Spatial Thing is typically provided using a geometry (GeoCoordinates or GeoShape) or a PostalAddress. [SCHEMA-ORG] coordinates are restricted to WGS 84 with longitude and latitude. Supported geometry types are points, line strings, polygons, boxes and circles.
By using [SCHEMA-ORG] annotations, search engines and others can connect location information with other information, e.g. about the nature of the Spatial Thing, opening hours, contact details, etc.
The use of [SCHEMA-ORG] for spatial data is in its early days and should be understood as an "emerging practice".
This code snippet illustrates a [JSON-LD] annotation using a [SCHEMA-ORG] Dataset for an address dataset in the Netherlands that may be embedded in the HTML of the Web-page. It includes a name, a description, the spatial extent using a bounding box, the URL of the Web-page, and a link to another dataset containing this dataset. The same annotation could also be provided using [MICRODATA], but we use [JSON-LD] here as this presents the structured data in a more human-readable way.
<script type="application/ld+json">
{
  "@context": { "@vocab": "" },
  "@type": "Dataset",
  "@id": "",
  "name": "Adressen",
  "description": "INSPIRE Adressen afkomstig uit de basisregistratie Adressen, beschikbaar voor heel Nederland",
  "url": "",
  "isPartOf": { "@type": "Dataset", "url": "" },
  "keywords": "Adressen",
  "spatialCoverage": {
    "@type": "Place",
    "geo": { "@type": "GeoShape", "box": "47.975,3.053 53.504,7.24" }
  }
}
</script>
This code snippet illustrates a [JSON-LD] annotation using a [SCHEMA-ORG] Place for the address of the "Anne Frank’s House" in that dataset. It includes the location, the URL of the Web-page, and the structured postal address information.
<script type="application/ld+json">
{
  "@context": { "@vocab": "" },
  "@type": "Place",
  "@id": "",
  "url": "",
  "geo": {
    "@type": "GeoCoordinates",
    "longitude": "4.88399",
    "latitude": "52.37520"
  },
  "name": "Anne Franks House",
  "description": "Museum house where Anne Frank & her family hid from the Nazis in a secret annex, during WWII.",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "Prinsengracht 267",
    "addressLocality": "Amsterdam",
    "postalCode": "1016GV"
  }
}
</script>
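Annotations like the two above are often generated from the underlying data when the page is rendered. The following is a minimal Python sketch of that step, not a reference implementation. The function names place_jsonld and as_script_tag, the example.org URL and the use of "https://schema.org/" as the JSON-LD context are assumptions made for illustration; they are not taken from the datasets described above.

```python
import json


def place_jsonld(name, latitude, longitude, url, address=None):
    """Build a schema.org Place description as a plain dict.

    The context value is an assumption for this sketch; use whatever
    context your own pages already rely on.
    """
    doc = {
        "@context": "https://schema.org/",
        "@type": "Place",
        "@id": url,
        "url": url,
        "name": name,
        "geo": {
            "@type": "GeoCoordinates",
            "latitude": latitude,
            "longitude": longitude,
        },
    }
    if address:
        # address is a dict of PostalAddress properties, e.g. streetAddress
        doc["address"] = {"@type": "PostalAddress", **address}
    return doc


def as_script_tag(doc):
    """Serialize the dict for embedding in the HTML of a Web-page."""
    body = json.dumps(doc, indent=2, ensure_ascii=False)
    # A literal "</" inside a script element could end it early, so it is
    # written as "<\/", which is still valid JSON.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


if __name__ == "__main__":
    place = place_jsonld(
        "Anne Frank House",
        52.37520,
        4.88399,
        "https://example.org/id/place/anne-frank-house",  # hypothetical URI
        {"streetAddress": "Prinsengracht 267", "addressLocality": "Amsterdam"},
    )
    print(as_script_tag(place))
```

Generating the annotation from the same record that drives the visible page content helps keep the human-readable and machine-readable descriptions consistent.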
The Web-pages should also provide a mechanism to download data in the formats you decide to support. [DWBP] Best Practice 14: Provide data in multiple formats provides guidance.
Typically, multiple formats for a resource are supported using two mechanisms: HTTP content negotiation and format-specific file extensions added to the resource URI, like ".json", ".xml" or ".ttl". Content negotiation is the standard mechanism of HTTP, and the format-specific URIs enable the use of clickable links to the resource in a specific format.
Search engines may also index resource representations in other formats than HTML.
At the time of writing, Google is indexing [KML] documents and supporting advanced searches that are restricted to KML documents. [GML] files are also indexed, but only like any other XML [XML11] documents. JSON [RFC7159], including GeoJSON [RFC7946], is currently not indexed.
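The two mechanisms for serving multiple formats described above (format-specific file extensions and HTTP content negotiation) can be combined in a small piece of server-side logic. The Python sketch below is a simplified illustration under stated assumptions: the extension table, the default of HTML and the function names are hypothetical, and the Accept parsing is deliberately naive (it honours q-values but ignores the specificity rules of full HTTP content negotiation).

```python
# Hypothetical mapping from URI extension to media type for this sketch.
EXTENSIONS = {
    ".html": "text/html",
    ".json": "application/json",
    ".xml": "application/xml",
    ".ttl": "text/turtle",
}
DEFAULT_MEDIA_TYPE = "text/html"


def parse_accept(header):
    """Return the media ranges of an Accept header, most preferred first."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        quality = 1.0
        for field in fields[1:]:
            if field.lower().startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        ranges.append((quality, media))
    # list.sort is stable, so header order breaks ties between equal q-values
    ranges.sort(key=lambda item: -item[0])
    return [media for quality, media in ranges if quality > 0]


def choose_media_type(path, accept="*/*"):
    """Pick a representation: an explicit extension wins, else negotiate."""
    for extension, media in EXTENSIONS.items():
        if path.endswith(extension):
            return media
    supported = list(EXTENSIONS.values())
    for wanted in parse_accept(accept):
        if wanted == "*/*":
            return DEFAULT_MEDIA_TYPE
        if wanted.endswith("/*"):
            prefix = wanted[:-1]  # e.g. "text/"
            for media in supported:
                if media.startswith(prefix):
                    return media
        elif wanted in supported:
            return wanted
    return None  # a server would typically answer 406 Not Acceptable


if __name__ == "__main__":
    print(choose_media_type("/id/park/42.ttl", "text/html"))
    print(choose_media_type("/id/park/42", "application/json;q=0.9, text/turtle"))
```

Letting the extension take precedence keeps a clickable link such as ".ttl" stable regardless of what the browser sends in its Accept header.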
In 2016, these topics were analyzed in a testbed organized by Geonovum in the Netherlands. More details can be found in reports from the testbed: "Spatial Data on the Web using the current SDI" and "Crawlable geospatial data using the ecosystem of the Web and Linked Data".
The use of [SCHEMA-ORG] for describing spatial information has also been investigated in two studies: the first concerns the definition of mappings between [LOCN], [VCARD-RDF] and [SCHEMA-ORG], and the second the definition of mappings between [GeoDCAT-AP] and [SCHEMA-ORG].
The use of [SCHEMA-ORG] for describing spatial information is continually evolving; spatial data publishers should familiarize themselves with current practices. A useful Introduction to Structured Data is provided in Google's developer portal.
Using a Web browser,
Monitor the search consoles of the search engines to track progress in indexing your Web-pages and their structured data. If any errors are reported, try to fix them.
Relevant requirements: R-BoundingBoxCentroid, R-Crawlability, R-Discoverability, R-Linkability, R-MachineToMachine.
Links, in whatever machine-readable form, are important. In the wider Web, it is links that enable the discovery of Web pages: from user-agents following a hyperlink to find related information to search engines using links to prioritize and refine search results. This section is concerned with the creation and use of those links to support discovery of theSpatial Things described in spatial datasets.
For data to be on the Web, the resources it describes need to be connected, or linked, to other resources. The connectedness of data is one of the fundamentals of the Linked Data approach that these best practices build upon.
Just like any type of data, spatial data benefits massively from linking when publishing on the Web. The widespread use of links within data is regarded as one of the most significant departures from contemporary practices used within SDIs. That's why this topic is included in this Best Practice.
[DWBP] identifies Linkability as one of the benefits gained from implementing the Data on the Web best practices (see [DWBP] section 8.7 Data Identifiers, Best Practice 9: Use persistent URIs as identifiers of datasets and Best Practice 10: Use persistent URIs as identifiers within datasets). However, no discussion is provided about how to create links that can make use of those persistent URIs. This section of the document extends [DWBP] by providing a best practice about creating links between the resources described inside spatial datasets.
Bind Spatial Things into the Web of data using links to other resources, providing sufficient information for a user to determine whether the target resource specified in a link will be of use.
The 5★ rating for Linked Open Data asserts that to achieve the fifth star you must "link your data to other data to provide context". The benefits for consumers and publishers of linking to other data are listed as:
There is always a cost to traversal of a link, even if it is just a few milliseconds delay and the need to parse a few hundred or thousand bytes returned in response to an HTTP request. In many cases, such as when dealing with large datasets and complex queries, the costs incurred from traversing a link may be significant in terms of time and data volumes. Before a user or software agent decides to traverse a link, they should be able to determine whether acquisition of the target resource, or data about the target resource, will support their application goals. For example, what format can one expect the response in, what type of resource is the target and how is that target related to the source resource?
Links can be identified and traversed by humans and software agents.
Sufficient information is provided to help humans and software agents determine whether the traversal of a given link meets their goals.
Your data is more interoperable.
The ground rules for linking spatial data are the same as for any type of data.
Use formats that support Web linking (as defined in [WEBARCH] section 4.4 Hypertext)
Earlier in this document (11. Linked Data) we explained that linked data requires only that the formats used to publish data support Web linking. In other words, linking spatial data does not automatically mean the use of RDF [RDF11-PRIMER]; links can also be created, for example, using [GML], HTML or [JSON-LD]. The two key points from [WEBARCH] are:
The examples used in this best practice illustrate some of the data formats and mechanisms that support Web linking.
Follow the principles for4★ — Linked [WEB-DATA]
Always use global identifiers when linking between documents, so that link identifiers can be taken out of context and shared globally.
Links should be typed (explicitly or implicitly) so that clients can decide which link to follow when they are traversing a Web of interlinked resources to reach application goals.
HTTP/1.1 200 OK
Link: <>; rel="predecessor-version"
Content-type: application/geo+json
Connection: close

{...}
This example, using HTTP Link headers (as defined in [RFC5988]), illustrates the use of IANA link relations [IANA-RELATIONS] to define the link type. According to the IANA registry, predecessor-version points to a resource containing the predecessor version in the version history (as defined in [RFC5829] "Link Relation Types for Simple Version Navigation between Web Resources").
In simple links involving only two resources, the role, or type, of each resource is implicit and can be inferred from the link relation type. It can be useful to include other information to help users judge whether to follow a link, such as human-readable labels and hints about the target resource type. Of course, target resources are often maintained by different parties, so information provided with the links that refer to them may or may not turn out to be true when the link is traversed. For example, [RFC5988] "Web Linking" defines several additional attributes, including: hreflang, which hints at the language or languages that the target resource is available in; type, which indicates the media type expected; and title, which labels the link target such that it can be used as a human-readable identifier.
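To make the link attributes discussed above concrete, the following Python sketch formats a single Link header value carrying a relation type plus the optional type, hreflang and title hints. The function name link_header, its parameter names and the example.org target are hypothetical; this is an illustration of the header shape, not a complete implementation of [RFC5988] (for instance, it does not handle non-ASCII titles).

```python
def link_header(target, rel, media_type=None, hreflang=None, title=None):
    """Format one Link header value with optional target hints."""
    parts = [f"<{target}>", f'rel="{rel}"']
    if media_type:
        # hint only: the server of the target decides what is really returned
        parts.append(f'type="{media_type}"')
    if hreflang:
        parts.append(f"hreflang={hreflang}")
    if title:
        # escape backslashes and double quotes inside the quoted string
        escaped = title.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'title="{escaped}"')
    return "; ".join(parts)


if __name__ == "__main__":
    value = link_header(
        "https://example.org/dataset/v1",  # hypothetical target URI
        "predecessor-version",
        media_type="application/geo+json",
        hreflang="nl",
        title="Previous version",
    )
    print("Link: " + value)
```

Because the hints describe a resource that may be maintained by another party, a client should treat them as advisory and still check the response it actually receives.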
Also note that [DWBP] Best Practice 19: Use content negotiation for serving data available in multiple formats recommends the use of content negotiation to help ensure that a user or software agent is provided with useful content when they traverse a link and dereference the target resource. However, HTTP request headers are limited to specifying media type, character set, encoding (e.g. for compression), and language. There is no mechanism to request that data is provided according to a particular data model or 'profile', nor to request data in a particular coordinate reference system. This gap in current practice is discussed in 15.1 Requesting different representations of geometries.
Make links as specific as possible. If the linked resource supports fragment identification, and the link logically should be to a fragment of the resource (and not just the resource as a whole), try to use fragment identifiers when possible.
Being as specific as possible with links is important; e.g. refer to a particular Spatial Thing rather than the dataset in which that Spatial Thing is described. That said, we encourage publication of data about Spatial Things as independently resolvable resources (e.g. so that they can be accessed by search engines' Web crawlers, see Best Practice 2: Make your spatial data indexable by search engines), which means that fragment identifiers are usually not required.
Check that hyperlinks are distinguishable within the data — a string-literal that happens to contain a URL is insufficient.
Check that hyperlinks use global identifiers, preferably HTTP URIs, to identify the link target.
Check that hyperlinks use typed relationships, and that the definition of the link relation type can be located in order to determine how to interpret the hyperlink.
Relevant requirements: R-Linkability, R-MachineToMachine.
The best practices in this section take [DWBP] as a basis and further refine them to provide more specific guidance for spatial data.
This section does not elaborate on formats for publishing spatial data on the Web. The formats are basically the same as for publishing any other data on the Web: XML [XML11], JSON [RFC7159], CSV [RFC4180], RDF [RDF11-PRIMER], etc. Refer to [DWBP] section 8.8 Data Formats for more information and best practices. Refer to A. Applicability of common formats to implementation of best practices for a list of spatial data formats for the Web.
That being said, it is important to publish your spatial data with clear semantics, i.e. to provide information about the contents of your data. The primary use case for this is when you have information about a collection of Spatial Things and you want to publish precise information about their attributes and how they are interrelated. Another use case is the publication on the Web of a dataset that has a spatial component in a form that search engines will understand.
Depending on the format you use, the semantics may already be described in some form. For example, in GeoJSON [RFC7946] this description is present in the specification. When using JSON it is possible to add semantics using a [JSON-LD] @context object. For providing semantics to search engines, using [SCHEMA-ORG] is a good option, as explained in Best Practice 2: Make your spatial data indexable by search engines. In a linked data setting, the attributes of a Spatial Thing can be described using existing vocabularies, where each term has a published definition. If you can't find a suitable existing vocabulary term, you should create your own, and publish a clear definition for the new term, linking it to commonly used existing ones if possible, because this increases its usefulness. An overview and high-level comparison of RDF vocabularies / OWL ontologies for spatial data is provided in A. Applicability of common formats to implementation of best practices. We do not recommend one vocabulary because this recommendation would not remain durable as vocabularies are released or amended.
[DWBP] section 8.9 Data Vocabularies provides guidance on the topic of data modelling; determining which concepts and relationships should be used to describe your area of interest, something usually done by domain experts. Data publishers should not attempt to guess all the purposes for which someone might use or reference their data, ending up with a super-complex data model that tries to cover every possible use case. Instead, data publishers should try to help data consumers make informed decisions about the best way to use the data by providing good metadata.
In most cases, the effective use of information resources requires understanding thematic concepts in addition to the spatial ones; "spatial" is just a facet of the broader information space. For example, when the Dutch Fire Service responded to an incident at a daycare center, they needed to evacuate the children. In this case, the 2nd closest alternative daycare center was preferred because it was operated by the same organization as the one that was the subject of the incident, and they knew who all the children were.
This best practice document provides mechanisms for determining how places and locations are related — but determining the compatibility or validity of thematic data elements is beyond our scope; we're not attempting to solve the problem of different views on the same/similar resources.
That said, there is one aspect of thematic semantics that must be mentioned. The most important semantic statement you can make when publishingspatial data — or any data — is to specify thetype of a resource. ForSpatial Things, there are several types that define "spatialness" (for examples in alinked data context, seethe vocabularies table inA.Applicability of common formats to implementation of best practices). But you should also consider non-spatial aspects when designating the type of aSpatial Thing. For example, should a fire incident occur at Amsterdam Central railway station, it might seem sensible for the Municipal Fire Department to designate a type such asBuilding orStation (the Dutch Government Base Registry defines Amsterdam Central railway station, identified as
, designates both of these types). However, the Fire Departments are concerned with afire incident — not the railway station itself. The fire incident is aSpatial Thing (it has spatialextent) but it isnot the station. For example, the fire may spread to adjacent buildings. The Fire Department might designate theirSpatial Thing as having typeFireIncident or similar. Advice on how to assign a persistent identifier to the fire incident is provided inBest Practice 1: Use globally unique persistent HTTP URIs for Spatial Things, and13.1.3Linking data provides guidance on how one might relate the fire incident to other coincident Spatial Things such as Amsterdam Central railway station.
Thematic semantics are out of scope for this best practice document. For associated best practices, please refer to [DWBP] section 8.2 Metadata, Best Practice 3: Provide structural metadata; and [DWBP] section 8.9 Data Vocabularies, Best Practice 15: Reuse vocabularies, preferably standardized ones and Best Practice 16: Choose the right formalization level.
See also [LD-BP] Vocabularies.
Represent spatial data in a way that matches the needs of the target audiences.
Spatial data is used by a range of user communities, each with their own purposes, knowledge and preferred tools. Data publishers should consider which communities and purposes they want to serve and make appropriate choices for the approach to encoding data. In general terms, data usefulness is increased when it can be used for more purposes. This might involve providing data in several different formats. (See [DWBP] Best Practice 14: Provide data in multiple formats.)
Spatial data can be used easily and reliably by the target users.
Your data is more interoperable.
A high-level objective of these best practices is to highlight approaches that data publishers can take to maximize the ease of use of their spatial data via the Web and hence present data in a way that meets the needs of as wide a range of users and applications as possible.
One way of classifying the applications of spatial data is as follows:
Each of these has different needs: often it will be possible or desirable to support several of these application groups.
The main objective is to encode data in a way that recipients can easily decode and understand. To decide this, you need to consider which purpose(s) and which audience(s) you are aiming to serve and the characteristics of the data that you want to share. For example:
In Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things, we recommend the use of HTTP URIs as a way of assigning identifiers to Spatial Things. The data publisher should offer the ability to look up ('dereference') such a URI to find out useful information about that Spatial Thing in human-readable form (as well as in machine-readable formats; see the discussion below on data integration). Each Spatial Thing therefore gets its own Web page. In addition, it might be useful to have Web pages about groups of Spatial Things, but the 'page per thing' approach enables fine-grained linking of information.
To promote the discovery of such Web pages in search engines, each page should contain a clear text description of what it is, ideally in a way that distinguishes it from pages about other similar Spatial Things. Including metadata using the [SCHEMA-ORG] vocabulary, embedded as [MICRODATA], [HTML-RDFa] or as [JSON-LD] in the <head> section of the page, can provide additional information to search engines to support more precise indexing. See Best Practice 2: Make your spatial data indexable by search engines for a more detailed discussion.
It is also very useful in such Web pages to include links to descriptions of the Spatial Thing in other formats (typically machine-readable formats) as well as links to related Spatial Things.
In most cases, a web page about a Spatial Thing should include information on its location. This can be done by providing spatial coordinates (see Best Practice 7: Choose coordinate reference systems to suit your user's applications for guidance on how to do this).
A common way of specifying the location of a building is to use its postal address. Most spatial applications require an address to be turned into spatial coordinates, so that its location can be marked on a map or compared with locations of other things; this process is known as geocoding. Although a publisher could leave this process of geocoding to the data user, ideally the publisher should take responsibility for this as they are in a better position to check the accuracy of the results. Different ways of specifying addresses can sometimes lead to errors in the geocoding process.
Other approaches can be taken to specifying location. What3words is an example of a service that assigns an alternative kind of address to a location, in this case a sequence of three common words associated with a 3 m by 3 m square on the ground. It allows every location to be given such an address, and what3words also provides a means to relate the address to latitude and longitude coordinates. As with conventional addresses, converting to coordinates is necessary for many spatial data applications (e.g. to calculate the distance between points or whether a point is inside a region), but the process of conversion is more reliable and precise.
Wikipedia includes web pages about many Spatial Things, for example Florence Cathedral. This page provides latitude and longitude coordinates for the cathedral, as well as linking to other pages about the city of Florence and the region of Tuscany. It would be better if typical Wikipedia pages about Spatial Things were explicit about the coordinate reference system used. However, a link is also provided to the 'Geohack' service which provides detailed location information, including the CRS.
A common application of spatial data on the Web is delivering map data in a tiled form, suitable for display in zoomable 'slippy maps'. The OGC's Web Map Tile Service [WMTS] is an established standard for doing this. Other approaches in common use include MBTiles or 'Tile layers' in Google Maps APIs.
Another frequent requirement is to draw markers or polygons on top of a Web map. A typical approach is for the browser to display a base map, then separately retrieve data about Spatial Things of interest, typically as GeoJSON [RFC7946], [GeoRSS] feeds, [GML] using the Simple Features profile [GML-SF] or [KML] files, then combine the two using appropriate JavaScript libraries. For applications involving boundary polygons of geographical areas, a common consideration is how to make this process efficient at different zoom levels. A high level of detail is appropriate when zoomed in, but many areas may be visible when zoomed out, and delivering boundaries of all of those at full detail can lead to very large amounts of data and hence poor performance, so simplified lower resolution versions of polygons may be required.
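One widely used technique for producing simplified, lower resolution boundaries is the Ramer-Douglas-Peucker algorithm. The text above does not prescribe a particular algorithm, so the Python sketch below is an illustrative choice rather than a recommendation. The function names are hypothetical, and the tolerance is expressed in the same units as the coordinates, which for longitude/latitude data means degrees rather than metres.

```python
import math


def _distance_to_segment(point, start, end):
    """Distance from point to the line segment start-end (planar)."""
    (px, py), (ax, ay), (bx, by) = point, start, end
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # degenerate segment, e.g. a closed ring whose ends coincide
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Drop vertices that deviate from the overall shape by <= tolerance."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, worst = 0, 0.0
    for i in range(1, len(points) - 1):
        distance = _distance_to_segment(points[i], first, last)
        if distance > worst:
            index, worst = i, distance
    if worst <= tolerance:
        return [first, last]
    # keep the farthest vertex and simplify both halves recursively
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


if __name__ == "__main__":
    boundary = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
    print(simplify(boundary, 0.1))
```

A publisher could precompute a few versions of each boundary with increasing tolerance and serve the coarser ones at lower zoom levels. Note that simplifying neighbouring polygons independently can open gaps or overlaps along shared borders, which this sketch does not address.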
See this comparison of different spatial data formats to help guide the choice of which approach is best suited to your purpose.
d3.js is a widely used JavaScript library for creating visualizations in web pages. This tutorial by Mike Bostock describes how to use D3 to work with geometrical data and display it in a web page.
Many important applications of spatial data involve combining it with other kinds of data: for example, opening times of nearby supermarkets, or statistical information on the economy of a town. Often one or more Spatial Things are at the center of the data analysis process.
Other applications involve distinguishing or selecting Spatial Things according to their non-spatial characteristics: hospitals with an emergency department or restaurants that serve Japanese food.
To enable such questions to be answered using data from different sources, it is important to describe Spatial Things using shared identifiers and vocabularies. This is described in [DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets and [DWBP] Best Practice 15: Reuse vocabularies, preferably standardized ones.
From a spatial data perspective, the question of identifiers is discussed in Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things. How to relate a Spatial Thing to its geometry is described in Best Practice 5: Provide geometries on the Web in a usable way.
A common approach to encoding data to enable data integration is Linked Data [LD-BP] and RDF [RDF11-PRIMER]. The spatial aspects of the data can either be included in the RDF data model, or the entity in question can link to an external Web resource containing the geometry in one of the standard spatial data formats. Although RDF is well-suited to important aspects of best practice, including use of URIs as identifiers and re-use of vocabularies, other data formats are also consistent with this approach. Most spatial data formats enable associating attributes of an entity alongside its geometry.
The publisher's choice of data model to represent the data will depend on what data is available and which audiences and purposes it seems most important to support. However, a reasonable general rule is that it is always useful to provide a label and a type for each entity in the data collection. (See [DWBP] Best Practice 16: Choose the right formalization level.)
Common vocabularies for describing the address or location of a Spatial Thing include: [SCHEMA-ORG], [VCARD-RDF] and [LOCN]. See this comparison of different vocabularies for describing Spatial Things to help decide which is best for your application.
Publishing explicit relationships between the Spatial Thing of interest and other related Spatial Things helps support data integration applications: for example, providing hierarchical relationships between different kinds of administrative area.
The Scottish Government makes a lot of statistical data available on the Web. Every statistical data point refers to a geographical area, identified by an HTTP URI, making it easy to compare different datasets about an area of interest. See for example this page about the City of Edinburgh Council Area.
Spatial analytics (or spatial analysis) is about deriving new insights by applying formal techniques to study Spatial Things using their topological or geometric properties. Combining spatial data with other data (see item 3 above) is a typical preparatory step before analyzing one or more datasets using spatial operators, statistical algorithms, etc.
For spatial analytics on the Web, the data should be accessible via an API as described in 13.4 Spatial data access and results should be shared using the best practices described in this document. Current spatial data infrastructures have some limitations with respect to sharing spatial data on the Web (as discussed in 12. Why are traditional Spatial Data Infrastructures not enough?). Nonetheless this approach is a well-established and powerful way of distributing spatial data, based on open standards and suited to a community of expert users. It is thus one of the options a data publisher should consider when deciding how to encode their spatial data.
In addition to publishing the data that represents the results of the analysis, maps and other forms of visualization (see item 2 above) are typically used to communicate the results.
The Meteorological Service of Canada (part of Environment and Climate Change Canada) publishes their daily climate observations as an OGC API - Features - Part 1: Core [OAF1] service. The landing page of the service can be requested with the following:
In addition, API conformance can be determined in the following way:
Finally, a listing of the collections made available through the service can be requested as follows:
For example, the following request returns the climate data observations at a given named location within a specific time period:
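The four kinds of request described above follow the resource paths defined by [OAF1]: the landing page at the service root, then /conformance, /collections and /collections/{collectionId}/items. The Python sketch below only builds the corresponding URLs. The base URL, the collection identifier and the STATION_NAME filter property are placeholders chosen for illustration and should not be read as the actual addresses or property names of the service mentioned above.

```python
from urllib.parse import quote, urlencode

BASE = "https://example.org/api"  # hypothetical service root


def landing_page_url(base=BASE):
    return f"{base}/"


def conformance_url(base=BASE):
    return f"{base}/conformance"


def collections_url(base=BASE):
    return f"{base}/collections"


def items_url(collection_id, base=BASE, bbox=None, datetime=None,
              limit=None, filters=None):
    """Build an items request using the standard bbox, datetime and limit
    parameters, plus optional collection-specific property filters."""
    params = {}
    if bbox is not None:
        # minimum longitude, minimum latitude, maximum longitude, maximum latitude
        params["bbox"] = ",".join(str(value) for value in bbox)
    if datetime is not None:
        # an instant, or an interval written as "start/end"
        params["datetime"] = datetime
    if limit is not None:
        params["limit"] = str(limit)
    params.update(filters or {})
    url = f"{base}/collections/{quote(collection_id, safe='')}/items"
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    print(landing_page_url())
    print(conformance_url())
    print(collections_url())
    print(items_url(
        "climate-daily",                     # placeholder collection id
        datetime="2001-10-30/2018-04-30",
        limit=10,
        filters={"STATION_NAME": "SOMEWHERE"},  # hypothetical property
    ))
```

A client would normally fetch the collections listing first and read the available collection identifiers from it instead of hard-coding them.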
The four main classes of application above have a wide range of requirements. To support such a wide range may require a lot of effort and cost on behalf of a data publisher. There are many aspects to the 'quality' of a spatial data publishing approach, but in general terms it relates to how well the data and approach to data delivery meet the needs of the target audience. By choosing to concentrate on only some kinds of application the publisher can keep cost down. Other factors to consider include performance (speed with which data is delivered), timeliness of updates (which can be a significant consideration if the underlying data changes frequently), software complexity and maintenance.
In many cases a mixture of technologies can be used together to find a good compromise of quality or performance and cost. The strengths of various approaches can be applied to the part of the publishing 'spectrum' that suits them best. For example, if using a Linked Data approach, one option is to keep all data in a triple store; but hybrid approaches are also possible, for example where geometrical information is stored and served from flat files, or where non-geometrical data and metadata is stored in a triple store and used to generate Web pages and machine-readable descriptions of Spatial Things, while geometrical data is indexed by software such as Lucene Spatial, PostGIS or Elasticsearch. Use of shared Web-accessible identifiers for Spatial Things can help support the interconnections between a range of diverse information systems.
[EO-QB] describes a 'spectrum of linkiness' for coverage data. At one end of the spectrum, you can assign each individual data point or pixel within a coverage (such as a satellite image) an individual identifier and web page. At the other, you can link just to an entire dataset and provide metadata for that. An intermediate approach involves dividing the data into tiles, each of which can have its own identifier and metadata. The balance of quality and cost in this example corresponds to the size of tiles that can be individually referenced, described and retrieved.
Check if spatial data is encoded so that it can be understood and re-used reliably.
Consider the main target audience or audiences of a web page or service, and check if spatial information is provided in a way appropriate for that audience.
Relevant requirements: R-DeterminableCRS, R-Discoverability, R-GeoreferencedData, R-Linkability, R-MachineToMachine, R-SpatialRelationships.
Location information is a common constituent of spatial data and can be an important 'hook' for finding information and for integrating different datasets. There are different ways of describing the location of Spatial Things. You can use and/or refer to the name of a well-known named place, provide position coordinates in a geometry or describe one location relative to another location. Providing multiple representations, i.e. several geometries for one Spatial Thing, can also be helpful, allowing data users to choose the one that fits their use case. This generally requires each geometry to be represented as a structured object that includes not only coordinates of the positions defining the geometry but also an identifier and other properties that describe its specific characteristics. It is especially important to choose the coordinate reference system with care and indicate it clearly for each geometry.
Geometry data should be expressed in a way that allows its publication and use on the Web.
The geospatial, Linked Data, and Web communities use different geometry formats and tools, which reflect different requirements with respect to data complexity and manipulation.
When deciding how a geometry should be described, it is therefore necessary to consider the intended uses and the related user communities. This may also imply providing alternative geometry descriptions.
This best practice helps with choosing the right format for describing geometries, based on aspects like intended use(s), performance, and tool support. It also helps with deciding when encoding geometries as literals rather than as structured objects is a useful simplification.
This best practice is closely related to Best Practice 6: Provide geometries at the right level of accuracy, precision, and size, Best Practice 7: Choose coordinate reference systems to suit your user's applications, and Best Practice 8: State how coordinate values are encoded, to which we refer the reader for more information.
The format chosen to express geometry data should:
Ideally, to enable their widest re-use, geometries should be described with the geospatial, Linked Data and Web communities in mind. This may not always be feasible, but the objective should at least be to describe geometries (also) for Web consumption.
Steps to follow:
HTTP content negotiation only works for media type, character set, encoding, and language. Consequently, it is not possible to select one representation that conforms to a given "profile" (e.g. data model, complexity level, CRS) from several that all share the same media type; e.g. asking for the GeoJSON [RFC7946] features with "simple" geometries (compacted polygons or just points) not the "complex" geometries; or asking for the representation that uses CRS84 not Amersfoort-RD.
It is important to note that the steps outlined above are interrelated. For instance, the dimensionality of a geometry determines the set of coordinate reference systems that can be used, as well as the geometry encodings / representations.
Another issue to be considered when choosing the geometry format is whether the coordinate axis order is unambiguous, i.e. whether the order of the position coordinates defining each geometry is, e.g., longitude/latitude or latitude/longitude. This specific topic is covered by Best Practice 8: State how coordinate values are encoded.
Multiple formats exist for representing geometries (and some of them are listed in A. Applicability of common formats to implementation of best practices). It is important to distinguish between the structured geometry object itself and the list of two or more position coordinates that places that geometry in space and is typically the most voluminous part of geometry data. Another of the issues to be considered when choosing the format(s) to be supported is where and when to use literals or structured object formats.
that are used extensively for describing w3cgeo:Point objects.
Currently, there are two reference geometry formats widely used in the geospatial and Web communities, respectively: [GML] and GeoJSON [RFC7946].
[GML] provides the ability to express any type of geometry, in any coordinate reference system, and up to 3 dimensions (from points to solids), but is typically serialized in XML [XML11].
GeoJSON [RFC7946] supports only one coordinate reference system (CRS84, i.e. WGS 84 longitude/latitude) and geometries up to 2 dimensions (points, lines, surfaces), but is serialized in JSON [RFC7159], which is often easier for browser-based Web applications to process.
To facilitate the use of geometry data on the Web as well as in GIS, it is desirable that complex [GML]-encoded geometries also be made available in simplified form as GeoJSON [RFC7946], by applying any required coordinate reference system transformation, as well as simplifying and generalizing the original geometry as needed (e.g., by transforming a 3D geometry into a 2D one). Simplified geometries may of course also be published in [GML], for example by conforming to the GML Simple Feature profile [GML-SF]. (On this topic, see Best Practice 6: Provide geometries at the right level of accuracy, precision, and size).
Another approach to publishing geometries on the Web is to embed them directly in Web pages. This is, for instance, the approach used by [SCHEMA-ORG], which defines several terms to specify them (see Best Practice 2: Make your spatial data indexable by search engines for more information).
Typically, this is used just for 0D-2D geometries (points, lines, surfaces). Detailed and complex geometries cannot be published with this methodology, so in this case too only a very simplified representation of the original geometry can be published, e.g. the centroid and/or 2D bounding box. (On this topic, see Best Practice 6: Provide geometries at the right level of accuracy, precision, and size).
Finally, RDF-based representations of geometries are used in the Linked Data community. This is achieved by using specific vocabularies, such as [W3C-BASIC-GEO] (only for points), [GeoRSS] (points, lines, boxes, circles, polygons) or [GeoSPARQL] (for any simple features geometries). For a high-level comparison of common spatial data vocabularies, see A. Applicability of common formats to implementation of best practices.
These geometry representations are either stored with the related data or maintained separately, and possibly denoted with HTTP URIs (see Example 22).
RDF representations of geometries can support most geometry types and dimensions (up to at least 2 dimensions), with any level of complexity, in any coordinate reference system. On the other hand, many existing Semantic Web tools such as triple stores are currently not efficient enough to perform spatial queries which are complex and/or on complex geometries. It may therefore be preferable to maintain geometries separately, in software platforms designed for these specific tasks.
It is nonetheless still desirable to make simplified geometries available for Web consumption in GeoJSON [RFC7946] or embedded in Web pages.
In the following text and example, property locn:geometry has been replaced with the more specific property dcat:bbox, defined in [VOCAB-DCAT-2], and adopted in [GeoDCAT-AP-20201223].
Moreover, the original GeoJSON datatype URI used in the example (corresponding to the GeoJSON IANA Media Type URL) has been replaced with geosparql:geoJSONLiteral, included in the draft of the new version of [GeoSPARQL] (see issues opengeospatial/ogc-geosparql/issues/1 and opengeospatial/ogc-geosparql/issues/48), already adopted in [GeoDCAT-AP-20201223].
The following [TURTLE] snippet shows the [GeoDCAT-AP] representation of the dataset in Example 12. Here the bounding box is provided in multiple literal encodings (WKT, [GML], GeoJSON [RFC7946]), by using property dcat:bbox.
@prefix dcat: <> .
@prefix dcterms: <> .
@prefix geosparql: <> .
@prefix locn: <> .

<> a dcat:Dataset ;
  dcterms:title "Adressen"@nl ;
  dcterms:title "Addresses"@en ;
  dcterms:description "INSPIRE Adressen afkomstig uit de basisregistratie Adressen, beschikbaar voor heel Nederland"@nl ;
  dcterms:description "INSPIRE addresses derived from the Addresses base registry, available for the Netherlands"@en ;
  dcterms:isPartOf <> ;
  dcat:theme <> ;
  dcterms:spatial [ a dcterms:Location ;
    dcat:bbox
      # Bounding box in WKT
      "POLYGON((3.053 47.975,7.24 47.975,7.24 53.504,3.053 53.504,3.053 47.975))"^^geosparql:wktLiteral ,
      # Bounding box in GML
      "<gml:Envelope srsName=\"\"><gml:lowerCorner>3.053 47.975</gml:lowerCorner><gml:upperCorner>7.24 53.504</gml:upperCorner></gml:Envelope>"^^geosparql:gmlLiteral ,
      # Bounding box in GeoJSON
      "{ \"type\":\"Polygon\",\"coordinates\":[[ [3.053,47.975],[7.24,47.975],[7.24,53.504],[3.053,53.504],[3.053,47.975] ]] }"^^geosparql:geoJSONLiteral
  ] .
In the above example, the coordinate reference system used for the bounding box is CRS84 (equivalent to WGS 84, but with coordinate axis-order longitude/latitude), which is explicitly specified in the [GML] encoding via attribute @srsName, and by using the relevant HTTP URI from the OGC CRS registry. The coordinate reference system is not specified for the WKT encoding, since CRS84 is the default coordinate reference system for WKT in [GeoSPARQL], and therefore it can be omitted. The coordinate reference system is also not specified in the GeoJSON [RFC7946] encoding, since CRS84 is the only supported coordinate reference system in GeoJSON [RFC7946]. From GeoSPARQL 1.1 onwards, coordinate reference systems of serializations of the same geometry are required to be the same if the serialization format supports the representation of the general coordinate reference system. In future iterations of GeoSPARQL, the coordinate reference system is likely to be represented completely in RDF.
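As a minimal sketch of how one CRS84 bounding box can be serialized in more than one literal encoding, the following Python builds the WKT and GeoJSON forms from the corner values used in the example above. The function names (`bbox_ring`, `bbox_to_wkt`, `bbox_to_geojson`) are illustrative only and do not come from any vocabulary or library.

```python
import json


def bbox_ring(west, south, east, north):
    """Return the closed ring of a bounding box in longitude/latitude order."""
    return [(west, south), (east, south), (east, north), (west, north), (west, south)]


def bbox_to_wkt(west, south, east, north):
    """Serialize the bounding box as a WKT POLYGON (CRS84 axis order assumed)."""
    ring = ",".join(f"{x} {y}" for x, y in bbox_ring(west, south, east, north))
    return f"POLYGON(({ring}))"


def bbox_to_geojson(west, south, east, north):
    """Serialize the bounding box as a GeoJSON Polygon geometry object."""
    coords = [[x, y] for x, y in bbox_ring(west, south, east, north)]
    return json.dumps({"type": "Polygon", "coordinates": [coords]})


# Corner values taken from the bounding box in the example above.
wkt = bbox_to_wkt(3.053, 47.975, 7.24, 53.504)
geojson = bbox_to_geojson(3.053, 47.975, 7.24, 53.504)
```

Both encodings are derived from the same ring, which keeps the alternative literals consistent with each other.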
Again with reference to Example 12, the following snippet shows the [GML] and the RDF [RDF11-PRIMER] representations of the entry in the BAG Dutch register concerning the building where Anne Frank's house is located. For the corresponding GeoJSON [RFC7946] representation, see the relevant example in Best Practice 8: State how coordinate values are encoded.
A [GML] representation of Anne Frank's house building:
<bag:pand gml:id="pand.3323294">
  <bag:identificatie>363100012169587</bag:identificatie>
  <bag:bouwjaar>1635</bag:bouwjaar>
  <bag:status>Pand in gebruik (niet ingemeten)</bag:status>
  <bag:gebruiksdoel>woonfunctie</bag:gebruiksdoel>
  <bag:oppervlakte_min>1</bag:oppervlakte_min>
  <bag:oppervlakte_max>21</bag:oppervlakte_max>
  <bag:aantal_verblijfsobjecten>20</bag:aantal_verblijfsobjecten>
  <bag:geometrie>
    <gml:MultiSurface srsDimension="2" axisLabels="east north" srsName="urn:ogc:def:crs:EPSG::28992">
      <gml:surfaceMember>
        <gml:Polygon srsDimension="2">
          <gml:exterior>
            <gml:LinearRing>
              <gml:posList> 120749.725 487589.422 120752.55 487594.375 120751.227 487595.129 120732.539 487605.788 120723.505 487589.745 120721.387 487585.939 120740.668 487575.07 120743.316 487573.589 120747.735 487581.337 120751.564 487579.154 120755.411 487576.96 120750.935 487569.172 120755.941 487566.288 120764.369 487581.066 120749.725 487589.422</gml:posList>
            </gml:LinearRing>
          </gml:exterior>
        </gml:Polygon>
      </gml:surfaceMember>
    </gml:MultiSurface>
  </bag:geometrie>
</bag:pand>
The corresponding RDF representation is provided in the following [TURTLE] snippet (taken from the BAG Linked Data service). NB: The RDF representation below has been complemented with additional properties (marked with # Added) for demonstration purposes.
@prefix bag: <> .
@prefix dcterms: <> .
@prefix geosparql: <> .
@prefix gml-ont: <> .
@prefix locn: <> .
@prefix rdfs: <> .
@prefix schema: <> .
@prefix w3cgeo: <> .

<> a geosparql:Feature, bag:Pand ;
  rdfs:label "Pand 0363100012169587"@nl ;
  rdfs:isDefinedBy <> ;
  bag:identificatiecode "0363100012169587"^^xsd:string ;
  # Added
  dcterms:identifier "363100012169587"^^xsd:string ;
  bag:status <> ;
  bag:oorspronkelijkBouwjaar "1635"^^xsd:gYear ;
  # Added
  dcterms:created "1635"^^xsd:gYear ;
  # Added
  locn:address <> ;
  geosparql:hasGeometry <> ;
  bag:geometriePand <> .

<> a geosparql:Geometry, gml-ont:Surface ;
  geosparql:asWKT "POLYGON (( 4.8842353 52.375108 , 4.884276 52.375153 , 4.8842567 52.375159 , 4.883981 52.375254 , 4.8838502 52.375109 , 4.883819 52.375075 , 4.8841037 52.374979 , 4.884143 52.374965 , 4.8842069 52.375035 , 4.884263 52.375016 , 4.8843200 52.374996 , 4.884255 52.374926 , 4.8843289 52.374901 , 4.884451 52.375034 , 4.8842353 52.375108 ))"^^geosparql:wktLiteral ;
  # Added
  geosparql:asWKT "<> POLYGON (( 120749.725 487589.422 , 120752.55 487594.375 , 120751.227 487595.129 , 120732.539 487605.788 , 120723.505 487589.745 , 120721.387 487585.939 , 120740.668 487575.07 , 120743.316 487573.589 , 120747.735 487581.337 , 120751.564 487579.154 , 120755.411 487576.96 , 120750.935 487569.172 , 120755.941 487566.288 , 120764.369 487581.066 , 120749.725 487589.422 ))"^^geosparql:wktLiteral .
The different WKT encodings in the example show alternative ways of specifying the coordinate reference system used.
The two instances of property geosparql:asWKT follow the syntax recommended in [GeoSPARQL], where the specification of the coordinate reference system is required only if different from CRS84. The coordinate axis-order used is determined here by the coordinate reference system, and in both cases, it is longitude / latitude (more precisely, east/north for EPSG:28992).
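The rule that a WKT literal may start with an optional CRS URI in angle brackets, and otherwise defaults to CRS84, can be sketched as follows. This is a simplified illustration, not a full GeoSPARQL parser; the helper name `split_wkt_literal` is hypothetical, and the EPSG URI in the usage lines is an assumed example of the OGC URI pattern rather than a value taken from the snippet above.

```python
import re

# CRS assumed when a geosparql:wktLiteral carries no explicit CRS URI.
CRS84_URI = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<uri>" followed by whitespace, then the WKT text itself.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]*)>\s+)?(.*\S)\s*$", re.DOTALL)


def split_wkt_literal(lexical):
    """Split a wktLiteral lexical form into (crs_uri, wkt_text)."""
    match = _WKT_LITERAL.match(lexical)
    if match is None:
        raise ValueError("empty WKT literal")
    crs_uri, wkt = match.group(1), match.group(2)
    # Only fall back to CRS84 when no URI part is present at all.
    return (crs_uri if crs_uri is not None else CRS84_URI, wkt)


default_crs = split_wkt_literal("POINT(4.88412 52.37509)")
explicit_crs = split_wkt_literal(
    "<http://www.opengis.net/def/crs/EPSG/0/28992> POINT(120749.725 487589.422)"
)
```

Keeping the CRS alongside the coordinates in this way lets a consumer decide whether a transformation is needed before combining geometries.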
Example 21 also shows how geometries for Spatial Things can be published as separate Web resources. This approach can be particularly suitable for giving access to huge geometries, consisting of hundreds of vertex positions (such as the detailed geometry of the boundaries of a geographical region), without attaching them to the relevant Spatial Things. Moreover, this allows the same geometry to be linked from (i.e., re-used by) different Spatial Things. Finally, it is possible to use mechanisms (including HTTP content negotiation) to provide access to different representations / encodings of the geometry ([GML], WKT, GeoJSON [RFC7946], etc.) as media types, thus addressing different use cases. (On this topic, see also Best Practice 10: Use appropriate relation types to link Spatial Things).
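The content negotiation mentioned above can be sketched, under simplifying assumptions, as a function that picks one of the available geometry encodings from an HTTP Accept header. The function name `choose_media_type` and the list of offered media types are illustrative; the sketch honours q-values and wildcards but ignores the finer precedence rules of real HTTP servers.

```python
def choose_media_type(accept_header, available):
    """Pick the available media type with the highest q-value, or None."""
    preferences = []
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((media_type, quality))

    best, best_quality = None, 0.0
    for candidate in available:
        main_type = candidate.split("/")[0]
        for media_type, quality in preferences:
            matches = media_type in (candidate, main_type + "/*", "*/*")
            if matches and quality > best_quality:
                best, best_quality = candidate, quality
    return best


# Encodings a geometry resource might offer (assumed for illustration).
OFFERED = ["application/geo+json", "application/gml+xml", "text/turtle"]
chosen = choose_media_type("application/geo+json;q=0.9, application/gml+xml", OFFERED)
```

As noted earlier in this best practice, this mechanism distinguishes media types only; it cannot select between two representations that share a media type but differ in CRS or level of detail.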
The following URI:
denotes an office of the Dutch Kadaster in Apeldoorn. However, its geometry is provided as two separate, standalone resources, denoted by the following URIs:
An additional example is the API of the GADM-RDF project, which provides access to spatial linked data concerning administrative areas. For instance, the following URI
returns a description of administrative area "Germany", which links to the geometry of Germany's boundaries, provided via a separate URI:
Dereferencing the geometry URIs operated by the GADM-RDF API returns different geometry representations / encodings (SVG included), which can be accessed via HTTP content negotiation or by appending the format extension to the URI. For instance, URI
returns the GeoJSON [RFC7946] representation of the geometry. Direct links to the supported geometry representations / encodings are specified in the RDF and HTML representations of the geometry.
Check if:
Relevant requirements: R-MultipleCRSs, R-BoundingBoxCentroid, R-Compressible, R-CRSDefinition, R-EncodingForVectorGeometry, R-IndependenceOnReferenceSystems, R-MachineToMachine, R-SpatialMetadata, R-3DSupport, R-TimeDependentCRS, R-TilingSupport.
Geometry data should be provided at levels of accuracy, precision, and size fit for their use on the Web.
Geometry data always provide an approximate description of the shape and extent of Spatial Things, which is fit for specific uses. For instance, portraying a geometry on a Web map would typically not require the level of detail that is needed for using the same geometry for spatial analysis. Moreover, although a 3D description of the geometry of a building might be available, a Web map would typically be capable of portraying just its 2-dimensional footprint.
Other issues to be taken into account are network bandwidth and the processing capabilities of the target tools. For instance, a geometry with a total size of 1GB or more could be transmitted more efficiently after being compressed. On the other hand, a tool with limited processing capabilities (such as a Web browser) may not be able to handle such a geometry efficiently (e.g., for displaying it on a Web map).
This best practice complements Best Practice 5: Provide geometries on the Web in a usable way by outlining some of the approaches that can be used to publish alternative versions of geometry data, with respect to the level of accuracy, precision, and size, fit for the most general use cases and the reference target communities.
This best practice is not meant to provide detailed guidelines on which is the right level of accuracy and precision, file size, or geometry simplification for different use cases. For more on these topics, see Best Practice 13: Expose spatial data through 'convenience APIs' and Best Practice 16: Describe the positional accuracy of spatial data.
Geometry data should be made available at (possibly different) levels of accuracy, precision, and size, taking into account:
As said in Best Practice 5: Provide geometries on the Web in a usable way, the requirements of the geospatial, Linked Data, and Web communities should ideally also be taken into account with respect to the accuracy, precision, and size of geometry data. Whenever this is not feasible, Web consumption requirements should at least be addressed.
A number of techniques can be used to deliver representations of geometries at an accuracy, precision, and size fitting the requirements of a given use case.
The following list, although not exhaustive, outlines the approaches most widely used, especially for the Web delivery and consumption of geometry data.
Choosing the right technique requires taking primarily into account whether the derived geometry is fit for the target use case. Technical limits (such as network bandwidth and processing capabilities) are of course important, but secondary. The ideal situation is when you are able to find the technique offering the right trade-off between these two types of requirements.
Whatever option is used, the key requirement is that the derived geometries do not replace the original ones, but are made available as alternative representations.
Best Practice 5: Provide geometries on the Web in a usable way, Best Practice 16: Describe the positional accuracy of spatial data and Best Practice 13: Expose spatial data through 'convenience APIs' provide general guidelines that can be used for the publication of alternative representations of geometries, providing at the same time information on their characteristics. These include, but are not limited to, the use of different URIs for different representations, and HTTP content negotiation. Moreover, whenever geometry is made available in RDF [RDF11-PRIMER], specific properties can be used to specify the geometry type and the level of accuracy and precision. More specific examples are included in the approaches described below.
Compress geometry data
Using standard compression algorithms, such as zip and gzip, addresses the issue of efficient transmission of geometry data without information loss. Notably, some formats come with alternative compressed encodings, e.g. KMZ is used to deliver compressed [KML] data.
Compression can be easily carried out on the fly, and it is also supported by the HTTP protocol via content negotiation; see [RFC2616], section 3.5: Content Codings.
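As a rough illustration of lossless, on-the-fly compression, the following Python sketch gzips a GeoJSON feature and restores it unchanged. The line geometry is synthetic test data generated only to have something repetitive to compress; the actual saving depends on the data.

```python
import gzip
import json

# Synthetic LineString with 1000 positions (illustrative data only).
positions = [[4.0 + i * 0.0001, 52.0 + (i % 7) * 0.0001] for i in range(1000)]
feature = {
    "type": "Feature",
    "properties": {},
    "geometry": {"type": "LineString", "coordinates": positions},
}

raw = json.dumps(feature).encode("utf-8")
compressed = gzip.compress(raw)

# Decompression gives back exactly the same feature: nothing is lost.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
ratio = len(compressed) / len(raw)
```

In an HTTP setting the same bytes would typically be sent with a gzip content coding, leaving the media type of the geometry representation unchanged.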
Use formats optimizing access to and processing of geometry data
Some formats support a more compact description of geometry data, which potentially results in reducing network bandwidth consumption and/or more efficient client-side processing.
This is, for instance, the case of TopoJSON, an extension to GeoJSON [RFC7946] which reduces redundancy in the description of a geometry, by splitting it into segments (referred to as "arcs") that can be re-used.
To achieve the same results, other formats are designed to enable the stream-based delivery of geometry data. For instance, GeoJSON Text Sequences [RFC8142] is a format designed to optimize access and processing of GeoJSON [RFC7946] data, by enabling a client application to use the received data even before the transmission is completed.
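The streaming idea behind GeoJSON Text Sequences can be sketched as follows: each JSON text is preceded by a record separator character (U+001E) and followed by a line feed, so a client can parse features as chunks arrive. The function names are illustrative, and the sketch leaves out the handling of truncated or malformed records that a production parser would need.

```python
import json

RS = "\x1e"  # record separator placed before every JSON text


def encode_geojson_seq(features):
    """Serialize features as a sequence of RS-prefixed, LF-terminated texts."""
    return "".join(f"{RS}{json.dumps(feature)}\n" for feature in features)


def iter_geojson_seq(chunks):
    """Yield each feature as soon as the start of the next record is seen."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Everything before the last RS is complete; the rest may be partial.
        *complete, buffer = buffer.split(RS)
        for record in complete:
            if record.strip():
                yield json.loads(record)
    if buffer.strip():
        yield json.loads(buffer)  # final record once the stream has ended


features = [
    {"type": "Feature", "properties": {"n": 1},
     "geometry": {"type": "Point", "coordinates": [4.88412, 52.37509]}},
    {"type": "Feature", "properties": {"n": 2},
     "geometry": {"type": "Point", "coordinates": [3.053, 47.975]}},
]
text = encode_geojson_seq(features)
# Simulate network delivery in arbitrary 7-character chunks.
received = list(iter_geojson_seq(text[i:i + 7] for i in range(0, len(text), 7)))
```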
Another approach, focused on efficient client-side processing, is GeoJSON-VT, a library that enables a client to create on-the-fly vector tiles from GeoJSON [RFC7946] data.
Finally, Geohash provides a compact way of encoding 0-dimensional geometries (points), which, at the same time, can be used for spatial indexing.
The point coordinates of the address of Anne Frank's House (see Example 12) can be encoded with Geohash as u173zns7thy (corresponding to the following WGS 84 lat/long coordinates: 52.37520
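A minimal sketch of the Geohash scheme is given below: longitude and latitude intervals are halved alternately, and every five resulting bits are mapped to one base-32 character. The function names are illustrative rather than taken from a particular library. Decoding returns the centre of the cell, so a value is only known to within the cell size.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, length=11):
    """Encode a WGS 84 latitude/longitude pair as a Geohash string."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    chars, bits, value, use_lon = [], 0, 0, True
    while len(chars) < length:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half
        else:
            value <<= 1
            rng[1] = mid  # keep the lower half
        use_lon = not use_lon
        bits += 1
        if bits == 5:
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def geohash_decode(code):
    """Return the (lat, lon) centre of the cell identified by a Geohash."""
    lat_range, lon_range = [-90.0, 90.0], [-180.0, 180.0]
    use_lon = True
    for char in code:
        value = _BASE32.index(char)
        for shift in range(4, -1, -1):
            rng = lon_range if use_lon else lat_range
            mid = (rng[0] + rng[1]) / 2
            if (value >> shift) & 1:
                rng[0] = mid
            else:
                rng[1] = mid
            use_lon = not use_lon
    return (lat_range[0] + lat_range[1]) / 2, (lon_range[0] + lon_range[1]) / 2


lat, lon = geohash_decode("u173zns7thy")
```

Shortening the string yields a larger enclosing cell, which is why a common prefix can serve as a coarse spatial index.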
Provide geometries at different levels of generalization
Generalization is a traditional technique used in spatial data (first of all, in cartography) to reduce the precision and/or accuracy of a geometry for specific purposes. A typical example is provided by how geometries are portrayed in maps of different scales: for instance, a large-scale map can depict the width of a road (2-dimensional geometry), whereas, at lower scales, the same road can be shown as a line with zero width (1-dimensional geometry).
Providing geometries at different scales or resolutions is actually one of the first criteria to be considered for addressing different use cases. This is common practice in the geospatial domain, especially, but not only, for reference data. For instance, the dataset of the Nomenclature of Territorial Units for Statistics (NUTS) of the European Union is made available at five different scales, ranging from 1:1,000,000 to 1:60,000,000.
The GADM-RDF project provides access to geometries of administrative areas at a resolution of 100m, 1km, 10km, and 100km. Each of these variants is associated with a different HTTP URI, and geometry is made available in different formats. For instance, the geometry of Germany at 100m resolution is denoted by the following URI
, whereas the variant at 100km resolution is available from the following URI:
(see also Example 22).
Scale reduction uses a number of generalization techniques that can also be used outside this specific use case in order to provide geometries at different levels of accuracy and precision.
These techniques include the following:
The precision with which coordinate positions are reported often does not reflect the accuracy of the measurement. For example, latitude and longitude reported to six decimal places corresponds to a precision of around 10cm on the ground (one degree of latitude spans roughly 111km). GPS-enabled consumer devices are accurate to within a few meters; centimeter accuracy can only be achieved with professional equipment. Yet a lot of software defaults to six, seven or even more decimal places when expressing coordinate positions, which may mislead users into thinking that the data is more accurate than it actually is!
See Best Practice 16: Describe the positional accuracy of spatial data for a discussion on precision and accuracy.
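The relationship between decimal places and ground distance can be estimated with a short calculation. The sketch below assumes a round figure of 111 km per degree of latitude, which is an approximation rather than a geodetic constant, and scales the east-west distance by the cosine of the latitude; the function name is illustrative.

```python
import math

# Approximate length of one degree of latitude (assumed round figure).
METERS_PER_DEGREE = 111_000.0


def ground_precision_m(decimal_places, latitude_deg=0.0):
    """Approximate (north-south, east-west) size in meters of one unit
    in the last reported decimal place of a lat/long coordinate."""
    step_deg = 10.0 ** -decimal_places
    north_south = step_deg * METERS_PER_DEGREE
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Ground precision per number of decimal places at the latitude of Amsterdam.
table = {places: ground_precision_m(places, 52.375) for places in range(3, 8)}
```

A publisher can use such an estimate to truncate coordinates to the number of decimal places that the underlying measurement accuracy actually supports.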
Provide the centroid and bounding box of a geometry
Centroids and bounding boxes are another example of how a geometry can be generalized, but they serve different purposes. More precisely, a centroid is meant to specify the position of a Spatial Thing by converting its actual geometry to a point, corresponding to its center. On the other hand, a bounding box provides a simplified description of the maximum extent of a Spatial Thing.
Although both these generalization methodologies result in a high level of information loss with respect to the original geometry, they play an important role in spatial analysis because of the topological information they provide. Moreover, centroids and bounding boxes could provide an accurate enough description of a geometry for those use cases where, respectively, the extent or precise shape of a Spatial Thing is not relevant. Finally, they are also widely used outside the geospatial domain.
Computation of centroids and bounding boxes is supported by all GIS tools and Web mapping libraries, which makes it possible to carry it out on the fly. However, performing this operation client-side can be extremely inefficient if the target tool has limited processing capabilities.
This issue can be addressed by providing access to centroids and bounding boxes as alternative representations of a given geometry.
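As a sketch of how a publisher might precompute these alternative representations, the following Python derives a bounding box and an area-weighted centroid from the CRS84 footprint given in the WKT example earlier in this section. Treating longitude/latitude as planar coordinates is an assumption that is reasonable only for small shapes such as a building; the function names are illustrative.

```python
def bounding_box(ring):
    """Return (min_x, min_y, max_x, max_y) of a ring of (x, y) positions."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return min(xs), min(ys), max(xs), max(ys)


def centroid(ring):
    """Area-weighted centroid of a closed ring (shoelace formula)."""
    # Shift to the first vertex to limit floating-point cancellation.
    x0, y0 = ring[0]
    pts = [(x - x0, y - y0) for x, y in ring]
    area2 = cx = cy = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:]):
        cross = x1 * y2 - x2 * y1
        area2 += cross  # accumulates twice the signed area
        cx += (x1 + x2) * cross
        cy += (y1 + y2) * cross
    if area2 == 0:
        raise ValueError("degenerate ring with zero area")
    return x0 + cx / (3 * area2), y0 + cy / (3 * area2)


# Footprint in longitude/latitude order, copied from the WKT example.
footprint = [
    (4.8842353, 52.375108), (4.884276, 52.375153), (4.8842567, 52.375159),
    (4.883981, 52.375254), (4.8838502, 52.375109), (4.883819, 52.375075),
    (4.8841037, 52.374979), (4.884143, 52.374965), (4.8842069, 52.375035),
    (4.884263, 52.375016), (4.8843200, 52.374996), (4.884255, 52.374926),
    (4.8843289, 52.374901), (4.884451, 52.375034), (4.8842353, 52.375108),
]
bbox = bounding_box(footprint)
center = centroid(footprint)
```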
In the following [TURTLE] snippet, [W3C-BASIC-GEO] and [GeoRSS] are used to specify, respectively, the centroid (w3cgeo:lat and w3cgeo:long) and bounding box (georss:box) of the 2-dimensional footprint of the building hosting Anne Frank's Museum (see Example 21).
@prefix bag: <> .
@prefix georss: <> .
@prefix geosparql: <> .
@prefix rdfs: <> .
@prefix w3cgeo: <> .

<> a geosparql:Feature, bag:Pand ;
  rdfs:label "Pand 0363100012169587"@nl ;
  # Detailed geometry
  geosparql:hasGeometry <> ;
  bag:geometriePand <> ;
  # Centroid
  w3cgeo:lat "52.37509"^^xsd:float ;
  w3cgeo:long "4.88412"^^xsd:float ;
  # Bounding box
  georss:box "52.3749,4.8838 52.3753,4.8845"^^xsd:string ;
.
Added the following example on the use of [VOCAB-DCAT-2] for the specification of centroids and bounding boxes.
The same example can be written as follows, using the relevant properties from [VOCAB-DCAT-2] (dcat:centroid and dcat:bbox):
@prefix bag: <> .
@prefix dcat: <> .
@prefix geosparql: <> .
@prefix rdfs: <> .

<> a geosparql:Feature, bag:Pand ;
  rdfs:label "Pand 0363100012169587"@nl ;
  # Detailed geometry
  ...
  # Centroid
  dcat:centroid "POINT(4.88412 52.37509)"^^geosparql:wktLiteral ;
  # Bounding box
  dcat:bbox "POLYGON((4.8838 52.3749,4.8838 52.3753,4.8845 52.3753,4.8845 52.3749,4.8838 52.3749))"^^geosparql:wktLiteral ;
.
Check if:
Relevant requirements: R-BoundingBoxCentroid, R-Compatibility, R-Compressible, R-CoordinatePrecision,
Consider your users' intended applications when choosing the coordinate reference system(s) used to publish spatial data.
A multitude of coordinate reference systems exist because there is no perfect solution to meet all requirements:
The Earth is a complicated shape (neither spherical nor flat!):
For each (Earth-based) coordinate reference system, the topographical surface of the Earth is approximated to a geodetic datum that is described using an ellipsoid. The trouble with approximation is that nothing is perfect everywhere, which means that compromise is inevitable. Some datums, like WGS 84, provide a reasonable (but not highly accurate) fit everywhere on the Earth, while other datums (such as the European Terrestrial Reference System 1989, as used by ETRS89 / EPSG:4258) provide a better fit in a given region at the expense of accuracy elsewhere.
Spatial data is often projected from the curved surface of the Earth onto a flat plane (e.g. a computer screen or a topographical map) to make it easier to compute distances between positions and calculate areas. There are many choices of projection (e.g. equirectangular, Mercator, stereographic, orthographic etc.), each of which is designed for particular tasks. As with datums, projections are often chosen to better support regional, national or local needs.
It is also worth noting that as a living planet, the Earth continues to change its shape; for example, continental drift moves Australia north-eastwards several centimeters each year and New Zealand shifts in multiple directions. To retain accuracy, datums need to be adjusted from time to time, as is the case for the New Zealand Geodetic Datum (NZGD2000), which is frequently revised to take account of earth deformations.
Geodesists refer to those coordinate reference systems which are continually updated so that the same set of coordinates refers to the same place on (or near) the surface of the earth as 'plate-fixed' or 'static coordinate reference systems'; others (such as WGS 84) are 'earth-fixed' or 'dynamic' (a set of coordinates will not resolve to the same spot on the moving surface of the earth).
If your intended application requires a combination of positional accuracy and persistence over time, then use a static coordinate reference system or ensure that each set of coordinates is accompanied by the date it was measured.
See for further explanation.
Sometimes we don't want to measure relative to the surface of the Earth at all:
Spatial data such as descriptions of the built environment, geological surveys, satellite imagery, etc. are often captured and stored in an engineering coordinate reference system as measurements from a local datum. For example, XY survey coordinates relative to a building corner, pixel positions within the image swath of a satellite camera, or distance along a line from a fixed origin point.
Although it is possible to convert coordinates from one CRS to another, many users will be put off by the need to do so. Furthermore, each such transformation is a point where errors can be introduced into the spatial data, especially where users have limited expertise with spatial data.
When publishing spatial data, it is best to spare users the need to transform spatial data between coordinate reference systems themselves by providing data in a form, or forms, which they can use directly. To determine which coordinate reference system(s) are needed, data publishers must consider the intended applications of their user community.
Spatial data is provided in a coordinate reference system, or systems, that are sensitive to the needs of users' intended applications.
Most of a publisher's anticipated user community do not need to transform coordinate values prior to using the spatial data.
Whichever coordinate reference system is chosen for the publication of spatial data, it is imperative that the choice is made clear to users. Please refer to Best Practice 8: State how coordinate values are encoded for further details.
The first thing that publishers of spatial data need to do is consider their audience.
When publishing spatial data on the Web, the largest community of potential users will be unknown: anyone might find and use data published on the Web! To support this unanticipated reuse, we recommend always publishing your spatial data using a global coordinate reference system which allows spatial data from multiple sources to be readily combined for display or computation. For geospatial data with point, line or polygon geometries (i.e. vector data), WGS 84 Lat/Long (EPSG:4326) or WGS 84 Lat/Long/Elevation (EPSG:4979) are good choices as many of the tools and applications used by Web developers are set up to use data from GPS-enabled mobile devices that all use WGS 84. Where you have geo-imagery (i.e. raster data, comprised of a rectangular pattern of pixels on a flat plane) it is best to use Web Mercator (EPSG:3857) which has near-global extent.
Data publishers should be aware that the geodetic datum used by Web Mercator is spherical and not true to the shape of the earth. At high latitudes, this results in positional differences of up to 20 kilometers when compared with WGS 84. However, many Web-mapping tools transparently perform the necessary transformations to ensure that geospatial vector data is correctly plotted on the underlying base map.
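For illustration, the spherical Web Mercator projection can be written in a few lines. This sketch uses the standard spherical formulas with the WGS 84 semi-major axis as the sphere radius; the function names are illustrative, and for production use an established library such as those listed later in this best practice is the safer choice.

```python
import math

# Sphere radius used by Web Mercator (equal to the WGS 84 semi-major axis).
EARTH_RADIUS_M = 6378137.0


def wgs84_to_web_mercator(lon_deg, lat_deg):
    """Project WGS 84 longitude/latitude (degrees) to Web Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def web_mercator_to_wgs84(x, y):
    """Inverse projection back to longitude/latitude in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Round trip for a point in Amsterdam (coordinates assumed for illustration).
x, y = wgs84_to_web_mercator(4.88412, 52.37509)
lon, lat = web_mercator_to_wgs84(x, y)
```

Because the latitude is treated as if it were measured on a sphere, the northing differs from that of an ellipsoidal Mercator projection, which is the source of the positional differences described above.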
Where considerations of the known user community (or communities) call for different coordinate reference systems, we recommend publishing spatial data in multiple representations: one for each of the prioritized coordinate reference systems. Clearly, the number of representations provided needs to be determined with respect to the associated effort. However, remember that a decision not to publish data in a priority CRS will result in each member of your user community needing to do that task, or in them not using your data.
Common reasons for needing to publish in additional coordinate reference systems include:
publication through government data portals that require use of a projected CRS defined by the national mapping agency, and similar legislative requirements;
The Basisregistraties Adressen en Gebouwen (BAG), or Basic Registers for Addresses and Buildings, provided by Kadaster, publishes data in both OGC CRS84 (using the WGS 84 geodetic datum) and the Amersfoort / RD (EPSG:28992) coordinate reference systems.
The INSPIRE Directive 2007/2/EC of the European Union requires that the European Terrestrial Reference System 1989 ETRS89 (EPSG:4258) is used for the referencing of spatial datasets.
There are many cases where WGS 84, or any Earth-based coordinate reference system, is not appropriate. For example, when describing location relative to other celestial bodies (e.g. Lunar geography, and areography, the geography of Mars), the arrangement of cells on a microscope slide, tapes in a mass storage unit, or the position of an artifact in a museum warehouse. In such cases, publication of spatial data in WGS 84 is either impossible or provides no value.
That said, many of these best practices are still relevant. In particular, see Best Practice 9: Describe relative positioning.
Discussion of coordinate system transformations is beyond the scope of this best practice document: converting coordinates between CRSs that use different datums and/or projections can be very involved. This is especially true where elevation values are missing from the source data. For reference, EPSG guidelines say that in such cases reasonable assumptions are:
That said, we note that there are several open source software implementations available to help users do such conversions. These include: the Geospatial Data Abstraction Library (GDAL), the Cartographic Projections Library (PROJ.4), its associated JavaScript implementation (PROJ4.JS) and the Apache Spatial Information System Library (SIS).
Check that geospatial data (i.e. data about things located relative to the Earth) is available, as a minimum, in a global coordinate reference system: for vector data, this should be WGS 84 Lat/Long (EPSG:4326) or WGS 84 Lat/Long/Elevation (EPSG:4979); for raster data this should be Web Mercator (EPSG:3857).
Relevant requirements: R-AvoidCoordinateTransformations, R-CoordinatePrecision.
Provide enough information for users to determine how coordinate values are encoded.
The geometry of Spatial Things is described using position coordinates; for example, latitude and longitude. Because coordinates describe a position relative to a datum (e.g. zero latitude is the equator and zero longitude is the prime meridian, often the Greenwich Meridian), it is important to understand the datum, the units used for coordinates, and the order in which the coordinate axes are defined: together, the coordinate reference system (CRS). Spatial data is published in a wide variety of CRSs. This variety can create confusion and inconsistencies in using and interpreting spatial data. Unless the CRS is known, errors are likely to be introduced when determining the position and extent of a Spatial Thing on the Earth, and this makes comparing or combining spatial data from different sources extremely problematic.
Where the application requires accuracy over time but the data uses a dynamic coordinate system, you will also need to provide the 'epoch': the date on which the coordinates were recorded; see Best Practice 7.
Sufficient information is provided to enable coordinates to be related to the correct position, thereby enabling spatial data to be correctly interpreted by humans and software agents.
Spatial data from different sources can be combined without introducing unwarranted positional errors.
A user of spatial data will need to know:
There is a predominant view that "I just need to use Lat and Long, and I'm done".
Although the clear majority of spatial data published on the Web uses WGS 84 Long/Lat, we strongly recommend that spatial data is published with all the necessary information to interpret coordinate values. Even where the use of latitude and longitude angular measurements is obvious, the choices of datum and units of measurement have an impact. In particular, angular measurements appearing as floating point numbers are most likely to be provided in decimal degrees, but could also be in radians or gons (also known as grads).
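To make the unit ambiguity concrete, the following is a minimal Python sketch, not part of the original guidance. The function name `to_decimal_degrees` and the unit labels are hypothetical; the only facts relied on are that a full circle is 360 degrees, 2π radians or 400 gons.

```python
import math

# Conversion factors from each angular unit to decimal degrees.
# The unit labels are illustrative names, not identifiers from any standard.
_TO_DEGREES = {
    "degree": 1.0,
    "radian": 180.0 / math.pi,
    "gon": 0.9,  # 400 gon in a full circle, so 1 gon = 360/400 degrees
}


def to_decimal_degrees(value: float, unit: str) -> float:
    """Convert an angular value in the given unit to decimal degrees."""
    try:
        return value * _TO_DEGREES[unit]
    except KeyError:
        raise ValueError(f"unknown angular unit: {unit!r}") from None


# The same bare number denotes three different angles depending on the unit,
# which is why the unit has to be published alongside the coordinate values.
raw = 0.9141
for unit in ("degree", "radian", "gon"):
    print(unit, to_decimal_degrees(raw, unit))
```

A consumer that silently assumes decimal degrees would misplace any data that was actually published in radians or gons, without any error being raised.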
The problem is that the assumption of a "predominant view" leads to ambiguity. For example, many spatial data users work entirely with information provided in their national coordinate reference system (such as the Dutch Amersfoort / RD (EPSG:28992) or British National Grid (EPSG:27700)), which makes all coordinates in WGS 84 Long/Lat (especially the negative numbers) utterly perplexing.
In practice, a publisher not documenting their CRS and presuming that latitude and longitude can be treated as cartesian is often bailed out by fuzzy use cases and software that takes care of projections. However, CRS and coordinate axis order ambiguity leads sooner or later to serious and avoidable errors, while ignorance of datums and map projections leads to broken applications. Furthermore, these practices will become less and less tenable as new applications such as Augmented Reality require higher data precision and accuracy.
There are five common ways that this information can be provided:
Describe the coordinate reference system in the dataset metadata.
@prefix ex: <> .
@prefix dcat: <> .
@prefix dcterms: <> .
@prefix skos: <> .

ex:ExampleDataset a dcat:Dataset ;
    dcterms:conformsTo <> .

<> a dcterms:Standard, skos:Concept ;
    dcterms:type <> ;
    dcterms:identifier ""^^xsd:anyURI ;
    skos:prefLabel "WGS 84 / UTM zone 30N"@en ;
    skos:inScheme <> .
The example above illustrates how to describe the coordinate reference system used for a dataset within [GeoDCAT-AP] metadata. The conformsTo property from [DCTERMS] is used to assert the relationship between dataset and CRS in the same way that conformance with a standard is expressed in [VOCAB-DQV].
Dataset metadata for spatial data should always provide details of the CRS used. For more information about dataset metadata, please refer to Best Practice 15: Include spatial metadata in dataset metadata.
Provide each coordinate value with explicit labels and provide metadata to indicate what each label means.
@prefix w3cgeo: <> .
@prefix dcterms: <> .

:myPointOfInterest a w3cgeo:SpatialThing ;
    dcterms:description "Anne Frank's House, Amsterdam." ;
    w3cgeo:lat "52.37514"^^xsd:float ;
    w3cgeo:long "4.88412"^^xsd:float ;
    .
The labels (or terms) w3cgeo:lat and w3cgeo:long are provided by the [W3C-BASIC-GEO] vocabulary, which states that it is:
A vocabulary for representing latitude, longitude and altitude information in the WGS 84 geodetic reference datum.
The terms themselves (plus w3cgeo:alt) are defined with all the necessary information as follows:
<script type="application/ld+json">
{
  "@context": { "@vocab": "" },
  "myPointOfInterest": {
    "@type": "Place",
    "geo": {
      "@type": "GeoCoordinates",
      "latitude": "52.37514",
      "longitude": "4.88412"
    }
  }
}
</script>
In the example above, the labels latitude and longitude are defined in [SCHEMA-ORG], as indicated by the [JSON-LD] key @vocab. The associated definitions in [SCHEMA-ORG] are:
The definitions provided in [SCHEMA-ORG] do not indicate the unit of measure. However, we have included this example as [SCHEMA-ORG] is very commonly used. The unit of measure used for latitude and longitude is decimal degrees, and decimal meters are used for the remaining coordinate position property, elevation.
The metadata for axis labels may also be provided in the documentation for an API from which the spatial data is accessed. For more information on documenting APIs, please refer to [DWBP] Best Practice 25: Provide complete documentation for your API.
GID,On Street,Long,Lat,Species,Trim Cycle,Diameter at Breast Ht,Inventory Date,Comments,Protected
1,ADDISON AV,-122.15649,37.44096,Celtis australis,Large Tree Routine Prune,11,10/18/2010,,
2,EMERSON ST,-122.15675,37.44096,Liquidambar styraciflua,Large Tree Routine Prune,11,6/2/2010,,
6,ADDISON AV,-122.15630,37.44115,Robinia pseudoacacia,Large Tree Routine Prune,29,6/1/2010,cavity or decay; trunk decay; codominant leaders; included bark; large leader or limb decay; previous failure root damage; root decay; beware of BEES,YES
In this example (adapted from the City of Palo Alto tree operations database and published as tabular data and as an interactive map) the coordinate position of each tree is specified using separate columns (Long and Lat).
We see the definitions of those Long and Lat columns provided in the dataset metadata, in this case a tabular metadata document, as per approach (1) above. Long and Lat are mapped onto the definitions provided by [W3C-BASIC-GEO] to ensure that the meaning of the data values in those columns is clear:
{
  "@context": ["", {"@language": "en"}],
  "@id": "",
  "url": "tree-ops-db.csv",
  "dcterms:title": "Tree Operations",
  ...
  "tableSchema": {
    "columns": [{
      "name": "GID",
      "titles": ["GID", "Generic Identifier"],
      "dcterms:description": "An identifier for the operation on a tree.",
      "datatype": "string",
      "required": true,
      "suppressOutput": true
    }, {
      "name": "on_street",
      "titles": "On Street",
      "dcterms:description": "The street that the tree is on.",
      "datatype": "string"
    }, {
      "name": "Long",
      "titles": "Longitude",
      "dcterms:description": "The WGS 84 longitude of the tree (decimal degrees).",
      "propertyUrl": "",
      "datatype": {
        "base": "number",
        "minimum": "-180",
        "maximum": "180"
      }
    }, {
      "name": "Lat",
      "titles": "Latitude",
      "propertyUrl": "",
      "dcterms:description": "The WGS 84 latitude of the tree (decimal degrees).",
      "datatype": {
        "base": "number",
        "minimum": "-90",
        "maximum": "90"
      }
    },
    ...
    "primaryKey": "GID",
    "aboutUrl": "{GID}"
  }
}
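As an illustration of how a consumer might use the declared value ranges, here is a minimal Python sketch under stated assumptions. The function name `out_of_range`, the `BOUNDS` table and the shortened sample CSV are hypothetical; the bounds are copied by hand from the minimum and maximum constraints in the metadata above rather than parsed from it.

```python
import csv
import io

# Bounds copied by hand from the "datatype" constraints in the metadata above.
BOUNDS = {"Long": (-180.0, 180.0), "Lat": (-90.0, 90.0)}

# A shortened version of the tree data, keeping only the columns needed here.
CSV_TEXT = """GID,On Street,Long,Lat
1,ADDISON AV,-122.15649,37.44096
2,EMERSON ST,-122.15675,37.44096
"""


def out_of_range(csv_text, bounds=BOUNDS):
    """Return (GID, column, value) for every coordinate outside its declared range."""
    problems = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, (low, high) in bounds.items():
            value = float(row[column])
            if not low <= value <= high:
                problems.append((row["GID"], column, value))
    return problems


print(out_of_range(CSV_TEXT))
```

A range check of this kind catches swapped Long and Lat columns only when the swapped value falls outside the latitude range, so it complements, rather than replaces, explicit column definitions.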
Use a data format that specifies axes, their order, datum and unit of measurement for coordinates.
HTTP/1.1 200 OK
Date: Sun, 05 Mar 2017 17:12:35 GMT
Content-length: 543
Connection: close
Content-type: application/geo+json

{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [4.884235,52.375108], [4.884276,52.375153], [4.884257,52.375159],
        [4.883981,52.375254], [4.883850,52.375109], [4.883819,52.375075],
        [4.884104,52.374979], [4.884143,52.374965], [4.884207,52.375035],
        [4.884263,52.375016], [4.884320,52.374996], [4.884255,52.374926],
        [4.884329,52.374901], [4.884451,52.375034], [4.884235,52.375108]
      ]
    ]
  },
  "properties": {
    "name": "Anne Frank's House"
  }
}
The media type application/geo+json is used to designate that content is provided in GeoJSON format, as specified in [RFC7946].
[RFC7946] Section 4. Coordinate Reference System provides all the necessary information to interpret the coordinates, stating that:
The coordinate reference system for all GeoJSON [RFC7946] coordinates is a geographic coordinate reference system, using the World Geodetic System 1984 (WGS 84) [WGS84] datum, with longitude and latitude units of decimal degrees. This is equivalent to the coordinate reference system identified by the Open Geospatial Consortium (OGC) URN urn:ogc:def:crs:OGC::CRS84. An OPTIONAL third-position element SHALL be the height in meters above or below the WGS 84 reference ellipsoid. In the absence of elevation values, applications sensitive to height or depth SHOULD interpret positions as being at local ground or sea level.
<script type="application/ld+json">
{
  "@context": { "@vocab": "" },
  "myPlaceOfInterest": {
    "@type": "Place",
    "name": "Anne Frank's House",
    "geo": {
      "@type": "GeoShape",
      "polygon": "52.375108,4.884235 52.375153,4.884276 52.375159,4.884257 52.375254,4.883981 52.375109,4.883850 52.375075,4.883819 52.374979,4.884104 52.374965,4.884143 52.375035,4.884207 52.375016,4.884263 52.374996,4.884320 52.374926,4.884255 52.374901,4.884329 52.375034,4.884451 52.375108,4.884235"
    }
  }
}
</script>
The [SCHEMA-ORG] definition of GeoShape states:
The geographic shape of a place. A GeoShape can be described using several properties whose values are based on latitude/longitude pairs. Either whitespace or commas can be used to separate latitude and longitude; whitespace should be used when writing a list of several such points.
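The two encodings of Anne Frank's House differ in axis order: GeoJSON positions are longitude then latitude, while the GeoShape polygon string lists latitude then longitude. The following Python sketch shows one way to convert between them. The function name `geoshape_polygon_to_geojson` is hypothetical, and the sketch assumes the input uses a comma inside each pair and whitespace between pairs, as in the example above.

```python
def geoshape_polygon_to_geojson(polygon: str) -> dict:
    """Convert a 'lat,lon lat,lon ...' polygon string to a GeoJSON Polygon geometry.

    Assumes commas separate latitude from longitude and whitespace separates points.
    """
    ring = []
    for pair in polygon.split():
        lat_text, lon_text = pair.split(",")
        # GeoJSON positions are [longitude, latitude], so the order is swapped here.
        ring.append([float(lon_text), float(lat_text)])
    return {"type": "Polygon", "coordinates": [ring]}


# First two points of the polygon from the example above.
print(geoshape_polygon_to_geojson("52.375108,4.884235 52.375153,4.884276"))
```

Without the format definitions quoted above, nothing in the bare numbers would tell a consumer which of the two orders was intended.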
State within the data itself which coordinate reference system is used.
<gml:Polygon srsDimension="2" axisLabels="east north" srsName="">
  <gml:exterior>
    <gml:LinearRing>
      <gml:posList> 120749.725 487589.422 120752.55 487594.375 120751.227 487595.129 120732.539 487605.788 120723.505 487589.745 120721.387 487585.939 120740.668 487575.07 120743.316 487573.589 120747.735 487581.337 120751.564 487579.154 120755.411 487576.96 120750.935 487569.172 120755.941 487566.288 120764.369 487581.066 120749.725 487589.422</gml:posList>
    </gml:LinearRing>
  </gml:exterior>
</gml:Polygon>
The example above encodes the polygon for Anne Frank's House in [GML]. The XML [XML11] attribute srsName (srs meaning "spatial reference system") refers to the Amersfoort / RD CRS (EPSG:28992) used in the Netherlands. Also note that additional useful information (srsDimension and axisLabels) is provided within the document for easy reference.
{
  "@context": {
    "geosparql": "",
    "rdfs": "",
    "asWKT": {
      "@id": "",
      "@type": "geosparql:wktLiteral"
    }
  },
  "@id": "",
  "@type": "",
  "rdfs:label": "Building 0363100012169587",
  "geosparql:hasGeometry": {
    "geosparql:asWKT": "<> POLYGON ((52.375108 4.884235, 52.375153 4.884276, 52.375159 4.884257, 52.375254 4.883981, 52.375109 4.883850, 52.375075 4.883819, 52.374979 4.884104, 52.374965 4.884143, 52.375035 4.884207, 52.375016 4.884263, 52.374996 4.884320, 52.374926 4.884255, 52.374901 4.884329, 52.375034 4.884451, 52.375108 4.884235))"
  }
}
The "Well Known Text" (WKT) encoding, itself defined in [SIMPLE-FEATURES], is extended by [GeoSPARQL] to include designation of the coordinate reference system used, which in turn determines the coordinate axis order. The example above encodes the polygon as a [GeoSPARQL] wktLiteral data type, designating the coordinate reference system as <> (EPSG:4326), WGS 84 Lat/Long.
When using the wktLiteral datatype specified in [GeoSPARQL], the coordinate reference system URI may be omitted. In such a case, WGS 84 Long/Lat (urn:ogc:def:crs:OGC::CRS84) is used. Please refer to [GeoSPARQL] Requirement 11 for more details.
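The default behavior can be sketched in a few lines of Python. This is an illustrative reading of the rule described above, not a conforming GeoSPARQL parser: the function name `split_wkt_literal` and the example.org URI are hypothetical, and the default identifier is simply the CRS84 URN quoted in the text.

```python
import re

# Identifier used when the literal carries no CRS URI (the URN quoted in the text).
DEFAULT_CRS = "urn:ogc:def:crs:OGC::CRS84"

# Optional "<uri>" prefix, followed by the WKT geometry text.
_WKT_LITERAL = re.compile(r"^\s*(?:<([^>]*)>\s*)?(.+)$", re.DOTALL)


def split_wkt_literal(literal: str):
    """Split a wktLiteral string into (crs_identifier, wkt_text)."""
    match = _WKT_LITERAL.match(literal)
    if match is None:
        raise ValueError("empty wktLiteral")
    crs_uri, wkt = match.groups()
    if crs_uri is None:
        # No CRS stated: fall back to the default, which implies Long/Lat order.
        crs_uri = DEFAULT_CRS
    return crs_uri, wkt.strip()


print(split_wkt_literal("POINT (4.88412 52.37514)"))
print(split_wkt_literal("<http://example.org/crs/demo> POINT (52.37514 4.88412)"))
```

The point of the sketch is that the axis order of the coordinates cannot be decided from the WKT text alone; it follows from the CRS identifier, whether stated or defaulted.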
The Basisregistraties Adressen en Gebouwen (BAG, the Dutch "Basic Registers for Addresses and Buildings"), provided by Kadaster, uses this default behavior. Anne Frank's House is identified using the URI
. HTML, JSON, TTL and XML representations are available.
It is worth noting that, in the [SIMPLE-FEATURES] definition of WKT, the coordinate axis order is by default longitude / latitude, irrespective of the coordinate reference system used. The same applies to EWKT (Extended WKT), a PostGIS extension to WKT that is also supported by other GIS tools, which includes a parameter (SRID) for specifying the coordinate reference system.
For this reason, whenever using WKT to encode geometries, it is important that the reference WKT specification can be unambiguously determined.
Support advertising the CRS used at the endpoint serving the data.
Because of the inconsistent provision of CRS metadata in geospatial encodings and the continued confusion caused by the axis order of coordinates, OGC API - Features Part 2 [OAF2] defines a mechanism for a server to clearly and unambiguously assert the CRS and axis order being used in a response document, independent of the requested output format. The method used is an HTTP header named Content-Crs containing a URI identifying the CRS.
$ curl -i ""
HTTP/1.1 200 OK
Date: Sun, 24 May 2020 15:30:56 GMT
Content-Type: application/geo+json
Content-Language: en
Content-Crs:
Link: ; rel="self"; title="This document"; type="application/geo+json"
Link: ; rel="alternate"; title="This document as HTML"; type="text/html"
Link: ; rel="collection"; title="The collection the feature belongs to"
Vary: Accept-Language,Accept-Encoding
Content-Length: 1064
...
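On the client side, reading this header might look like the Python sketch below. The function name `crs_from_headers` is hypothetical. The sketch assumes the header value is a URI wrapped in angle brackets, and it assumes (as a client-side choice, not something stated above) that a response without the header is treated as CRS84.

```python
# Assumed fallback when no Content-Crs header is present (WGS 84 Long/Lat).
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


def crs_from_headers(headers: dict, default: str = CRS84) -> str:
    """Return the CRS URI advertised in a Content-Crs header, or the default."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-crs":
            # Assumes the value is a URI enclosed in angle brackets.
            return value.strip().lstrip("<").rstrip(">")
    return default


response_headers = {
    "Content-Type": "application/geo+json",
    "Content-Crs": "<http://www.opengis.net/def/crs/EPSG/0/28992>",
}
print(crs_from_headers(response_headers))
```

Because the header travels with the response rather than inside the payload, the same check works whichever output format the client requested.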
For a given spatial data publication, check that users can find information about the coordinate axes, their order, and unit of measurement, plus the datum used.
Relevant requirements: R-DeterminableCRS, R-CRSDefinition, R-GeoreferencedData, R-LinkingCRS.
Sometimes instead of using geometry and coordinates to describe a location, we want or need to describe it in relation to another location. In that case relative positioning can be used.
Provide a relative positioning capability in which one entity can be positioned relative to another entity.
Geocentric coordinate reference systems describe position relative to the Earth itself. It can also be valuable or even necessary to describe the position of an entity relative to a second entity. In some cases, this is a navigation convenience, for example, a tour kiosk might be described as located between the Boston Common Frog Pond and the Park Street T entrance, or in one's lower left view when looking up at the Statehouse. In other cases of moving or generalized entities, it may be that the entity can only usefully be given a relative position. For example, a package is reported left on seat 32L1 on the #59 bus, or part number PRG5460 is always located at position (51, 73, 3) in Acme warehouses.
It should be possible to describe the location of an entity in relation to one or more other entities or places, instead of specifying its own geocentric position or geometry.
The relative positioning descriptions should be machine-interpretable and/or human-readable as required by the intended application. The positions and/or geometries of reference entities, if available, should be retrievable through their link relations.
Positioning of one entity (A) relative to another referenced entity (B) is a combination of two factors: the referencing target, and the means of relative positioning. "Geocentric" referencing targets the planet itself or at least a fixed point on it. "Allocentric" referencing targets another entity. "Egocentric" referencing targets a particular field of view of an observer or camera. Positioning can take the form of a complete coordinate reference system (e.g. engineering CRS), a qualitative relation such as "beside", or a quantitative relation such as "30m northwest".
Referencing target | Engineering CRS | Qualitative Relation | Quantitative Relation |
Geocentric | Coordinate position A relative to a fixed earth datum | Not Applicable | Not Applicable |
Allocentric | Coordinate position A relative to a fixed, mobile, or generic entity B | A "next to" B | A "20m south" of B |
Egocentric | Coordinate position A within the field of view B | A in "lower left corner" of the field of view B | A "30 deg right of center" in field of view B |
"Metes and Bounds" are a widely used system for defining land parcel boundary edges as cardinal directions and distances relative to survey markers or other landmarks. This would be considered an allocentric quantitative relation type of relative positioning.
The positions of pixels in an image captured by a satellite or other camera sensor are originally recorded relative to the field of view of the sensor. A model of the sensor optics, and platform position and orientation, together with transmission path effects (if available), may be used later to derive geocentric pixel positions.
In hydrology, positions of river features and/or observations such as water depth are often recorded as distance along a stream relative to a well-known origin point such as a stream confluence. This provides a reproducible form of positioning even if the geometry of the stream has not been precisely determined.
Augmented or mixed reality content is presented to the viewer in positions both relative to the real-world entities that it augments and relative to the viewer's visual perspective on those entities. For example, an informational callout needs to be juxtaposed with its target but also occupy the user's field of view in a meaningful fashion.
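An allocentric quantitative relation such as A "20m south" of B can be turned into an approximate geocentric position once the position of B is known. The Python sketch below is one way to do this under loudly stated assumptions: the function name `offset_position` is hypothetical, the Earth is treated as a sphere with a mean radius of 6,371 km, and a flat local approximation is used, which is only reasonable for short distances away from the poles. The coordinates for B are borrowed from the earlier point example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; a spherical approximation


def offset_position(lat_deg, lon_deg, distance_m, bearing_deg):
    """Approximate the position reached from a reference point.

    bearing_deg is measured clockwise from north. Uses a flat local
    approximation, so it is only suitable for short distances.
    """
    bearing = math.radians(bearing_deg)
    d_north = distance_m * math.cos(bearing)
    d_east = distance_m * math.sin(bearing)
    d_lat = math.degrees(d_north / EARTH_RADIUS_M)
    # Meridians converge towards the poles, hence the cos(latitude) factor.
    d_lon = math.degrees(d_east / (EARTH_RADIUS_M * math.cos(math.radians(lat_deg))))
    return lat_deg + d_lat, lon_deg + d_lon


# Entity A is "20m south" of entity B.
b_lat, b_lon = 52.37514, 4.88412
a_lat, a_lon = offset_position(b_lat, b_lon, 20.0, 180.0)
print(a_lat, a_lon)
```

Publishing the relation itself ("20m south of B") together with a link to B keeps the original statement intact, whereas a derived coordinate such as the one computed here bakes in the approximation used.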
Check that, when positions of entities are described as relative to other entities, these descriptions can be interpreted by a machine as well as humans, and the positions of the reference entities can be retrieved through their link relations.
Relevant requirements: R-MachineToMachine, R-SamplingTopology.
The fundamentals of links and how they are encoded are described in 13.1.3 Linking data. This section provides advice on the resources to use as the source and target of links in spatial data, and the common categories of link relation types that might be used.
Ensure that hyperlinks between Spatial Things and related resources use appropriate semantics.
Geography is often described as the "glue that binds Linked Data"; the links between Spatial Things (and between other resources and Spatial Things) describe how the world around us is structured and interrelated and form an important facet of the Web of Data.
Spatial relationships can often be derived mathematically based on geometry, but this can be computationally expensive. Topological relationships such as these can be asserted, thereby removing the need to do geometry-based calculations. A useful secondary benefit is that these relationships are easier for humans to understand!
Different authorities and agencies seek to describe the world around them by publishing spatial data, and in doing so each mints its own URIs (as recommended in Best Practice 1: Use globally unique persistent HTTP URIs for Spatial Things). Where Spatial Things are of common interest to multiple agents, it is almost inevitable that a given Spatial Thing will end up being identified with several URIs. Given necessary due diligence, multiple identifiers may be linked, thereby supporting a combination of multiple sets of information and yielding new perspectives on Spatial Things.
Application domains often require Spatial Things to be related; to convey the correct meaning, specific link relation types need to be used.
Spatial Things are related to other resources in the Web of data using links with appropriate semantics.
Your data is more interoperable.
Before examining the link relation types that might be used in spatial data, let's consider what we should link to.
Link to the Spatial Thing.
The geometry description or extent of a Spatial Thing may be expressed using an object with its own URI. For example:
@prefix rdfs: <> .
@prefix admingeo: <> .
@prefix geom: <> .

<> a admingeo:District ;
    rdfs:label "City of Edinburgh" ;
    geom:extent <> .

<> a geom:AbstractGeometry ;
    geom:asGML "<gml:MultiPolygon>...</gml:MultiPolygon>"^^rdf:XMLLiteral ;
    geom:hectares 27300.411 .
As can be seen in the example above, the geometry 30505-10 is an attribute of the City of Edinburgh (osuk:7000000000030505). If your intent is to make a statement about, or refer to, the real-world entity then make sure you link to the Spatial Thing rather than the geometry. Furthermore, note that the geometry record may be updated and re-published with a new identifier (for example, if the city boundary were resurveyed), which would result in a broken link.
Data publishers should also be aware of a common pattern used in the publication of Linked Data, where the Spatial Thing and the information resource that describes it are identified separately: often, but not always, using /id as part of the URI for the Spatial Thing, and /doc for the corresponding page/document/record. When the URI for the Spatial Thing is dereferenced, an HTTP 303 (see other) response is used to redirect the browser to the page/document/record URL. For example:
redirects to
redirects to
While this disambiguation has its advantages, it often seems to confuse users (and even some experts). Be aware of this redirect pattern, and make sure you use the correct URI, i.e. the identifying one, especially if you're copying the URI from a browser's address bar, which usually ends up showing the page/document/record URL.
Link to Spatial Things from popular repositories.
Linking with URIs from popular repositories may improve the discoverability of your data. Not only does this provide users with better context by enabling them to browse the information published by the popular repository, but it also helps relate your data with datasets from other parties who have also used those URIs as points of reference.
There are many popular repositories containing sets of identifiers for Spatial Things; the following list suggests the primary sources worth checking:
Finding out which national open spatial datasets are available, and how they can be accessed, currently requires some insider knowledge, in most cases because these datasets are not easily discoverable. Look for national data portals / geoportals such as Nationaal Georegister (Dutch national register of spatial datasets) or Dataportaal van de Nederlandse overheid (Dutch national governmental data portal).
Once you've found well-known URIs for Spatial Things that you want to link to, proceed to create links using properties such as those described above: owl:sameAs (if you're careful!) and geosparql:sfWithin, or perhaps qualitative relationships like geonames:nearby or the proposed schema:samePlaceAs (see related discussion in 15.6 Defining that two places are the same).
However, don't try to make links to everything. It is not always feasible to link your Spatial Things to well-known resources. For example, if you were maintaining a registry of cultural heritage in Amsterdam, it would be reasonably simple to look up identifiers for the city's 50 or so museums and map these to your Spatial Things. But it would be a huge task for, say, a topographic mapping agency to cross-reference their entire catalogue of named places containing tens of thousands of Spatial Things with third-party resources (although in the spirit of crowd-sourcing, if someone else found those links useful, they may take on the task of relating the Spatial Things and publishing those relationships to the Web as a complementary resource!). In essence, you should only create the data that you have the resources to maintain.
Now, let's take a look at link relation types that may be applicable to spatial data. These fall into three broad categories: spatial relations, equality relations, and domain-specific relations.
In this best practice document, we cannot cover all the possible vocabularies and ontologies that provide link relation types for spatial data. Other than a few areas of specific guidance, we are not recommending specific vocabularies for spatial linking. Instead, we hope to have introduced patterns that show the types of spatial linking that might be used and leave it to spatial data publishers to determine which specific vocabulary best suits their purpose. In this regard, [DWBP] section 8.9 Data Vocabularies and, in particular, [DWBP] Best Practice 15: Reuse vocabularies, preferably standardized ones are highly relevant.
Also, readers should note that in many cases there is value in linking Spatial Things with multiple relationships, each of which provides different semantics. Having identified your intended user communities and the vocabularies that they commonly use, choose those link relation types that meet their specific needs, and then add more generalized link relation types to support broader reuse of your data.
However, data publishers should only assert those relationships that they know about and that they think will be of interest to their user community. Don't try to cover all possible requirements! That said, publishers should try to avoid making assumptions about what the user may or may not know. For example, users may lack the expertise or resources to calculate a topological relationship, or lack the domain knowledge to determine how two Spatial Things are related, if at all. As the data publisher, you are likely to be in a better position to make these judgements than the user, so help them out by making these relationships clear.
Spatial relationships
Topological relationships between Spatial Things can be computed based on assessment of their geometry. [GeoSPARQL] defines families of topological relationships (based on the DE-9IM pattern) that, in mathematical terms, specify the spatial dimension of the intersections of the interiors, boundaries and exteriors of two geometric objects that may be 2-dimensional (e.g. area), 1-dimensional (e.g. linear) or 0-dimensional (e.g. point).
Most commonly used are the simple feature relationship family, described in [SIMPLE-FEATURES] section 6.1.15.3 Named spatial relationship predicates based on the DE-9IM. The set of eight named relationships, or spatial predicates, and their associated [GeoSPARQL] properties are listed below:
We recommend the use of the Simple Features relation families for describing topological relations between points, lines and areas. Further details are provided in [GeoSPARQL] section 7 Topology Vocabulary Extension.
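To illustrate how named predicates relate to the DE-9IM, the Python sketch below matches a 9-character intersection matrix against predicate patterns. It is a simplified illustration, not a geometry engine: the matrix is assumed to have been computed elsewhere, the names `matches`, `holds` and `SF_PATTERNS` are hypothetical, and only a subset of the Simple Features predicates is included.

```python
# DE-9IM patterns for a subset of the Simple Features predicates.
# In a pattern, T means "any intersection" (dimension 0, 1 or 2),
# F means "no intersection" and * means "do not care".
SF_PATTERNS = {
    "sfDisjoint": ["FF*FF****"],
    "sfTouches": ["FT*******", "F**T*****", "F***T****"],
    "sfWithin": ["T*F**F***"],
    "sfContains": ["T*****FF*"],
}


def matches(matrix: str, pattern: str) -> bool:
    """Check a 9-character DE-9IM matrix against a 9-character pattern."""
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM matrices and patterns have exactly 9 cells")
    for cell, want in zip(matrix, pattern):
        if want == "*":
            continue
        if want == "T":
            if cell not in "012":
                return False
        elif cell != want:
            return False
    return True


def holds(predicate: str, matrix: str) -> bool:
    """True if the matrix satisfies any pattern of the named predicate."""
    return any(matches(matrix, pattern) for pattern in SF_PATTERNS[predicate])


# Matrix for a point lying in the interior of a polygon.
point_in_polygon = "0FFFFF212"
print(holds("sfWithin", point_in_polygon), holds("sfDisjoint", point_in_polygon))
```

Computing the matrix is the expensive part, which is why asserting the resulting relationship as a link, as recommended here, can save consumers significant work.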
<script type="application/hal+json">
{
  "ex:type-nl": "brug",
  "ex:type-en": "bridge",
  "ex:name": "Leliesluis",
  "_links": {
    "self": { "href": "" },
    "curies": [
      { "name": "geosparql", "href": "{rel}", "templated": true },
      { "name": "ex", "href": "{rel}", "templated": true }
    ],
    "geosparql:sfCrosses": { "href": "" }
  },
  "_embedded": {
    "ex:type-nl": "kanaal",
    "ex:type-en": "canal",
    "ex:name": "Prinsengracht",
    "_links": {
      "self": { "href": "" }
    }
  }
}
</script>
The example above uses the Hypertext Application Language (HAL) conventions for expressing hyperlinks in JSON [RFC7159]. It illustrates how one would indicate, using geosparql:sfCrosses, that two linear Spatial Things, a bridge and a canal, cross over each other.
The spatial predicates specified in [GeoSPARQL] describe 2-dimensional topological relations. There is no evidence of common practice for describing 3-dimensional topological relationships.
In addition to the mathematically precise spatial predicates described above, several vocabularies define similar relationships but without the formal mathematical underpinning. For example, [SCHEMA-ORG] defines a pair of basic containment relationships for use with schema:Place
: The basic containment relation between a place and another that it contains.
schema:containedInPlace: The basic containment relation between a place and one that contains it.
It is also commonplace to use spatial relationships to convey distance (e.g. nearby or far-away) and direction (e.g. left, inFrontOf, astern and below). However, we find no evidence that points to use of common vocabularies to express these relationships, perhaps because these relationships are often subjective and dependent on application context (e.g. the meaning of “near” will be quite different between an endurance cycling app and the app I use to find the Bluetooth tag attached to my house keys!).
Two notable examples of distance relations are:
which states "We do not say much about what 'near' means in this context; it is a 'rough and ready' concept."; and
geonames:nearby which simply states, "A feature close to the reference feature".
<script type="application/geo+json">
{
  "id": "",
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [ [ ... ] ]
  },
  "properties": {
    "": "Anne Frank's House",
    "": [ "", "", "", ... ]
  }
}
</script>
This example snippet, adapted to use the GeoJSON [RFC7946] format, shows a list of Spatial Things (e.g. Westerkerk, Homomonument and Westertoren) that are deemed 'nearby' Anne Frank's House according to GeoNames.
The JSON [RFC7159] format provides only simple primitive types: string, number, boolean etc. The lack of a datatype for URIs means that they must be encoded as strings. As such, conventions (such as those defined in HAL) are required to tell applications that a given string value is a URI. However, GeoJSON [RFC7946] does not define any conventions for describing URIs and forbids any extension of the data format specification.
To mitigate this, details about object types, etc. included in the data payload should be provided in the documentation for the API or service end-point from which the data is accessed. See [DWBP] Best Practice 25: Provide complete documentation for your API for further details.
Synonyms and equality
As described above, it is not uncommon for a Spatial Thing to be identified using more than one URI (also known as the "non-unique naming problem"). If you think that this is the case, the property owl:sameAs may be used to express this. However, caution is advised as owl:sameAs is an extremely strong statement; literally "these two URIs identify the same resource". As there is only one Spatial Thing, all the properties and attributes returned when resolving any of the equated URIs are considered to apply to that Spatial Thing. Given that spatial data is often published by different parties, each concerned with their own perspective, the Spatial Thing equality is often difficult to determine and depends heavily on the semantics involved.
So, the advice is: if in doubt, don't use owl:sameAs.
By way of example, let's explore some data for Edinburgh.
The City of Edinburgh Council Area (i.e. the geographical area that Edinburgh City Council is responsible for) is identified by the Office for National Statistics (the recognized national statistical institute of the UK) using their GSS code (a 9-character alphanumeric identifier) S12000036
and the URI
. At the same time, the devolved government in Scotland, operating under its own jurisdiction, retains the GSS code but uses the URI
. Furthermore, the Ordnance Survey maintains yet another URI for the City of Edinburgh Council Area as part of its 'Boundary Line' service that contains administrative and statistical geography areas in the UK:
. Similarly, Geonames identifies Edinburgh, a second-order administrative division, as
. All of these URIs refer to the same Spatial Thing and are equated using owl:sameAs:
@prefix owl: <> .
@prefix scotgov-stat: <> .
@prefix ukgov-stat: <> .
@prefix osuk: <> .
@prefix geonames: <> .

scotgov-stat:S12000036 owl:sameAs ukgov-stat:S12000036 .
osuk:7000000000030505 owl:sameAs ukgov-stat:S12000036 .
geonames:2650225 owl:sameAs ukgov-stat:S12000036 .
Also note that in this [TURTLE] snippet one could easily include additional properties to help users determine whether the link is worth traversing, such as providing human-readable labels and specifying the type designated by each data publisher.
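Because owl:sameAs is symmetric and transitive, the three statements above place all four identifiers in a single equivalence class. The Python sketch below shows that consequence using a small union-find structure. It is illustrative only: the function name `equivalence_classes` is hypothetical, and the prefixed names are used as opaque strings rather than resolved URIs.

```python
# The owl:sameAs statements from the snippet above, as (subject, object) pairs.
SAME_AS = [
    ("scotgov-stat:S12000036", "ukgov-stat:S12000036"),
    ("osuk:7000000000030505", "ukgov-stat:S12000036"),
    ("geonames:2650225", "ukgov-stat:S12000036"),
]


def equivalence_classes(pairs):
    """Group identifiers connected by sameAs links, directly or indirectly."""
    parent = {}

    def find(item):
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]  # path halving
            item = parent[item]
        return item

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    groups = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return list(groups.values())


for group in equivalence_classes(SAME_AS):
    print(sorted(group))
```

This is why a single careless owl:sameAs statement matters: it merges everything said about either identifier, and about anything already equated with them, into one description.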
In contrast, the resource identified by
defines thenamed place Edinburgh — a colloquial definition for the city itself. This is not the same as theCity of Edinburgh Area and therefore use of theowl:sameAs
relationship is inappropriate.
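To make the strength of owl:sameAs concrete, the following Python sketch groups equated URIs into equivalence classes and then merges every property published under any URI in a class, which is in effect what a consumer is entitled to do. It is only an illustration: the example.org URIs, the property names and the helper functions are hypothetical and are not the identifiers used by the publishers mentioned above.

```python
from collections import defaultdict


def same_as_closure(pairs):
    """Group URIs into the equivalence classes implied by owl:sameAs pairs."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    groups = defaultdict(set)
    for uri in parent:
        groups[find(uri)].add(uri)
    return list(groups.values())


def merged_description(group, descriptions):
    """Union of all property values published under any URI in the group."""
    merged = defaultdict(set)
    for uri in group:
        for prop, value in descriptions.get(uri, {}).items():
            merged[prop].add(value)
    return dict(merged)


# Hypothetical URIs standing in for the identifiers discussed above.
ONS = "http://example.org/ons/S12000036"
SCOT = "http://example.org/scotgov/S12000036"
OS = "http://example.org/os/7000000000030505"
PLACE = "http://example.org/os/named-place/edinburgh"

# The named place is deliberately NOT equated with the council area.
pairs = [(SCOT, ONS), (OS, ONS)]
descriptions = {
    ONS: {"label": "City of Edinburgh"},
    OS: {"type": "District"},
    PLACE: {"type": "NamedPlace"},
}

groups = same_as_closure(pairs)
council_area = next(g for g in groups if ONS in g)
merged = merged_description(council_area, descriptions)
```

Had the named place been linked with owl:sameAs as well, its "NamedPlace" type would have been merged into the description of the council area, which is exactly the kind of unintended consequence the advice above warns about.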
Determining whether the information provided when resolving two or more URIs does indeed describe the same Spatial Thing is a complex topic in its own right and well beyond the scope of this best practice document. Tools such as Open Refine and the Silk Linked Data Integration Framework are designed to work with, transform and integrate heterogeneous data sources. Their documentation may provide further insight regarding these challenges.
Given the very strong semantics of the owl:sameAs
property, alternative properties with weaker semantics are commonly used. Examples include:
defined by [SCHEMA-ORG] whose description states:
URL of a reference Web page that unambiguously indicates the item's identity. E.g. the URL of the item's Wikipedia page, Freebase page, or official website.
defined, with the description:
Having two things that are not the owl:sameAs but are similar to a certain extent. It is thought of being used where owl:sameAs is too strong but rdfs:seeAlso is too loose.
All of the properties listed above are concerned with equality or similarity of the resources themselves. However, we often want to talk about the similarity of Spatial Things in terms of location or place. Spatial relations (see above) can be used to describe how locations are related, either using rigorous topological relationships derived from geometry, such as geosparql:sfEquals
, or ones without formal mathematical underpinning, such as geonames:nearby
. But place is a social concept that reflects how we humans perceive the space around us, often with a vague or imprecise notion of location; you can't always define a boundary for a place like The Sahara because not everyone agrees where its edge lies!
Talking of places, the City of Edinburgh [Administrative] Area and Edinburgh the named place are strongly related; you might say that they are the same place if that makes sense for your application. This also provides an example where it is worthwhile to provide multiple relationships between Spatial Things: Ordnance Survey uses the within
link relation type to relate the named place Edinburgh and the City of Edinburgh (osuk:7000000000030505) administrative area. within
complements a qualitative same-place-as relation between two places.
However, while we see people wanting to assert such qualitative same-place-as relationships based on human perception of place, there is no evidence of a best practice in how to achieve this; see 15.6 Defining that two places are the same for more details about possible approaches that could be adopted.
Domain-specific relationships involving Spatial Things
In addition to the spatial relationships that are applicable to a wide variety of domains, there are a huge number of cases where asserting a relationship between Spatial Things is useful. Clearly, enumerating all these cases is more than we can do here, but we can look at some of those that commonly occur.
First, there are the properties used to describe relationships between Spatial Things in a gazetteer. These properties are often used in combination with spatial predicates to describe the relationship between administrative units. For example, Ordnance Survey defines specific properties to describe the relationships between the administrative units used within the UK: county (admingeo:county)
, district (admingeo:district)
, ward (admingeo:ward)
, etc.
@prefix rdfs: <> .
@prefix geosparql: <> .
@prefix admingeo: <> .

<> a admingeo:District ;
  rdfs:label "City of Edinburgh" ;
  admingeo:gssCode "S12000036" ;
  admingeo:ward <> , <> , <> , ... ;
  geosparql:sfTouches <> , <> , <> , <> ;
  ... .
The example snippet above, provided in [TURTLE] format, shows the relationships between the City of Edinburgh (osuk:7000000000030505) district (admingeo:district) and the electoral wards (admingeo:ward) it contains. Also note the complementary use of geosparql:sfTouches
to relate the City of Edinburgh (osuk:7000000000030505) to its adjacent districts: Midlothian, West Lothian etc.
A second domain where relationships between Spatial Things and non-spatial resources occur is earth observing. The example below, provided in [GML], relates a monitoring point at Deddington on the Nile River, Tasmania, to the sensor that is deployed there (using the sams:hostedProcedure
property) and relates that monitoring point to the waterbody whose properties are being measured (using the sam:sampledFeature
property). Here, the links are defined using [XLINK11].
<wml2:MonitoringPoint gml:id="xsd-monitoring-point.example"
    xmlns:wml2=""
    xmlns:gml=""
    xmlns:sam=""
    xmlns:sams=""
    xmlns:xlink="">
  <gml:description>Hydrological monitoring point for Nile river at Deddington, South Esk catchment, Tasmania</gml:description>
  <gml:identifier codeSpace=""></gml:identifier>
  <sam:sampledFeature xlink:href="" xlink:title="Nile river"/>
  <sams:shape>
    <gml:Point gml:id="location_deddington">
      <gml:pos srsName="urn:ogc:def:crs:EPSG::4326">-41.814935 147.568517</gml:pos>
    </gml:Point>
  </sams:shape>
  <sams:hostedProcedure>
    <wml2:ObservationProcess gml:id="sensor:4c40fd3acdbf">
      <wml2:processType xlink:href="" xlink:title="Sensor"/>
      <wml2:processReference xlink:href="" xlink:title="Sensor configuration (updated:2017-03-13)"/>
    </wml2:ObservationProcess>
  </sams:hostedProcedure>
  ...
</wml2:MonitoringPoint>
For further information about sensors, sampling, observations and measurements, please refer to [OM-XML] and [VOCAB-SSN].
[GML] adopted the [XLINK11] standard to represent links between resources. At the time of adoption, XLink was the only W3C-endorsed standard mechanism for describing links between resources within XML [XML11] documents. The Open Geospatial Consortium anticipated broad adoption of XLink over time and, with that adoption, provision of support within software tooling. While XML Schema, XPath, XSLT and XQuery etc. have seen good software support over the years, this never happened with XLink. The authors of [GML] note that, given the lack of widespread support, use of XLink within [GML] provided no significant advantage over a bespoke mechanism tailored to the needs of [GML].
Our final example of a domain-specific relationship concerns creative works. For example, one may want to indicate the location a social media message was sent from. In the example below, we assume that Maurits, a tourist in Amsterdam, wants to comment on his visit to Anne Frank's House. His social media App uses the [GEOLOCATION-API] to determine his location (Lat=52.37590
) and suggests several places that Maurits might choose from in order to geo-tag his message. Maurits wants people to know roughly where he is, so he chooses "Amsterdam-Centrum" and presses 'send'. The App encodes the message in [SCHEMA-ORG] and pushes the message to the server for distribution. The geo-information is provided using the schema:locationCreated property.
<script type="application/ld+json">
{
  "@context": { "@vocab": "" },
  "@id": "",
  "@type": "Message",
  "sender": { "@type": "Person", "name": "Maurits" },
  "datePublished": "2017-03-12",
  "locationCreated": {
    "@id": "",
    "@type": "Place",
    "name": "Amsterdam-Centrum"
  }
}
</script>
If Maurits had wanted to indicate that the subject of the photograph he took moments later was Leliesluis bridge, then the following [SCHEMA-ORG] markup and schema:mainEntity
property could be used:
<script type="application/ld+json">
{
  "@context": { "@vocab": "" },
  "@id": "",
  "@type": "Photograph",
  "sender": { "@type": "Person", "name": "Maurits" },
  "datePublished": "2017-03-12",
  "mainEntity": {
    "@id": "",
    "@type": "Bridge",
    "name": "Leliesluis bridge",
    "geo": {
      "@type": "GeoCoordinates",
      "longitude": "4.88435",
      "latitude": "52.37608"
    }
  }
}
</script>
Check that hyperlinks use typed relationships, and that the link relation type can be located in order to determine how to interpret the hyperlink.
Check that the source and target of the hyperlink are Spatial Things unless the link relation type definition indicates that this should be otherwise (e.g. when relating a Spatial Thing to its geometry).
Relevant requirements: R-Linkability, R-MachineToMachine, R-SpatialRelationships, R-SpatialOperators.
Spatial things and their attributes can change over time. For example, a lake may grow or shrink due to changes in climate, water extraction or any number of reasons. For many applications, it is important that information about Spatial Things is kept up to date. When new information is available, the data publisher may make this available on the Web according to their update schedule and policies. [DWBP] section 8.6 Data Versioning and Best Practice 21: Provide data up to date provide directly applicable guidance.
When dealing with change to a Spatial Thing, you should consider its lifecycle; in particular, how much change is acceptable before a Spatial Thing can no longer be considered as the same resource. Consider Eddystone Lighthouse for example: the "Eddystone Light", a maritime navigation aid, has existed in (more or less) the same place on Eddystone Rocks since 1698. A single HTTP URI (such as
) is used to identify "the lighthouse on Eddystone rocks" for all that period. The lighthouse's attributes (such as its focal height, visible range and light characteristic) have changed over that period, but we still consider it to be the same lighthouse. However, if our interest is historic buildings, we would identify the four different structures that have stood on that site as different Spatial Things, from Winstanley's Eddystone Lighthouse (the first incarnation) to Douglass' Eddystone Lighthouse (the 4th and current incarnation). In that context, incremental change for these structures during the entire period from 1698 is not appropriate; one structure replaces another and so each structure should be assigned a unique identifier. In summary, different things are important to different people!
All that said, if you consider that the change affects the fundamental nature of the Spatial Thing, then you should assign a new identifier. See 13.1.1 Spatial data identifiers for more details. Otherwise, read on for guidance on how to describe properties that change over time.
Spatial data should include metadata that allows a user to determine the period for which it is valid.
Spatial things and their attributes change over time. When it comes to Spatial Things, or any resources, that change over time, it is important to provide metadata about the life cycle of those entities and the resources used to describe them. Given that information, data consumers can make considered choices about which resource they want to link to. Mostly, they are interested in current information. They need to be able to determine whether the published description of a Spatial Thing meets their needs. For example, is the published geographic extent of the City of Amsterdam relevant for a land-usage study of the nineteenth century? ("Municipality History" illustrates how the extent of Amsterdam has changed during the past 200 years, in HTML and GeoJSON.) Where the information is available, a user may want to browse older versions of the published information to understand the nature of any changes or to find historical information.
Users are provided with the most recent version of information about a Spatial Thing and its attributes by default.
Users can determine the time period for which data is applicable.
If a version history of changes is available, users can browse through a set of changes to see how a Spatial Thing and its attributes have changed over time.
When publishing information about a Spatial Thing that is subject to change there are four approaches to consider in response to a change:
Whichever approach is chosen, publishers of spatial data should consider how dataset metadata plays an important part in helping users determine whether a dataset is fit for their use. Particularly where the contents of a dataset change with time, statements about the (most recent) publication date, the frequency of update and the time-period for which the dataset is relevant (i.e. temporal extent) should be provided. Please refer to [DWBP] section 8.2 Metadata for more details about dataset metadata.
A description of the lifecycle of the Spatial Things (e.g. what triggers a change and whether those changes are versioned etc.) should also be provided in either the dataset's metadata, schema or specification. For example, the European Commission's INSPIRE Regulation on interoperability of spatial data sets states that data publishers should provide lifecycle information; the technical guidance for most themes recommends how the data publisher's specific rules should be published.
Approach (1) is lightweight and should only be used where no user requirements call for access to older descriptions of the Spatial Things. Data publishers simply replace the old description of the Spatial Thing with the amended description and keep users informed about updates by providing the appropriate metadata (e.g. when the data was changed). This may be achieved using dataset metadata (as outlined above) or by including the metadata attributes in the description of each Spatial Thing.
Where users are anticipated to need to understand how a Spatial Thing has changed over time, approaches (2), (3) and (4) should be considered.
Approach (2) is a simple variant of approach (1); the difference being that the entire dataset is assigned a new URI when changes are made, thereby enabling older versions of the dataset to be addressed separately. See [DWBP] Best Practice 11: Assign URIs to dataset versions and series for further details. Using this approach, a user should be able to compare two versions of the dataset to determine what has changed. Although simple for data publishers, the downside of this approach is that the effort is passed on to the users.
Approach (3) requires the data publisher to publish immutable resources that describe the Spatial Thing at specific points in time (i.e. "snapshots") and provide a mechanism for users to browse between those snapshots. Effectively, the dataset becomes an accumulation of these snapshots that users can browse through. However, given that each snapshot of the Spatial Thing is published as a separate resource, this approach is suited to infrequent changes so that the number of snapshots does not become unwieldy.
The URI for the Spatial Thing, the base URI, should dereference to provide the current information and a link to its version history of snapshots. [DWBP] Best Practice 8: Provide version history describes how a version history may be implemented. Each snapshot resource within the version history must be uniquely identified; a common approach is to append a date/time stamp to the base URI as a version indicator. [DWBP] Best Practice 7: Provide a version indicator provides relevant guidance.
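The snapshot pattern just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the base URI under example.org, the snapshot dates and the helper function names are hypothetical, and the property names versionedUri, replaces and replacedBy simply mirror those used in the Amsterdam example in this section.

```python
from datetime import date


def versioned_uri(base_uri, valid_from):
    """Append an ISO 8601 date to the base URI as a version indicator."""
    return f"{base_uri.rstrip('/')}/{valid_from.isoformat()}"


def build_version_history(base_uri, snapshot_dates):
    """Return snapshot records, oldest first, linked to their neighbours."""
    uris = [versioned_uri(base_uri, d) for d in sorted(snapshot_dates)]
    history = []
    for i, uri in enumerate(uris):
        history.append({
            "versionedUri": uri,
            "replaces": uris[i - 1] if i > 0 else None,
            "replacedBy": uris[i + 1] if i + 1 < len(uris) else None,
        })
    return history


def resolve_base(base_uri, history):
    """What the base URI offers: the current snapshot plus the history."""
    return {
        "uri": base_uri,
        "current": history[-1]["versionedUri"],
        "versionHistory": [record["versionedUri"] for record in history],
    }


# Hypothetical base URI and snapshot dates, for illustration only.
BASE = "http://example.org/municipality/amsterdam"
history = build_version_history(BASE, [date(2016, 1, 1), date(2014, 1, 1)])
current = resolve_base(BASE, history)
```

Because each snapshot URI is derived from the base URI and a date, a snapshot never needs to change once published; only the record returned for the base URI is updated when a new snapshot is added.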
The extent of the City of Amsterdam has changed during the last 200 years. This example, based on "Municipality history" (condensed and changed to reflect the recommendations in this best practice), shows how the version history of Amsterdam's boundary can be provided as a series of immutable snapshots in GeoJSON [RFC7946].
The current information on Amsterdam including the current boundary:
{
  "uri": "",
  "name": "Amsterdam",
  "inProvince": "Noord-Holland",
  "cbscode": "0363",
  "absorbed": "",
  "2016": {
    "type": "FeatureCollection",
    "features": [
      {
        "type": "Feature",
        "versionedUri": "",
        "replaces": "",
        "year": "2016",
        "geometry": {
          "type": "MultiPolygon",
          "coordinates": [...]
        }
      }
    ]
  }
}
The previous boundary of Amsterdam:
{
  "2014": {
    "type": "FeatureCollection",
    "features": [
      {
        "type": "Feature",
        "versionedUri": "",
        "replacedBy": "",
        "replaces": "",
        "year": "2014",
        "geometry": {
          "type": "MultiPolygon",
          "coordinates": [...]
        }
      }
    ]
  }
}
Approach (4) is suitable where a Spatial Thing has a small number of attributes that are frequently updated. For example, the GPS position of a runner or when streaming data from a sensor, such as the water level from a stream gauge.
With this approach, the description of the Spatial Thing must include a property that contains a sequentially-ordered set of data points, each of which defines a time-stamp and the values for the time-varying attribute(s). By definition, this property can be considered as a time-series coverage. Standard data encodings are available for time-series data, including: [TIMESERIESML] for [GML], plus [COVJSON-OVERVIEW] and the SensorThings API [SENSORTHINGS] for JSON [RFC7159]. [VOCAB-DATA-CUBE] provides a generic mechanism to express well-structured data, such as time series, in RDF [RDF11-PRIMER]. Although not yet widely used enough to be considered best practices, [EO-QB] and [QB4ST] (developed alongside this best practice Note within the Spatial Data on the Web Working Group) illustrate how [VOCAB-DATA-CUBE] may be used in this way.
The OGC [MOVING-FEATURES-XML] and [MOVING-FEATURES-CSV] specifications follow the pattern described above. A trajectory
element is used to describe the position of a Spatial Thing, and varying attributes (such as orientation or rotation) can be added alongside the tuples in the trajectory. However, there is limited evidence of adoption outside of Japan.
This example shows a snippet of a file storing the changing GPS position of a runner traversing the Alps. The format is GPX, a common format for exchanging a series of GPS positions. For each track point, the coordinates as well as a timestamp are stored.
<gpx version="1.1">
  <trk>
    <name>Move</name>
    <trkseg>
      <trkpt lat="47.24239" lon="10.749514">
        <ele>784</ele>
        <time>2016-09-06T06:01:25.009Z</time>
      </trkpt>
      <trkpt lat="47.242403" lon="10.749489">
        <ele>784</ele>
        <time>2016-09-06T06:01:26.009Z</time>
      </trkpt>
      [...]
      <trkpt lat="46.968127" lon="10.870573">
        <ele>1677</ele>
        <time>2016-09-06T17:41:50.009Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
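As a rough illustration of how a consumer might read such a track as a time series, the Python sketch below parses the three track points shown above using only the standard library. It assumes the snippet exactly as printed here, without an XML namespace declaration; real GPX files normally declare one, in which case the element names would need to be namespace-qualified. The function names are illustrative.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

GPX_SNIPPET = """<gpx version="1.1"><trk><name>Move</name><trkseg>
<trkpt lat="47.24239" lon="10.749514"><ele>784</ele><time>2016-09-06T06:01:25.009Z</time></trkpt>
<trkpt lat="47.242403" lon="10.749489"><ele>784</ele><time>2016-09-06T06:01:26.009Z</time></trkpt>
<trkpt lat="46.968127" lon="10.870573"><ele>1677</ele><time>2016-09-06T17:41:50.009Z</time></trkpt>
</trkseg></trk></gpx>"""


def parse_time(text):
    """Parse an ISO 8601 UTC timestamp such as 2016-09-06T06:01:25.009Z."""
    # Replace the trailing "Z" so that older Python versions can parse it.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def read_track(gpx_text):
    """Return the track points as a time-ordered list of dictionaries."""
    root = ET.fromstring(gpx_text)
    points = []
    for trkpt in root.iter("trkpt"):
        points.append({
            "time": parse_time(trkpt.findtext("time")),
            "lat": float(trkpt.get("lat")),
            "lon": float(trkpt.get("lon")),
            "ele": float(trkpt.findtext("ele")),
        })
    return sorted(points, key=lambda point: point["time"])


track = read_track(GPX_SNIPPET)
duration = track[-1]["time"] - track[0]["time"]   # elapsed time of the run
climb = track[-1]["ele"] - track[0]["ele"]        # net elevation change in metres
```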
Information about a given Spatial Thing, or set of Spatial Things, will be relevant for a particular time or time-period. Check that this information is stated.
Check that dataset metadata provides details about how often the dataset is updated; e.g. date of the most recent publication, and frequency of update.
If a version history of changes is available, check that links to previous versions are available.
If the Spatial Thing contains an attribute that varies with time, check that those attribute values are provided as a time-series.
Relevant requirements: R-MachineToMachine, R-MovingFeatures, R-Streamable, R-CoverageTemporalExtent
Spatial data, especially spatial data hosted in spatial data infrastructures, are usually expected to conform to specific schema definitions. Schema definitions may be open standards and are often based on open standards, for example [GML]. Schema definitions help to verify the conformity of the spatial data set to the semantic and syntactic requirements. They specify how the general standard such as [GML] is used in specific application contexts. They are sometimes called "application schemas" or "application profiles". Where the schemas are documented as XML Schemas, JSON Schemas, or in other formal languages, these "executable test suites" can be used to check the consistency and conformity of your data.
If you publish spatial data which conforms to a specific data schema, that schema should be published online using unique persistent HTTP URIs and referenced in your dataset.
Spatial objects and feature collections published on the web should be verifiable with respect to the intentions of the original data publishers. Data schemas provide the means to validate data types used in a given dataset, to check constraints of values used in the dataset and to check the dataset's logical consistency.
Spatial data which validate in a given schema expression can be considered of higher quality, as they conform to the requirements which were originally set out by the data publishers.
Spatial data should reference data schemas which describe how spatial data can be validated. In this way, data validation using a machine becomes possible and the consistency of datasets related to the same specification can be assured.
[DWBP] Best Practice 10: Use persistent URIs as identifiers within datasets provides directly applicable guidance when identifying resources. It advises:
This best practice can be directly applied to hosting data schemas at trusted organizations or simply along with the published datasets. Data schemas need to be hosted under a unique and persistent URI, and data need to reference the data schema appropriately in their respective data structures. XML-based data may reference XML schemas; JSON-based data may reference JSON schemas. Linked open data might be verified using constraints in OWL or SHACL.
Check that within an applicable geospatial data set, there is a reference to a data schema and that this reference is dereferenceable.
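The first half of this check can be partly automated. The Python sketch below looks for a schema reference in a JSON document and tests whether it is an absolute HTTP(S) URI. It rests on assumptions that this document does not prescribe: that the data is JSON, that the reference is carried in a top-level "$schema" member (a common convention only), and that the example.org schema URI is purely illustrative. It does not attempt to dereference the URI, which would require a network request.

```python
import json
from urllib.parse import urlparse


def schema_reference(document, key="$schema"):
    """Return the schema URI referenced by a JSON document, or None."""
    value = document.get(key) if isinstance(document, dict) else None
    return value if isinstance(value, str) else None


def is_http_uri(uri):
    """True if the URI is absolute and uses the http or https scheme."""
    if not uri:
        return False
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Hypothetical dataset fragment that declares the schema it conforms to.
data = json.loads(
    '{"$schema": "https://example.org/schemas/monument.json", "type": "Feature"}'
)
has_usable_reference = is_http_uri(schema_reference(data))
```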
Relevant requirements: R-Validation
In recent years, we have seen the widespread emergence of Web applications that use spatial data. Often these applications do not access all the spatial data they use via the Web. While there are good reasons for this, e.g. licensing restrictions, it is often the case, too, that the spatial data is not available via the Web at all, or in ways that application developers find too complex to use, or with insufficient or unclear quality-of-service commitments.
[DWBP] provides best practices discussing access to data using Web infrastructure (see [DWBP] section 8.10 Data Access). This section provides additional insight for publishers of spatial data.
This section is about APIs for access to spatial data. This includes access to metadata that is shared with the dataset. Access to metadata that is published separately in catalogs is covered in Best Practice 15: Include spatial metadata in dataset metadata.
Making data available on the Web requires data publishers to provide some form of access to the data. There are numerous mechanisms available, each providing varying levels of utility and incurring differing levels of effort and cost to implement and maintain. Publishers of spatial data should make their data available on the Web using affordable mechanisms to ensure long-term, sustainable access to their data.
When determining the mechanism to be used to provide Web access to data, publishers need to assess utility against cost. In order of increasing usefulness and cost:
Let's take a closer look at these options.
The download of a dataset — or a pre-defined subset of it — via a single HTTP request is mainly covered by these [DWBP] best practices:
Providing bulk download or streaming access to data is useful in any case and is relatively inexpensive to support as it relies on the standard capabilities of Web servers for datasets that may be published as downloadable files stored on a server. However, this option is more complex for frequently changing datasets or real-time data.
[DWBP] Best Practice 18: Provide Subsets for Large Datasets explains why providing subsets is important and how this could be implemented. Spatial datasets, particularly coverages such as satellite imagery, sensor measurement time series and climate prediction data, are often very large. In these cases, it is useful to provide subsets by having identifiers for conveniently sized subsets of large datasets that Web applications can work with.
Effectively, breaking up a large coverage into pre-defined lumps that you can access via HTTP GET requests is a very simple API.
When a subset is provided, this should include information about the relationship to the complete dataset. In HTML, this could be descriptive text, or it may be implicitly clear to humans from the way the subset is presented. In [SCHEMA-ORG] it could be the schema:isPartOf property. In RDF [RDF11-PRIMER], PROV-O could be used to describe the relationship between the subset and the complete dataset as well as the mechanism used to derive the subset. In ISO 19115 metadata, the LI_Lineage element may be used for a similar purpose. Etc.
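A minimal Python sketch of such pre-defined subsets is given below. It splits the bounding box of a coverage into a regular grid and gives each tile its own identifier together with a pointer back to the complete dataset. The URI pattern, the example.org base URI, the grid size and the isPartOf key (named after the schema:isPartOf property mentioned above) are assumptions made for illustration only.

```python
def tile_subsets(bbox, rows, cols, base_uri):
    """Split a bounding box into rows x cols tiles, each with its own URI.

    bbox is (min_x, min_y, max_x, max_y); tiles are numbered from the
    lower-left corner, row by row.
    """
    min_x, min_y, max_x, max_y = bbox
    width = (max_x - min_x) / cols
    height = (max_y - min_y) / rows
    tiles = []
    for row in range(rows):
        for col in range(cols):
            tiles.append({
                "uri": f"{base_uri}/tiles/{row}/{col}",
                "isPartOf": base_uri,  # relationship to the complete dataset
                "bbox": (
                    min_x + col * width,
                    min_y + row * height,
                    min_x + (col + 1) * width,
                    min_y + (row + 1) * height,
                ),
            })
    return tiles


# Hypothetical coverage spanning 8 x 4 degrees, cut into 2-degree tiles.
COVERAGE = "http://example.org/coverage/elevation"
tiles = tile_subsets((0.0, 50.0, 8.0, 54.0), rows=2, cols=4, base_uri=COVERAGE)
```

Each tile URI can then be served as a static resource in response to an HTTP GET request, so no query processing is needed on the server.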
The use of APIs to access data is covered in [DWBP] by the following best practices:
For spatial data, SDIs have long been used to provide generalized access to spatial data via Web services, typically using open standard specifications from the Open Geospatial Consortium (OGC). In traditional SDIs, these web services, such as Web Feature Service [WFS], were XML based and difficult for non-expert users. These OGC standards have not seen widespread adoption beyond the geospatial expert community. This has changed with the release of a number of resource-oriented OGC APIs (e.g. OGC API - Features [OAF1]), which are aligned to generally accepted patterns and practices in the web community, and specifically to the recommendations described in this Best Practice document.
In addition, commercial offerings for publishing spatial data on the Web often provide access via product-specific APIs, too. These APIs are typically not restricted to HTTP-based Web service APIs in the sense of [DWBP] Best Practice 24: Use Web Standards as the foundation of APIs, but include APIs targeted at a specific programming language, for example, JavaScript.
In the list of options above, the third option (Bespoke API) is included because sharing spatial data on the Web using the first two options (bulk download or generalized APIs) may not be sufficient for reaching application developers. Reasons for this include:
Sharing spatial data on the Web using a spatial data API based on modern OGC API standards, which are based on Spatial Data on the Web Best Practices, is often sufficient for reaching application developers. It provides convenience to developers of the targeted applications, because the API designer has thought about the needs of those developers when consuming the spatial data shared via the API.
If you have a specific type of application in mind for your data, tailor a spatial data access API to meet that goal.
Providing access to spatial data via bulk download would be too complex for application developers with relatively simple requirements, if the spatial data is complex to understand or too large to handle in a Web application. Providing access via traditional SDIs is not recommended because they are often not easy to understand and the "Time to First Successful Call" (see [DWBP] Best Practice 25: Provide complete documentation for your API) may be too high for application developers. Convenience APIs are tailored to meet a specific goal: enabling a user to engage with complex data structures using (a set of) simple queries, including spatial search.
The API provides a coherent set of queries and operations, including spatial ones, that help users get working with the data quickly to achieve common tasks. The API provides both machine-readable data and human-readable HTML markup. The human-readable markup will also support search engines' Web crawlers to enable indexing of spatial data.
The API should:
The bulk of these recommendations can be satisfied by using OGC API building blocks. For example, if it is convenient for users to be able to search for spatial data using a bounding box, the OGC API Features Bounding Box building block can be incorporated into the API.
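To give a feel for what a bounding box search involves on the server side, here is a simplified Python sketch that parses a bbox query parameter of the form min_x,min_y,max_x,max_y and filters GeoJSON point features against it. It is a sketch under stated assumptions rather than an implementation of the OGC building block: it handles only four-number boxes in longitude/latitude, ignores boxes that cross the antimeridian, and supports point geometries only. The function names are illustrative; the coordinates are taken from examples elsewhere in this document.

```python
def parse_bbox(value):
    """Parse 'min_x,min_y,max_x,max_y' into a tuple of four floats."""
    parts = [float(part) for part in value.split(",")]
    if len(parts) != 4:
        raise ValueError("bbox needs exactly four comma-separated numbers")
    min_x, min_y, max_x, max_y = parts
    if min_x > max_x or min_y > max_y:
        raise ValueError("bbox minimum exceeds maximum")
    return (min_x, min_y, max_x, max_y)


def point_in_bbox(lon, lat, bbox):
    """True if the point lies inside or on the edge of the box."""
    min_x, min_y, max_x, max_y = bbox
    return min_x <= lon <= max_x and min_y <= lat <= max_y


def filter_features(features, bbox_param):
    """Keep only the point features that fall within the requested box."""
    bbox = parse_bbox(bbox_param)
    selected = []
    for feature in features:
        lon, lat = feature["geometry"]["coordinates"]
        if point_in_bbox(lon, lat, bbox):
            selected.append(feature)
    return selected


def point_feature(name, lon, lat):
    """Build a minimal GeoJSON point feature."""
    return {
        "type": "Feature",
        "properties": {"name": name},
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
    }


features = [
    point_feature("Leliesluis bridge", 4.88435, 52.37608),
    point_feature("Monument 32478", 4.90005196383485, 52.3741456950321),
    point_feature("Deddington monitoring point", 147.568517, -41.814935),
]
in_amsterdam = filter_features(features, "4.85,52.35,4.95,52.40")
names = [feature["properties"]["name"] for feature in in_amsterdam]
```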
The Environment Agency Bathing Water Quality API is implemented using Epimorphics' ELDA implementation of the Linked Data API and enables configured queries against (general) SPARQL [SPARQL11-OVERVIEW] endpoints to be exposed as RESTful Web services.
Use of OpenSearch to find Spatial Things. For spatial or temporal searches use the OGC Geo and Temporal extensions.
The APIs support both textual and spatial searches.
In a White Paper about open geospatial APIs [OGC-API-WP], the Open Geospatial Consortium (OGC) has defined the concept of the "OGC API Essentials": a set of items defined in OGC standards and other open standards that are reusable modules for use in geospatial APIs. The White Paper provides an initial list and many of the identified standards are mentioned in this document. Reuse of standardized building blocks improves consistency and interoperability across APIs. It is recommended to consider the OGC API Essentials, and consult the OGC API Building Blocks Register when defining an API to access spatial data.
One such essential is a set of well-known spatial predicates for use in queries to select Spatial Things based on their geometry. Most commonly supported is the following set: equal, disjoint, touches, within, overlaps, crosses, intersects, contains. These predicates were originally defined in [SIMPLE-FEATURES], but are also supported by [GeoSPARQL] and others. For more information about the definition of the predicates, see [SIMPLE-FEATURES].
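The meaning of most of these predicates can be illustrated with a deliberately simplified Python sketch that works on axis-aligned rectangles with non-zero area. This restriction is an assumption made to keep the example short; real implementations evaluate the predicates for arbitrary points, lines and polygons as defined in [SIMPLE-FEATURES]. The crosses predicate is omitted because it does not apply to a pair of areas. The Rect type and the sample rectangles are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle with non-zero area."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def intersects(a, b):
    """The rectangles share at least one point (boundaries included)."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def disjoint(a, b):
    """The rectangles have no point in common."""
    return not intersects(a, b)


def interiors_intersect(a, b):
    """The rectangles share interior points, not just boundary points."""
    return (a.min_x < b.max_x and b.min_x < a.max_x
            and a.min_y < b.max_y and b.min_y < a.max_y)


def touches(a, b):
    """Boundaries meet but interiors do not overlap."""
    return intersects(a, b) and not interiors_intersect(a, b)


def within(a, b):
    """Every point of a lies in b."""
    return (b.min_x <= a.min_x and a.max_x <= b.max_x
            and b.min_y <= a.min_y and a.max_y <= b.max_y)


def contains(a, b):
    """Every point of b lies in a."""
    return within(b, a)


def equals(a, b):
    """Both rectangles cover exactly the same area."""
    return a == b


def overlaps(a, b):
    """Interiors intersect, but neither rectangle contains the other."""
    return interiors_intersect(a, b) and not within(a, b) and not within(b, a)


# Hypothetical rectangles standing in for a district and nearby areas.
district = Rect(0, 0, 4, 4)
ward = Rect(1, 1, 2, 2)
neighbour = Rect(4, 0, 6, 4)
straddling = Rect(3, 3, 5, 5)
remote = Rect(10, 10, 11, 11)
```

With these definitions, the ward is within the district, the neighbour touches it along a shared edge, the straddling rectangle overlaps it, and the remote rectangle is disjoint from it.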
If the data is already published in a traditional Spatial Data Infrastructure, it is possible to put OGC API facades on top of WxS as a transitional approach. However, for an OGC Web API in production, it is recommended to directly access the data source, typically some database.
See the "How to test" sections in [DWBP] Best Practice 23: Make data available through an API, [DWBP] Best Practice 24: Use Web Standards as the foundation of APIs and [DWBP] Best Practice 25: Provide complete documentation for your API.
Relevant requirements: R-Compatibility, R-LightweightAPI, R-SpatialOperators, R-ReferenceDataChunks.
If you expose spatial data in various CRSs via an API or other data access endpoint, offer a way for users to find out which CRSs are available, to do requests, and to access geometries in the CRS of their choice.
It is often useful to make geometries available in different CRSs. Best Practice 7: Choose coordinate reference systems to suit your user's applications describes why this is a good idea as well as how to decide which CRSs to provide. Best Practice 8: State how coordinate values are encoded explains how the CRS of geometries should be made known. It follows that the default CRS that is offered should be WGS-84; and further that users should be able to find out which other CRSs are available and access geometries in the CRS of their choice.
The endpoint allows the discovery of the supported CRSs and provides the ability to access geometries in the CRS of the user's choice.
It is generally recommended to limit the number of supported CRSs in a data dissemination endpoint for clarity. Only support CRSs that make sense for the data. If a CRS doesn't cover the data, do not support it.
Offering geospatial data in different CRSs in practice means the data needs to be transformed from the original CRS to the other supported CRSs. Database software and libraries for most programming languages are available which can do this. Reprojection can be done in advance and stored, or calculated at the time of a request. If reprojecting on request, since geographic data can be large and complex, it is recommended to cache the converted data to eliminate the need to reproject the same data more than once.
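The caching recommendation can be sketched as follows in Python. The example reprojects a point from longitude/latitude to spherical Web Mercator (EPSG:3857) using the well-known closed-form formula and memoizes the result per feature and target CRS with functools.lru_cache. The in-memory feature store, the short CRS labels and the function names are hypothetical; a production service would typically delegate the transformation to a projection library or to the database and might cache whole responses instead.

```python
import math
from functools import lru_cache

EARTH_RADIUS_M = 6378137.0  # WGS 84 semi-major axis, as used by Web Mercator


def to_web_mercator(lon, lat):
    """Project lon/lat in degrees to spherical Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat) / 2))
    return (x, y)


# Hypothetical store of source geometries in longitude/latitude order.
SOURCE_POINTS = {"leliesluis-bridge": (4.88435, 52.37608)}

# Counts how often the (potentially expensive) transformation really runs.
TRANSFORM_CALLS = {"count": 0}


@lru_cache(maxsize=1024)
def reprojected(feature_id, target_crs):
    """Return the feature's coordinates in the target CRS, cached per request key."""
    TRANSFORM_CALLS["count"] += 1
    lon, lat = SOURCE_POINTS[feature_id]
    if target_crs == "CRS84":
        return (lon, lat)
    if target_crs == "EPSG:3857":
        return to_web_mercator(lon, lat)
    raise ValueError(f"unsupported CRS: {target_crs}")


first = reprojected("leliesluis-bridge", "EPSG:3857")
second = reprojected("leliesluis-bridge", "EPSG:3857")  # served from the cache
```

If the source data changes, the cache has to be invalidated (for instance with reprojected.cache_clear()), otherwise stale coordinates would be served.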
For Web APIs, CRS support should be offered in conformance to the OGC API building blocks related to CRS [OAF2]. These building blocks can be supported in any Web API:
The crs property of a collection, which contains the identifiers for the list of CRSs supported by the server for that collection.
The bbox-crs parameter, in order to do bounding box queries in one of the supported CRSs.
The crs parameter, to request spatial features in one of the supported CRSs. If the crs parameter is absent, spatial features are returned in the default CRS, which is WGS-84 (with one CRS identifier for coordinates without ellipsoidal height and another for coordinates with ellipsoidal height).

{
  "links": [
    { "href": "/collections", "rel": "self", "type": "application/json", "title": "this document" }
  ],
  "crs": [ "", "", "", "", "" ],
  "collections": [
    {
      "id": "Inspire_RCE rce_inspire_points",
      "title": "Inspire_RCE rce_inspire_points",
      "description": "",
      "extent": {
        "spatial": {
          "bbox": [ [ 13854, 306993.058008078, 277502.058333333, 617910 ] ],
          "crs": ""
        }
      },
      "itemType": "feature",
      "links": [
        { "href": "/collections/Inspire_RCE rce_inspire_points/items", "rel": "items", "type": "application/geo+json", "title": "Inspire_RCE rce_inspire_points: items" }
      ],
      "crs": [ "#/crs" ]
    }
  ]
}
curl -X 'GET' \
  '' \
  -H 'accept: application/json'

Response body:

{
  "links": [
    { "href": "collections/Inspire_RCE rce_inspire_points/items/1", "rel": "self", "type": "application/json", "title": "this document" },
    { "href": "collections/Inspire_RCE rce_inspire_points", "rel": "collection", "type": "application/json", "title": "the collection containing this feature" }
  ],
  "feature": {
    "id": "Inspire_RCE rce_inspire_points.1",
    "bbox": [ 4.90005196383485, 52.3741456950321, 4.90005196383485, 52.3741456950321 ],
    "geometry": { "type": "Point", "coordinates": [ 4.90005196383485, 52.3741456950321 ] },
    "geometry_name": "geom",
    "properties": {
      "id": 1,
      "fid": "1",
      "localid": "32478",
      "namespace": "nlps-rijksmonumenten",
      "versionid": "2021-11-30T15:36:47Z",
      "legalfoundationdate": "2021-11-30T15:36:47Z",
      "ci_citation": "",
      "designationscheme": "",
      "disignation": "onroerend gebouwd",
      "percentageunderdesignation": "100",
      "language": "nld",
      "text": null,
      "script": "Latn",
      "siteprotectionclassification": "cultural"
    },
    "type": "Feature"
  }
}
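As an illustration of how a client might combine the bbox-crs and crs parameters described above, the following Python sketch builds an items request URL. The base URL, the collection identifier and the EPSG URI in the usage note are assumptions chosen for illustration. Only the parameter names bbox, bbox-crs and crs come from the building blocks discussed here.

```python
from urllib.parse import quote, urlencode


def build_items_url(base_url, collection_id, bbox=None, bbox_crs=None, crs=None):
    """Build a request URL for the items of a collection (hypothetical helper).

    bbox     -- four numbers: min x, min y, max x, max y
    bbox_crs -- URI of the CRS in which the bbox coordinates are expressed
    crs      -- URI of the CRS in which geometries should be returned
    """
    params = {}
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    if bbox_crs is not None:
        params["bbox-crs"] = bbox_crs
    if crs is not None:
        params["crs"] = crs
    # Percent-encode the collection identifier, which may contain spaces.
    path = f"{base_url.rstrip('/')}/collections/{quote(collection_id)}/items"
    return f"{path}?{urlencode(params)}" if params else path
```

For example, `build_items_url("https://example.org/api", "monuments", bbox=(13854, 306993, 277502, 617910), bbox_crs="http://www.opengis.net/def/crs/EPSG/0/28992", crs="http://www.opengis.net/def/crs/EPSG/0/28992")` asks for features inside a bounding box given in a projected CRS and returned in that same CRS. If both CRS parameters are omitted, the server falls back to its default CRS.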
In linked data endpoints, [GeoSPARQL] can be used to support user-requested CRSs.
In [GeoSPARQL] the getSRID function returns the spatial reference system of a geometry, thus making it possible to request a specific CRS at a (Geo)SPARQL endpoint.
SELECT ?geo_wkt WHERE {
  ?geo geo:asWKT ?geo_wkt .
  FILTER(geof:getSRID(?geo_wkt) = <>)
}
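To show what such a filter does, the following Python sketch mimics the behaviour of getSRID on the client side. It assumes the GeoSPARQL convention that a WKT literal may start with a CRS URI in angle brackets and that the CRS84 URI applies when the prefix is absent. The function names are hypothetical and do not belong to any GeoSPARQL library.

```python
import re

# Assumption: default CRS of a GeoSPARQL WKT literal without an explicit prefix.
DEFAULT_CRS = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

_SRID_PREFIX = re.compile(r"^\s*<([^>]+)>\s*")


def get_srid(wkt_literal):
    """Return the CRS URI that prefixes a WKT literal, or the default CRS."""
    match = _SRID_PREFIX.match(wkt_literal)
    return match.group(1) if match else DEFAULT_CRS


def filter_by_srid(wkt_literals, srid):
    """Keep only the WKT literals expressed in the requested CRS."""
    return [wkt for wkt in wkt_literals if get_srid(wkt) == srid]
```

Note that this kind of filtering only selects geometries that are already stored in the requested CRS. It does not transform geometries from one CRS to another.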
Check if a test client can discover the supported CRSs, can request spatial things using a bounding box in one of the supported CRSs, and receives the spatial things in the requested CRS.
Relevant requirements: R-MultipleCRSs, R-AvoidCoordinateTransformations, and R-DeterminableCRS.
[DWBP] provides best practices discussing the provision of metadata to support discovery and reuse of data (see [DWBP] section 8.2 Metadata for more details). Providing metadata at the dataset level supports a mode of discovery well aligned with the practices used in Spatial Data Infrastructure (SDI) where a user begins their search for spatial data by submitting a query to a catalog. Once the appropriate dataset has been located, the information provided by the catalog enables the user to find a service end-point from which to access the data itself. This may be as simple as a mechanism to download the entire dataset for local usage, or it may be a rich API enabling users to request only the parts required for their needs. The dataset-level metadata is used by the catalog to match the appropriate dataset(s) with the user's query.
This section includes best practices for including the spatial extent, CRS, and other spatial details of the dataset in the metadata. These are the extra metadata items needed to make spatial datasets both discoverable and usable. A third best practice in this section goes a step further in granularity: exposing spatial data on the Web in such a way that individual entities or "granules" within a dataset can be discovered, evaluated, and utilized.
Quality information is also an important part of spatial metadata, especially for asserting if data is fit for a certain purpose. [DWBP] provides a best practice discussing how the quality of data on the Web should be described (see [DWBP] section 8.5 Data Quality for more details). This section is based on the Data Quality section from [DWBP] and adds a best practice specific for spatial data, which concentrates on the accuracy of the positions in the data: how close are they to the actual positions of the real-world things?
In the Spatial Metadata section, we provided a Best Practice on how to deal with CRS in spatial data on the Web. There is also a clear link between CRS and data quality, because the accuracy of spatial data depends for a large part on the CRS used. This can be seen as conformance of data with a "standard" (in this case, a spatial or temporal reference system). Spatial data quality can be described in this way using different vocabularies. We will provide an example in this section.
For some uses, it may be sufficient to simply state conformance to a published specification:
a:Dataset a dcat:Dataset ;
  dcterms:conformsTo <> .

<> a dcterms:Standard , foaf:Document ;
  dcterms:title "COMMISSION REGULATION (EU) No 1089/2010 of 23 November 2010 implementing Directive 2007/2/EC of the European Parliament and of the Council as regards interoperability of spatial data sets and services"@en ;
  dcterms:issued "2010-12-08"^^xsd:date .
However, that specification makes no statement about the positional accuracy of the data, so on its own, it is only a useful quality statement for users to whom positional accuracy is not that important.
The description of datasets that have Spatial Things should include explicit metadata about their spatial extent, coverage, and representation.
Since location is such a powerful organizing principle, it is usually necessary to specifically describe the spatial details and nature of a dataset to discover it as well as to determine its fitness for use. This information is used, for example, by SDI catalog services that offer spatial querying to find data, but also by users to understand the nature of the dataset. In some cases, for example when dealing with crowd-sourced data, provenance information (how the dataset came to be in its published form and with what quality) is important as well.
The first level of spatial description is the spatial extent of the dataset, the area of the world that the dataset describes. This often suffices for initial discovery, but further levels of description are needed to evaluate a dataset for use. These include the dataset spatial coverage (continuity, resolution, properties) as well as the spatial representation or geometric model (for example, grid coverage, discrete coverage, point cloud, linear network).
Dataset quality measures such as positional accuracy are also important for determining applicability. In the case of datasets whose spatial characteristics vary over their temporal duration, spatial descriptions must include an explicit temporal aspect.
When publishing a dataset, provide as much spatial metadata as necessary, but at least the spatial extent, coverage, and representation. Other examples of spatial metadata include:
In Spatial Data Infrastructures, the accepted standard for describing metadata is [ISO-19115] or profiles thereof.
To provide information about the spatial attributes of the dataset on the Web one can:
Again, use [VOCAB-DCAT-2], but instead of a reference to a named place, use a set of coordinates to specify the boundaries of the area either as a bounding box or a polygon (see Example 19).
Use [GeoDCAT-AP] to specify spatial attributes that are not available in [VOCAB-DCAT-2] (see Example 60 and Example 66).
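When the spatial extent is given as a bounding box, the metadata typically carries it as a geometry literal, for instance in WKT. The following Python sketch, with a hypothetical function name, shows one way to turn four bounding box coordinates into a closed WKT polygon ring. It assumes longitude/latitude axis order and a box that does not cross the antimeridian.

```python
def bbox_to_wkt_polygon(west, south, east, north):
    """Return a WKT POLYGON for a bounding box (longitude/latitude axis order assumed)."""
    if south > north:
        raise ValueError("south must not be greater than north")
    # A polygon ring is closed: the first corner is repeated at the end.
    corners = [
        (west, south),
        (east, south),
        (east, north),
        (west, north),
        (west, south),
    ]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON(({ring}))"
```

The resulting string can be used as the value of a bounding box or geometry property in the dataset description, together with a statement of the CRS in which the coordinates are expressed.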
The experimental GeoDCAT-AP API allows data publishers to serve [ISO-19115] records in different RDF serialization formats, [HTML-RDFa] included, on top of a geospatial catalog, by using the standard [CSW] query interface, and supporting HTTP content negotiation.
[GeoDCAT-AP] models this information by using adms:representationTechnique [VOCAB-ADMS], with URIs corresponding to the items in the appropriate ISO 19115 code list.
The following [TURTLE] snippet provides an example of the [GeoDCAT-AP] specification of two datasets using, respectively, a vector and a grid spatial representation type. The URIs in the example, denoting the spatial representation type, are taken from the corresponding code list of the INSPIRE Registry.
a:Dataset a dcat:Dataset ;
  adms:representationTechnique <> .

another:Dataset a dcat:Dataset ;
  adms:representationTechnique <> .
Quality, trust and density levels of crowd-sourced data vary, and it is important that the data is provided with contextual information that helps people judge the probable completeness and accuracy of the observations. Human-readable and machine-readable metadata should be provided with crowd-sourced data.
An example of crowd-sourced data that is being put to use is the Twitter hashtag #uksnow for snowfall observations, which are shown on the #uksnow Map. In this case, the Twitter accounts from which observations originate are shown, giving users an idea of the source and its trustworthiness.
Check if the spatial metadata for the dataset itself includes the overall features of the dataset in a human-readable format.
Check if the descriptive spatial metadata is available in a valid machine-readable format.
Relevant requirements: R-Discoverability, R-Compatibility, R-BoundingBoxCentroid, R-Crawlability, R-SpatialMetadata and R-Provenance.
Accuracy of spatial data should be specified in machine-interpretable and human-readable form.
So far none of the implementation reports (GNAF, NRW, PDOK) implement BP 14: Describe the positional accuracy of spatial data.
Notably it was felt in the GNAF implementation that there was a need for some form of code list to facilitate the discovery of meaning - I believe this view was also shared by @andrea-perego.
Currently there is no detailed discussion of a code list approach in the best practices. This could be an enhancement needed to facilitate implementation.
The amount of detail that is provided in spatial data and the resolution of the data can vary. No measurement system is infinitely precise and in some cases the spatial data can be intentionally generalized (e.g. merging entities, reducing the details, and aggregation of the data) [Veregin]. Some spatial data applications, such as aircraft navigation, require highly accurate data. For others, such as human navigation, a horizontal accuracy of a few meters is good enough. For yet others, such as overlaying weather forecasts on a map, the map is only giving a general indication of place. If the positional accuracy is published together with the data, the user can determine whether it is appropriate to use for their application. Potentially, this makes existing data more reusable.
It is important to understand the difference between precision and accuracy. Seven decimal places of a latitude degree correspond to about one centimeter. Whatever the precision of the specified coordinates, the accuracy of positioning on the actual earth's surface using WGS 84 will only approach about a meter horizontally and may have apparent errors of up to 100 meters vertically, because of assumptions about reference systems, tectonic plate movements and which definition of the earth's 'surface' is used.
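The relation between decimal places and ground distance can be checked with a few lines of Python. The sketch below assumes a mean length of about 111 133 m for one degree of latitude, which is an approximation since the exact value varies slightly with latitude. The function name is hypothetical.

```python
# Assumption: mean length of one degree of latitude, in metres (approximate).
METRES_PER_DEGREE_LATITUDE = 111_133.0


def latitude_precision_in_metres(decimal_places):
    """Ground distance represented by the last decimal place of a latitude value."""
    return METRES_PER_DEGREE_LATITUDE * 10.0 ** (-decimal_places)
```

With seven decimal places the result is roughly 0.011 m, in line with the "about one centimeter" stated above. This says how finely a position is written down (precision), not how close it is to the true position (accuracy).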
For many uses, the positional accuracy of the data is an important aspect of assessing its fitness for purpose (quality). As with other data quality statements, this can be a quantitative measure, a statement of conformance to a standard or policy, or an assertion or report of fitness for a particular purpose.
Describe the accuracy of spatial data in a way that is understandable for humans.
In addition, describe the accuracy of spatial data in a machine-readable format. [VOCAB-DQV] is such a format. It is a vocabulary for describing data quality, including the details of quality metrics and measurements.
For observed (measured) datasets, it is possible to make specific quantitative statements about positional accuracy, based on knowledge of the equipment used to make the observations, and any processing carried out.
For coverages, the sampling distance is an effective way of indicating the amount of detail in the dataset; this is one of the meanings of the term "resolution". Alternatively, samples of the data could be independently checked against the real world, and the results of that check reported. Either way, this is usually a statement of absolute positional accuracy, but for some uses, relative positional accuracy is more important.
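One common way of reporting the result of such a check is the root-mean-square error (RMSE) of the horizontal offsets between published and independently surveyed positions. The Python sketch below is only an illustration of that calculation. It assumes both lists hold matching (x, y) pairs in the same projected CRS with metre units, and the function name is hypothetical.

```python
import math


def horizontal_rmse(published, reference):
    """Root-mean-square horizontal error between matching (x, y) positions."""
    if not published or len(published) != len(reference):
        raise ValueError("need two non-empty position lists of equal length")
    squared_errors = [
        (px - rx) ** 2 + (py - ry) ** 2
        for (px, py), (rx, ry) in zip(published, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

The resulting number, together with a description of the method used, can then be published as a quantitative statement of absolute positional accuracy.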
Positional accuracy measurements, whether observed or asserted based on process, can be given using QualityMeasurement.
For modelled datasets, for example in planning and construction, there is no 'real world' against which to assess the positional accuracy — but relative positional accuracy can still be stated.
For many uses, a statement of the amount of detail provided is sufficient to assess fitness for purpose; examples include "level of detail" (building models), "navigational purpose" (marine navigation), "equivalent scale" or "zoom level" (cartography). Sometimes, this is expressed as if it were a statement of positional accuracy.
These can be expressed in the same way as for non-spatial data; for example, using the QualityAnnotation and QualityPolicy statements of [VOCAB-DQV].
The following example shows how [VOCAB-DQV] can express conformance to a specified positional accuracy:

a:Dataset a dcat:Dataset ;
  dcterms:conformsTo <> .

<> a dcterms:Standard , foaf:Document ;
  dcterms:title "IHO Standards for Hydrographic Surveys"@en ;
  dcterms:issued "2008-02-01"^^xsd:date ;
  .
The following example shows how [VOCAB-DQV] can express the amount of detail in a coverage dataset:

:myDataset a dcat:Dataset ;
  dqv:hasQualityMeasurement :myDatasetPrecision, :myDatasetAccuracy ;
  .

:myDatasetPrecision a dqv:QualityMeasurement ;
  dqv:isMeasurementOf :spatialResolutionAsDistance ;
  dqv:value "1000"^^xsd:decimal ;
  sdmx-attribute:unitMeasure <> ;
  .

:spatialResolutionAsDistance a dqv:Metric ;
  skos:definition "Spatial resolution of a dataset expressed as distance"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension dqv:precision ;
  .

:myDatasetAccuracy a dqv:QualityMeasurement ;
  dqv:isMeasurementOf :spatialAccuracy ;
  dqv:value "98.2"^^xsd:decimal ;
  sdmx-attribute:unitMeasure <> .

:spatialAccuracy a dqv:Metric ;
  skos:definition "Percentage of spatial elements that are found accurate according to methodology XYZ"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension ldqd:semanticAccuracy ;
  .
In its original version, Example 64 for some reason did not include the statements describing :myDatasetAccuracy, which are available from the reference [VOCAB-DQV] examples in the relevant section.
To be decided if they should be kept, revised, or dropped.
This example was taken from [VOCAB-DQV]. For more examples of expressing spatial data precision and accuracy see [VOCAB-DQV], Express dataset precision and accuracy.
The following paragraphs have been added in order to update the BP with respect to [VOCAB-DCAT-2] and [GeoDCAT-AP-20201223].
The [VOCAB-DQV] approach is also recommended in [VOCAB-DCAT-2] as a general solution to specify precision and accuracy. However, in order to address the most common case of spatial resolution (i.e., as horizontal ground distance), [VOCAB-DCAT-2] also defines a specific property, dcat:spatialResolutionInMeters. By using this property, Example 64 can be rewritten as follows:

:myDataset a dcat:Dataset ;
  dqv:hasQualityMeasurement :myDatasetAccuracy ;
  dcat:spatialResolutionInMeters "1000"^^xsd:decimal ;
  .

:myDatasetAccuracy a dqv:QualityMeasurement ;
  dqv:isMeasurementOf :spatialAccuracy ;
  dqv:value "98.2"^^xsd:decimal ;
  sdmx-attribute:unitMeasure <> ;
  .

:spatialAccuracy a dqv:Metric ;
  skos:definition "Percentage of spatial elements that are found accurate according to methodology XYZ"@en ;
  dqv:expectedDataType xsd:decimal ;
  dqv:inDimension ldqd:semanticAccuracy ;
  .
Finally, [GeoDCAT-AP], building upon the [VOCAB-DQV] approach, defines specific individuals for the different types of spatial resolution in [ISO-19115] and [ISO-19115-1-2014] — namely:
) can be used if the distance isn't expressed in metres. These are illustrated in the following example:

resource:a12345 dqv:hasQualityMeasurement [
    a dqv:QualityMeasurement ;
    dqv:isMeasurementOf geodcatap:spatialResolutionAsScale ;
    dqv:value "0.000001"^^xsd:decimal
  ] ;
  .

resource:c34567 dqv:hasQualityMeasurement [
    a dqv:QualityMeasurement ;
    sdmx-attribute:unitMeasure <> ;
    dqv:isMeasurementOf geodcatap:spatialResolutionAsAngularDistance ;
    dqv:value "0.02"^^xsd:decimal
  ] ;
  .

resource:d45678 dqv:hasQualityMeasurement [
    a dqv:QualityMeasurement ;
    sdmx-attribute:unitMeasure <> ;
    dqv:isMeasurementOf geodcatap:spatialResolutionAsVerticalDistance ;
    dqv:value "10.0"^^xsd:decimal
  ] ;
  .

resource:b23456 dqv:hasQualityMeasurement [
    a dqv:QualityMeasurement ;
    sdmx-attribute:unitMeasure
Check if the metadata contains at least one human-readable and one machine-readable statement regarding positional accuracy.
Check that the kind of statement is relevant to the kind of data, e.g. not an absolute positional accuracy measure for Atlantis.
Checking whether the accuracy statement is actually correct is beyond the scope of this best practice.
Relevant requirements: R-MachineToMachine, R-QualityPerSample.
Data ethics is a topic that has gained a lot of interest over the last few years, leading to the creation of frameworks, codes and guidelines on the topic. In the Data on the Web Best Practices [DWBP], which largely predates this growing interest, ethical concerns are mentioned once, in section 8.13 Data Enrichment. Ethical concerns may arise when data are "enhanced, refined or otherwise improved". For example, results or statistical outcomes may be distorted, privacy issues may arise when datasets are combined, and so on. In the case of spatial data, privacy concerns are especially obvious in the case of location tracking of individuals. It is therefore important to act in a responsible way when dealing with spatial data, especially the locations of individuals and mobility data.
Practitioners carry a certain responsibility to ensure that their publications of spatial data on the Web, and the tools they develop that make it easy for others to work with spatial data, are ethical.
Spatial data may be seen as a fingerprint: for an individual, every combination of their location in space, time, and theme is unique. The collection and sharing of individuals' spatial data can lead to beneficial insights and services, but it can also compromise citizens' privacy. This, in turn, may make them vulnerable to governmental overreach, tracking, discrimination, unwanted advertisement, and so forth. Hence, spatial data must be handled with due care.
That being said, data ethics is too often presented only as a way to avoid the unacceptable consequences of data misuse. However, acting responsibly is not only necessary out of fear of misuse, but more importantly, to unlock the full potential of spatial data. Users will only contribute and apply spatial data if they trust the systems collecting these data and drawing inferences from them. These data may, in turn, improve the well-being and sustainability of our societies.
Practitioners carefully consider the impact of their interaction with spatial data on primary stakeholders (those impacted by the interaction with spatial data) and society as a whole.
There are many guidelines, principles, and legal frameworks that offer support on how to be a responsible data practitioner. However, very few focus on the unique characteristics of spatial data within the broader realm of ethical use of data. Members of the W3C have published a note to raise awareness of the ethical responsibilities of both practitioners and users of spatial data on the Web. The note illustrates the issues specifically associated with the nature of spatial data and both the benefits and risks of sharing this information implicitly or explicitly on the Web. However, the note is not intended as a list of commandments. It is intended as a conversation starter on how its readers define "responsible use" of spatial data, each in their own role and from their own perspective as the developer, the user, or the legislator interacting with spatial data.
A brief overview of the key pieces of advice the note has for responsible developers:
A possible approach to implementation would be for practitioners to have conversations about these pieces of advice and implement those they align with.
The note offers further insights which can help and support with implementing this particular best practice. It is published under the name "The Responsible Use of Spatial Data" [responsible-use-spatial].
Check if the application follows the guidelines in "The Responsible Use of Spatial Data" [responsible-use-spatial].
Relevant requirements: None
Besides the best practices in this document, which have been observed in real-world applications, this section highlights in-development standards which, in the opinion of the authors, might gain relevance in the near future.
All physical world objects inherently have a geographically-anchored pose. A real object in space can have three components of translation, up and down (z), left and right (x), and forward and backward (y), and three components of rotation: pitch, roll and yaw. Hence the real object has six degrees of freedom.
The combination of position and orientation with 6 degrees of freedom of objects in computer graphics and robotics is usually referred to as the object’s “pose.” Pose can be expressed as being in relation to other objects and/or to the user. Some part of the object must be recognized as the anchor (or origin) of the position. When a pose is defined relative to a geographical frame of reference or coordinate system, it will be called a geographically-anchored pose, or GeoPose for short.
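The six degrees of freedom described above can be pictured as a small data structure. The Python sketch below is purely illustrative: the class name and field names are hypothetical and do not reproduce any encoding defined by the GeoPose SWG. It assumes a position in WGS 84 latitude, longitude and height, and an orientation as yaw, pitch and roll angles in degrees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoPoseSketch:
    """Illustrative 6-degrees-of-freedom pose (not the GeoPose standard encoding)."""

    latitude_deg: float
    longitude_deg: float
    height_m: float
    yaw_deg: float
    pitch_deg: float
    roll_deg: float

    def __post_init__(self):
        # Basic sanity checks on the geographic position.
        if not -90.0 <= self.latitude_deg <= 90.0:
            raise ValueError("latitude must be between -90 and 90 degrees")
        if not -180.0 <= self.longitude_deg <= 180.0:
            raise ValueError("longitude must be between -180 and 180 degrees")

    def degrees_of_freedom(self):
        """Return the three translation and three rotation components."""
        return (
            self.latitude_deg,
            self.longitude_deg,
            self.height_m,
            self.yaw_deg,
            self.pitch_deg,
            self.roll_deg,
        )
```

A camera standing in a city and looking due east with a level horizon could then be written as `GeoPoseSketch(52.37, 4.90, 2.0, 90.0, 0.0, 0.0)`, assuming yaw is measured clockwise from north.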
When a person seeks to view spatial data on the web, they may wish to see information or a map in position and oriented with respect to the observer (or another view point). Providing the view point's location and orientation with respect to a desired person, place or a thing (which also has a GeoPose) will permit the resulting perspective to accurately reflect the observer and the focus of attention in their respective positions and orientations.
Unfortunately, there is no standard for universally expressing the geographically-anchored pose in a manner that can be interpreted and used by modern computing platforms.
The purpose of the GeoPose SWG is to develop a standard for geographically-anchored pose (GeoPose) with 6 degrees of freedom referenced to one or more standardized Coordinate Reference Systems (CRSs).
In addition to the standard, the GeoPose SWG is developing guides for reviewers and implementers of GeoPose.
For more information, the GeoPose SWG description is found here. The draft specification and all work is being conducted in the open on the GeoPose SWG GitHub repository. The GeoPose web site will be published shortly.
A “point of interest” (PoI) is a location for which information is available. A PoI can be as simple as a set of coordinates, a name, and a unique identifier, or more complex, such as a three-dimensional model of a building with names in multiple languages, information about opening and closing hours, and a civic address.
There are numerous use cases for PoI. They include location-based social networking, games, assessments of gaps or needs, mapping and navigation systems, etc.
End users may search databases of PoIs to identify properties for sale, financial institutions, accommodations, retail shops, or transportation. There are also numerous ways the public sector can use PoI data. For example, government agencies can provide information to citizens about services and locations by publishing their PoI datasets.
Unfortunately, there is no standard for universally expressing information about a point of interest in a manner that can be interpreted and used by modern computing platforms.
The purpose of the PoI SWG is to develop the PoI standard. The first goal of the PoI standard is interoperable PoI data and systems. Complying with this standard will permit systems that populate a PoI database regardless of authoring platform or application, to do so without transcoding, delays or costs that are incurred when data is compiled from many different contributors using proprietary formats.
Further, with a standard encoding, PoIs can be stored in open, non-proprietary formats and technology providers can focus on their respective competitive advantages.
When PoI publishers support this PoI standard, they will be able to make available PoI data and to transmit the data to the applications of the user’s choice, regardless of devices, and thereby focus on the value of the data, not the development and maintenance of proprietary applications or interfaces.
Furthermore, when data are encoded in compliance with the PoI standard, third parties are able to create, interact with, and query across platforms from multiple, diverse sources, to compare, merge, and, at the end of life cycle, to archive, PoI without loss of accuracy, metadata or value.
Finally, as a result of higher confidence in PoI data quality, validity, and security, a widely-adopted PoI standard will increase the use of and trust in PoI, in general.
More information is available on the PoI SWG description page and on the PoI SWG GitHub repository.
The Maps for HTML Community Group is an open, free public forum of stakeholders who are interested in integrating maps and location technologies into browsers via Hypertext Markup Language (HTML) and related Web standards, especially including Cascading Style Sheets (CSS), the Document Object Model (DOM) and JavaScript. The community works in an open public Web space provided by the World Wide Web Consortium (W3C), so that all interested parties have equal opportunity to contribute to and comment on the objectives of the group.
The community’s interest is in integrating maps and location information technology into browsers, to simplify and standardize an accessible, performant, interoperable and privacy-enhancing Web map experience that can be created and used by persons and organizations of all abilities and with diverse needs, globally.
It is through the integration of geospatial coordinate semantics into browser engines, that we imagine unlocking the potential of an interoperable geospatial Web that allows us to virtually describe, document, search, navigate and save, our physical world.
On the one hand, there are well-established spatial semantics and standards, such as simple features, spatial referencing by coordinates, and map, tile, and feature services. On the other hand, there are the civilization-critical open Web standards, including HTML, HTTP, CSS, DOM, and JavaScript.
The keystone that enables integration of location information into browser engines and Web standards is the application of the Web architectural style to spatial semantics, which necessitates hypermedia controls, ranging from maps and layers to simple links between spatial resource representations.
The central product of the community is our proposal to extend the HTML language with a small set of new and extended existing HTML elements. This extended subset of HTML is called “Map Markup Language” (MapML). The details of MapML are subject to change, as we perform ongoing research into best practices for usability, accessibility, performance and so on, for Web maps.
Our work is continuously reflected in updates to several public GitHub repositories. The repositories below organize our work and range from Use Cases and Requirements to end-user documentation of it.
The Use Cases and Requirements for Web Maps is a pivotal document; once we can resolve each accepted use case into one or more agreed-upon requirements, this document will be final and will be used to measure the progress of all downstream products.
We intend to convene a formal W3C working group to help develop the draft specification for MapML, which, once it addresses all requirements, will be merged with the HTML Living Standard.
The speculative polyfill source code will implement the MapML specification in parallel to specification development, with elements in a polyfill-appropriate namespace. The logical behavior of the speculative polyfill will be transcribed, tested, and merged into browser source code repositories for Chrome/Blink, Safari/Webkit and Firefox/Gecko.
The Web Platform Tests repository that confirms the function of the speculative polyfill will be refactored to test the interoperability of browser changes resulting from the integration of MapML into the HTML Living Standard.
The end-user documentation of the speculative polyfill will be merged with the Web platform documentation on the Mozilla Developers Network.
Our work will be complete when we have formulated and successfully merged pull requests into the various target repositories.
The best practices described in this best practice document are compiled based on evidence of real-world application, as described in 4.4 Best practice criteria. However, there are several issues that inhibit the use or interoperability of spatial data on the Web, for which no evidence of real-world applied solutions is available. These issues are denoted “gaps in current practice”. In the case of gaps, there might be emerging practice, i.e. a solution that has been theorized for a certain issue and has possibly been experimented on in beta settings, but not in production environments. Gaps and emerging practices in the area of publishing spatial data on the Web are discussed in this section.
The best practices, and also the gaps described in this document, focus on geometry-based spatial data, i.e., vector data. There are other types of spatial data, like coverages and meshes, but we do not discuss those in this section.
Different use cases may require geometries at different levels of accuracy, precision, and size. Best Practice 6: Provide geometries at the right level of accuracy, precision, and size outlines some of the approaches to address this requirement, considering general application scenarios and providing guidance on the criteria to be taken into account for choosing the appropriate technique (e.g., compress geometry data, use compact formats, apply geometry generalization mechanisms). The overall recommendation is to make available multiple representations of geometry data, and to give data consumers the ability to identify those most fit for purpose. A variety of mechanisms can be used to achieve this, such as publishing different geometry representations at different URIs, and accompanying them with a human- and/or machine-readable description of their characteristics (e.g., format, spatial resolution, scale, level of generalization). However, the lack of common practices in this area makes it difficult to provide consistent guidelines on how to publish and access different geometry representations.
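As a small illustration of deriving a lighter geometry representation, the Python sketch below reduces coordinate precision and drops consecutive points that become identical after rounding. It is one simple technique among the many mentioned above, not a recommended generalization algorithm, and the function name is hypothetical.

```python
def reduce_precision(coords, decimals):
    """Round (x, y) pairs and drop consecutive duplicates created by rounding."""
    reduced = []
    for x, y in coords:
        point = (round(x, decimals), round(y, decimals))
        # Skip a point that collapsed onto the previous one.
        if not reduced or reduced[-1] != point:
            reduced.append(point)
    return reduced
```

Each derived representation could then be published at its own URI, accompanied by a description of the number of decimals kept, so that consumers can pick the one that fits their purpose.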
A standardized way of requesting a geometry in a different CRS is described in OGC API Features part 2: Coordinate Reference Systems by Reference [OAF2]. This is done using parameters, which are defined in the standard, and with a Content-Crs response header to tell the client which CRS was used with the geometries in the response.
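A client can read that header to confirm which CRS the returned geometries use. The following Python sketch assumes that the header value is a CRS URI wrapped in angle brackets and that the headers are available as a plain dictionary. Both are assumptions to verify against [OAF2] and the HTTP library in use, and the function name is hypothetical.

```python
def parse_content_crs(headers):
    """Return the CRS URI from a Content-Crs response header, or None if absent."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "content-crs":
            return value.strip().lstrip("<").rstrip(">")
    return None
```

If the function returns a CRS other than the one requested, the client should not interpret the coordinates as if the request had been honoured.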
On a more general level, content negotiation (as recommended in DWBP Best Practice 19: Use content negotiation for serving data available in multiple formats) could be a way to deal with requesting different representations of geometries - be it different data formats, CRSs, levels of accuracy, etc. Content negotiation could be expanded to enable its use for choosing a 'profile' concerning the semantics and structure of the data, such as a data vocabulary. The Dataset Exchange WG (DXWG) aspires to provide a REC for "content negotiation by profile" (see also [RFC6906] "profile" Link Relation Type).
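The basic mechanism of content negotiation by media type can be sketched in a few lines of Python. This is a deliberately simplified illustration: it honours q-values and the `*/*` wildcard but ignores partial wildcards such as `application/*` and other media type parameters, and the function name is hypothetical. Negotiation by profile would need additional request information that is not modelled here.

```python
def choose_media_type(accept_header, available):
    """Pick the available media type the client prefers most, or None."""
    preferences = []
    for part in accept_header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        preferences.append((quality, media_type))
    # Highest q-value first; the sort is stable, so header order breaks ties.
    preferences.sort(key=lambda item: item[0], reverse=True)
    for quality, media_type in preferences:
        if quality <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None
```

A server offering the same geometries as HTML and as GeoJSON could call `choose_media_type(request_accept_header, ["text/html", "application/geo+json"])`, where `request_accept_header` stands for the value of the incoming Accept header, and respond with a "not acceptable" status when the result is `None`.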
Although a large amount of spatial data has been published on the Web, so far, there are few authoritative datasets containing geometrical descriptions of their boundaries. Their number is growing (e.g. at the time of writing there are three authoritative spatial datasets publicly available as linked data in the Netherlands containing topographic, cadastral, and address data), but currently there is no common practice in the sense of the same spatial vocabulary being used by most spatial data publishers. Direct georeferencing of data implies representing coordinates or geometries and associating them to a CRS. This requires vocabularies for geometries and CRSs. The consequence is the lack of a baseline during the mapping process for application developers trying to consume specific incoming data. Datasets describing administrative units, points of interest or postal addresses with their labels and geometries, and identifying these Spatial Things with URIs could be beneficial not only for georeferencing other datasets, but also for interlinking datasets georeferenced by direct and indirect location information.
Currently, no single standardized vocabulary is available that covers all needs. Version 1.0 of the [GeoSPARQL] vocabulary is too limited to provide a good basis, but work to update the [GeoSPARQL] spatial ontology is underway. The first iteration of this update includes the addition of common classes for spatial object collections, and common properties for spatial object size, centroid, bounding box, etc. A companion CRS ontology is also on the way. This work will provide an agreed spatial ontology, i.e. a bridge or common ground between geographical and non-geographical spatial data and between W3C and OGC standards; conformant to the [ISO-19107] abstract model and aligned to existing available ontologies such as [GeoSPARQL] 1.0, the [W3C-BASIC-GEO] vocabulary, [NeoGeo] and the ISA Programme Location Core Vocabulary [LOCN]. Still, as GeoSPARQL 1.1 Annex E describes, at least 15 different vocabularies to encode geometries on the web exist. While this annex provides a much needed way to relate and interlink data in these different vocabularies, the adoption of an improved GeoSPARQL vocabulary will take a significant amount of time.
The ideal vocabulary would define basic semantics for the concept of a reference system for spatial coordinates, a basic datatype, or basic datatypes, for geometry, how geometry and real world objects are related and how different versions of geometries for a single real world object can be distinguished. For example, it makes sense to publish different geometric representations of a spatial object that can be used for different purposes. The same object could be modelled as a point, a 2D polygon or a 3D polygon. The polygons could have different versions with different resolutions (generalization levels). And all those different geometries could be published with different coordinate reference systems. Thus, the vocabulary would provide a foundation for harmonization of the many different geometry encodings that exist today.
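The idea of several geometry versions for one real world object can be sketched as a small data structure. The following is only an illustration under stated assumptions: the class, its field names, and the `https://example.org/...` URIs are hypothetical and are not terms from any existing vocabulary.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"


@dataclass(frozen=True)
class GeometryRepresentation:
    uri: str             # hypothetical URI where this representation is published
    geometry_type: str   # e.g. "Point" or "Polygon"
    resolution_m: float  # spatial resolution in metres; smaller means more detail
    crs: str             # URI of the coordinate reference system


def best_fit(candidates: Iterable[GeometryRepresentation], geometry_type: str,
             crs: str, max_resolution_m: float) -> Optional[GeometryRepresentation]:
    """Pick the coarsest representation that is still detailed enough."""
    suitable = [c for c in candidates
                if c.geometry_type == geometry_type
                and c.crs == crs
                and c.resolution_m <= max_resolution_m]
    return max(suitable, key=lambda c: c.resolution_m, default=None)


# Three published representations of one hypothetical Spatial Thing.
representations = [
    GeometryRepresentation("https://example.org/id/town/1/geom/point", "Point", 1000.0, CRS84),
    GeometryRepresentation("https://example.org/id/town/1/geom/coarse", "Polygon", 100.0, CRS84),
    GeometryRepresentation("https://example.org/id/town/1/geom/detailed", "Polygon", 1.0, CRS84),
]
overview = best_fit(representations, "Polygon", CRS84, max_resolution_m=250.0)
```

Choosing the coarsest acceptable geometry keeps payloads small; a consumer that needs survey-grade detail would simply pass a smaller `max_resolution_m`.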
Finally, a spatial data vocabulary would need to be validatable. For Semantic Web standards, this is usually achieved by defining SHACL shapes that validate the graph structure defined by the spatial vocabulary and its valid datatypes. For complete validation, the contents of geometry literals also need to be considered, which at the time of writing is not possible using SHACL alone and requires the use of GIS software libraries.
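To show what checking the contents of a geometry literal involves, here is a deliberately narrow sketch that handles only 2D `POINT` literals in the GeoSPARQL WKT style (an optional CRS URI in angle brackets followed by the WKT text, defaulting to CRS84). It is an assumption-laden stand-in for a real GIS library, which would be needed for lines, polygons, and topological validity.

```python
import re

CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<CRS URI>" prefix followed by the WKT text.
LITERAL = re.compile(r"^\s*(?:<(?P<crs>[^>]+)>\s*)?(?P<wkt>.+?)\s*$", re.DOTALL)
NUMBER = r"-?\d+(?:\.\d+)?"
POINT = re.compile(rf"^POINT\s*\(\s*({NUMBER})\s+({NUMBER})\s*\)$", re.IGNORECASE)


def check_point_literal(literal: str) -> tuple:
    """Return (crs, x, y) for a valid 2D POINT literal, else raise ValueError."""
    match = LITERAL.match(literal)
    if not match:
        raise ValueError("empty literal")
    crs = match.group("crs") or CRS84  # no CRS given: assume CRS84
    point = POINT.match(match.group("wkt"))
    if not point:
        raise ValueError(f"not a 2D POINT: {match.group('wkt')!r}")
    x, y = float(point.group(1)), float(point.group(2))
    # For CRS84 the axis order is longitude, latitude, so ranges can be checked.
    if crs == CRS84 and not (-180 <= x <= 180 and -90 <= y <= 90):
        raise ValueError(f"coordinates out of range for CRS84: {x}, {y}")
    return crs, x, y


parsed = check_point_literal("POINT(4.89 52.37)")
```

Range checks like the one above depend on knowing the CRS, which is one reason literal validation cannot be reduced to a datatype pattern.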
Even if all spatial data should become findable directly through search engines, data portals would still remain important hubs for data discovery, for example because the metadata records registered there can be made crawlable. In addition, different data portals can harvest each other's information provided there is consistency in the types and meaning of included information, even if structures and technologies vary. In the eGovernment sector, [VOCAB-DCAT-2] is a standard for dataset metadata publication and harvesting implemented by these portals. Version 1 of DCAT, and therefore its European Union profile, was not good at describing spatial datasets, so [GeoDCAT-AP] extended the EU application profile for spatial data. Some of the ideas of GeoDCAT-AP have been adopted in DCAT2, and GeoDCAT-AP has been updated accordingly. [GeoDCAT-AP-20201223] is mentioned in the "Possible approach" section of several Best Practices in this document. It still adds some properties beyond DCAT2 that are useful in certain cases.
With the goal of sharing spatial metadata, [GeoDCAT-AP-20201223] defined RDF bindings covering the core profile of [ISO-19115] and the INSPIRE metadata schema [INSPIRE-MD], enabling the harmonized RDF representation of existing spatial metadata. The reason for this choice was to focus first on the most used metadata elements, whereas additional mappings could be defined in future versions of the specification, based on users’ and implementation feedback.
The next step is an evolution towards a single standard for metadata as it is used in data portals, without loss of relevant metadata, while remaining understandable and not too complicated. A working group in the Open Geospatial Consortium is currently working on a standardized Web API, OGC API Records, for metadata publication and discovery. It offers the capability to create, modify, and query metadata on the Web by providing a simple, extendable record schema for describing datasets and other resources, a way to organize records in collections, and an API to access and interact with these collections. When finished, this standard will support the publication of metadata catalogs in conformance to the best practices in this document.
Large and complex datasets, for example, data gathered using automated sensors, may be impossible to download in their entirety due to their dynamic nature and potential volumes. It is therefore necessary in these cases to be able to adequately describe the structure of such data and how services interact to expose subsets of it — even individual records. Currently, there is no established Best Practice for dealing with this, especially when taking the spatial and temporal dimensions into account.
Several approaches and standards have been recently developed:
QB4ST is an extension to RDF Data Cube to provide mechanisms for defining spatio-temporal aspects of dimension and measure descriptions. It is intended to enable the development of semantic descriptions of specific spatio-temporal data elements by appropriate communities of interest, rather than to enumerate a static list of such definitions. It provides a minimal ontology of spatio-temporal properties and defines abstract classes for data cube components (i.e. dimensions and measures) that use these, to allow classification and discovery of specialized component definitions using general terms.
QB4ST is designed to support the publication of consistently described re-usable and comparable definitions of spatial and temporal data elements by appropriate communities of practice. One obvious such case is the use of GPS coordinates described as decimal latitude and longitude measures. Another example is the intended publication of a register of Discrete Global Grid Systems (DGGS) by the OGC DGGS Working Group. QB4ST is intended to support publication of descriptions of such data using a common set of attributes that can be attached to a property description (extending the available RDF-QB mechanisms for attributes of observations).
Spatial data is often concerned with measurements (distance, angles etc.), for example when specifying the position of a feature according to a Coordinate Reference System or the accuracy of that position.
For measurement values to be correctly interpreted, a unit of measurement must also be specified. The challenge here is specifying units of measurement in a way that can be widely understood.
As humans, we’re usually quite good at guessing. For example, given a discussion about the accuracy of a position, the assertion ±3.1 m probably means 3.1 meters. That seems reasonable, but it might also be 3.1 miles. Unfortunately, software systems mostly lack the human ability to guess. So we need to unambiguously express which unit of measure is being used, and this is where the problems exist.
There are essentially two mechanisms that can be used:
Use a named serialization scheme that provides string-literal notation for both base units and derived units. Given that there are an infinite number of derived units, such a serialization should specify a formal grammar that software applications use to interpret those strings; enabling automated conversion between units and other useful functions like verifying that two measured quantities can be combined based on the dimensionality of those measurements (e.g. you can’t combine a length with an area and get a sensible answer!).
Use a URI, such as those provided by Quantities, Units, Dimensions and Data Types Ontologies (QUDT) and Ontology of units of Measure (OM). For example, the unit of measure meter has the URIs
(QUDT) and
Earlier versions of [GML] required that every unit was specified using a URI. But, in practice, many were using symbols like "m" instead of a URI anyway, as they are shorter and often better understood. As a result, [GML] clause MeasureType, UomIdentifier now allows the Unified Code for Units of Measure (UCUM) unit of measure serialization in addition to URIs.
If you choose to use a serialization scheme for expressing units of measure, you should select one that is well-known among your community of users.
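The value of a formal grammar for unit strings is that software can derive dimensionality and check whether two quantities may be combined. The sketch below illustrates this with a toy grammar loosely modelled on UCUM notation (`.` for multiplication, `/` for division, trailing integers as exponents). It is not a UCUM implementation: it knows only three base symbols and no prefixes, and all names are illustrative.

```python
import re
from collections import Counter

# Dimension of each base symbol in this toy subset (not the full UCUM table).
BASE_DIMENSIONS = {"m": "length", "s": "time", "g": "mass"}

# A term is a symbol with an optional signed integer exponent, e.g. "m2", "s-1".
TERM = re.compile(r"^([a-z]+)(-?\d+)?$")


def dimensions(unit: str) -> dict:
    """Return {dimension: exponent} for strings like 'm', 'm2', 'm/s', 'm.s-2'."""
    result = Counter()
    sign = 1
    # Splitting on a capturing group keeps the operators in the token list.
    for token in re.split(r"([./])", unit):
        if token == ".":
            sign = 1
        elif token == "/":
            sign = -1  # division applies to the next term only
        else:
            match = TERM.match(token)
            if not match or match.group(1) not in BASE_DIMENSIONS:
                raise ValueError(f"unsupported unit term: {token!r}")
            exponent = int(match.group(2) or 1)
            result[BASE_DIMENSIONS[match.group(1)]] += sign * exponent
    return {dim: exp for dim, exp in result.items() if exp != 0}


def can_combine(unit_a: str, unit_b: str) -> bool:
    """True if two quantities have the same dimensionality and may be added."""
    return dimensions(unit_a) == dimensions(unit_b)
```

With this in place, `can_combine("m", "m2")` is false, which is the "length plus area" mistake mentioned earlier, caught before any arithmetic happens.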
It’s also worth noting that, if your format or vocabulary allows, you should include a human readable label. For the simple case of displaying the data on a Web page, this removes the need to look up this information from the serialization scheme specification or vocabulary.
The trouble with the use of serialization schemes is that we can’t assume client applications understand the notation. We need some mechanism to indicate which serialization is being used, either so that application developers can find the specification and source some software (e.g. the ucum.js library) to process the unit strings, or so that the client application can map the notation to a well-known URI whose definition conforms to a data model that the application can understand.
There is no evidence of best practice here — nor is there consensus on which data model is best for describing units of measure. Possible approaches to identify the serialization scheme used include:
Provide this information in the data itself, e.g.:
{
  "@context": {
    "type": "@type",
    "value": "@value",
    "rdf": "",
    "qudt": "",
    "skos": "",
    "measurement": "",
    "unit": "qudt:unit",
    "label": {"@id": "skos:prefLabel", "@container": "@language"},
    "symbol": "qudt:symbol",
    "UCUM": ""
  },
  "measurement": 3.1,
  "unit": {
    "label": {"en": "meters"},
    "symbol": {"type": "UCUM", "value": "m"}
  }
}
For other examples, see Best Practice 16: Describe the positional accuracy of spatial data.
Provide this information in the description of the API that provides access to your data; see [DWBP] Best Practice 25: Provide complete documentation for your API.
Convey this information in the HTTP response headers; e.g. using the profile Link Relation Type [RFC6906].
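For the last of these approaches, a client has to find the `profile` link in the HTTP `Link` response header. The sketch below shows one simplified way to do that. The profile URI `https://example.org/profiles/ucum` is hypothetical, since there is no agreed identifier for this purpose, and the parser ignores edge cases such as commas inside quoted parameter values.

```python
import re
from typing import List

# One link-value looks like: <URI>; param="value"; param2=value2
LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^,;]+)*)')
REL_PARAM = re.compile(r'rel\s*=\s*"?([^";]+)"?')


def profile_links(link_header: str) -> List[str]:
    """Return the target URIs of all links whose rel includes 'profile'."""
    targets = []
    for match in LINK_VALUE.finditer(link_header):
        uri, params = match.group(1), match.group(2)
        rel = REL_PARAM.search(params)
        # A rel value may hold several space-separated relation types.
        if rel and "profile" in rel.group(1).split():
            targets.append(uri)
    return targets


# Hypothetical header a server might send alongside measurement data.
header = '<https://example.org/profiles/ucum>; rel="profile", <https://example.org/items?page=2>; rel="next"'
profiles = profile_links(header)
```

A client could then use the returned URI to decide which unit parser to apply, or to look up documentation for the notation.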
In summary, if you think that your users will need to support automated processing of units of measure, then, in lieu of widespread best practice, it will likely be worth engaging with your user community to determine how best to meet their needs.
Looking to the future, the W3C Web of Things Interest Group has created a task force to address the challenges of semantic interoperability relating to units of measure, which may lead to the emergence of best practices that can be adopted by spatial data publishers.
Unlike administrative areas and other topographic features that have clearly defined boundaries, places often have ill-defined, fuzzy boundaries that are based on human perception of ‘place’; you can’t always define a boundary for a place. For example, Edinburgh (osuk:4000000074558316), the named place published by Ordnance Survey, is described using only a notional point geometry; information is not provided about the geometric extent. Other examples of places with ill-defined, fuzzy geometries include The Sahara, the American West and Renaissance Italy. The relationships between places, with their ill-defined (or even absent) geometrical extents, defy description using the topological relationships which are computed mathematically from geometry.
Given the lack of existing best practice, we propose the use of a qualitative assertion based on human perceptions to relate places that are deemed to be the same: samePlaceAs.
Given that the notion of place concerns a social perspective, we consider it to be distinct from location, which is based on geometry. As a result, samePlaceAs can be used to assert the imprecise, social perceptions about the equality of places. samePlaceAs does not overlap with the topological relationships described later in this best practice document that can be computed from geometry.
As with all assertions of an imprecise nature that lack formal semantics, samePlaceAs may have limited value for semantic reasoning. Exactly what constitutes the ‘same place’ will always be somewhat debatable. For example, is ancient Byzantium the same place as modern Istanbul? Is a historical hotel that was moved across the street to save it from demolition in a redevelopment scheme the same place that it used to be?
[SCHEMA-ORG] would be a good home for this link relation type. The definition would be something like the following:
Used to relate two places that are perceived to be the same; the physical extent of the two places should be broadly comparable but does not need to be equal in a topological or geometric sense.
Values expected to be one of these types:
Used on these types:
However, the current definition of schema:Place is a little too general:
Entities that have a somewhat fixed, physical extension.
This definition includes anything with spatial extent (i.e. all Spatial Things); we would consider "my car keys" to be a Spatial Thing, but not a place.
Links by their nature are a directional relationship between source and target. Most often, a link is published within the dataset that describes the source resource specified in the link, enabling users to browse through information; traversing the links they find in documents (i.e. outbound links). While many links specify source and target resources that are described in the same dataset, it is commonplace, and encouraged as per the 5★ rating, to link between datasets, thereby 'stitching' together the Web of data. In these situations, a link refers to some remote target resource. A dataset access API may interpret such links to enable a user to specify a well-known URI of a Spatial Thing they are interested in (for example from popular data repositories such as GeoNames, Wikidata or DBpedia) in order to search for related information (see Best Practice 13: Expose spatial data through 'convenience APIs' for more on APIs and search). But how does a user know of the existence of a link referring to their target Spatial Thing (and potentially identifying a related resource that is useful for their intended goal) when that link is published within a remote dataset resource (i.e. an inbound link)?
Making these inbound links discoverable makes the Web of data symmetric; e.g. where both inbound and outbound links are visible to data users who may then choose to traverse them. However, links provide a secondary benefit in terms of a citation; indicating some subjective trustworthiness of the data (e.g. "it's good enough quality for me to use"). Not only do such link-based citations convey the value of the dataset in a way that the original publisher can objectively quantify (and hence continue to publish and maintain those datasets), but they can provide a subjective indication of quality; like a search engine’s page ranking algorithm, the larger the number of sources of inbound links, the greater the likelihood that a given dataset is of high quality.
Search engines play an important role in the Web ecosystem; if spatial data is published in a way that is indexable by search engines (see Best Practice 2: Make your spatial data indexable by search engines) then it should be possible to find the relationships between Spatial Things. However, the internals of search engines are opaque to most users, and the necessary query-patterns may not be offered.
An alternative is to publish or harvest links into a common repository that can be queried. For example, the <sameAs> service exposes a collection of links defined using the owl:sameAs relation type that can be queried to find related resources. As an illustration, the HTTP GET request <> finds four matches, all of which identify Anne Frank's House. The problem with this ad-hoc approach is that a client application would need to be configured to query an arbitrary number of known service end-points to discover links published across all the domains deemed of interest. As such, this kind of approach is only likely to be useful where one can alert the user community which services they should refer to.
Publishing summary information about datasets and the links defined in them using the Vocabulary of Interlinked Datasets [VoID] may provide a workable approach, but it is yet to be widely adopted.
In [VoID], Linksets provide a summary description of the relationships between two datasets; identifying the source and target datasets, the link relation type(s) used plus optional metadata such as URI templates for identifying participating resources and the number of links specified for each relation type, and may describe technical features such as APIs through which the participating datasets can be accessed.
Applications could be configured to harvest [VoID] descriptions from data publishers in their community, enabling them to build a searchable graph of relationships between datasets: a Data Network. Because this information is summarized at the set level, the data network graph is convenient to work with, allowing simple discovery of numbers and relation types of both inbound and outbound links within a given dataset. Once the presence of interesting links within a dataset has been identified, the user would then work directly with the dataset in question (or the API through which it is accessed) to acquire the detailed information about specific links defined in that dataset.
Interactions such as those described above would be quite intensive for a human using a browser. However, the [VoID] descriptions could be used to drive a software agent that hides much of the complexity from the user; for example, automatically harvesting individual links once a set-level relationship is considered interesting, and then allowing the user to traverse those links either forwards or backwards.
As the use of [VoID] becomes more widespread, best practices regarding its use in building a searchable data network may emerge.
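The set-level data network described above can be sketched as an index over harvested linkset summaries. This is an illustrative sketch only: the fields mirror the [VoID] properties `void:subjectsTarget`, `void:objectsTarget`, `void:linkPredicate` and `void:triples`, while the class names and the `https://example.org/...` dataset URIs are hypothetical, and harvesting and RDF parsing are left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Linkset:
    subjects_target: str  # dataset holding the link sources (void:subjectsTarget)
    objects_target: str   # dataset holding the link targets (void:objectsTarget)
    link_predicate: str   # relation type used by the links (void:linkPredicate)
    triples: int          # number of links in the linkset (void:triples)


class DataNetwork:
    """Index linksets by dataset so links can be followed in both directions."""

    def __init__(self, linksets: Iterable[Linkset]):
        self._outbound = defaultdict(list)
        self._inbound = defaultdict(list)
        for linkset in linksets:
            self._outbound[linkset.subjects_target].append(linkset)
            self._inbound[linkset.objects_target].append(linkset)

    def outbound(self, dataset: str) -> List[Linkset]:
        return list(self._outbound.get(dataset, []))

    def inbound(self, dataset: str) -> List[Linkset]:
        return list(self._inbound.get(dataset, []))

    def inbound_link_count(self, dataset: str) -> int:
        return sum(linkset.triples for linkset in self.inbound(dataset))


# Hypothetical harvested summaries: two publishers link to a gazetteer.
GAZETTEER = "https://example.org/dataset/gazetteer"
network = DataNetwork([
    Linkset("https://example.org/dataset/museums", GAZETTEER,
            "http://www.w3.org/2002/07/owl#sameAs", 120),
    Linkset("https://example.org/dataset/hotels", GAZETTEER,
            "http://www.w3.org/2000/01/rdf-schema#seeAlso", 45),
])
sources = sorted(ls.subjects_target for ls in network.inbound(GAZETTEER))
```

Because only summaries are indexed, the structure stays small; the individual links would still be fetched from the source datasets once a linkset looks interesting.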
This section gives two tables that aim to be helpful in selecting the right spatial data encoding in a given situation. There is not one most appropriate format: which format is best may depend on many things. The first table gives an overview of common spatial data formats; the second, an overview of common spatial data RDF [RDF11-PRIMER] vocabularies.
The first table is a matrix of the common formats, showing in general terms how well these formats help achieve goals such as discoverability, granularity etc.
Please note that all the listed formats are open and text-based.
Format | WKT | GML | KML | GeoJSON | HTML |
Based on | WKT | XML | XML | JSON | HTML |
Media type | text/plain | application/gml+xml | application/ ,application/ | application/geo+json | text/html |
Usage | Representation of 0D-2D geometries, CRS and CRS transformation | Representation of Spatial Things and 0D-3D geometries. Comprehensive and supporting many use cases. | Representation of Spatial Things and 0D-3D geometries. Main focus on spatial data visualization and interaction | Representation of Spatial Things and 0D-2D geometries | Description of Spatial Things and geometries can be embedded by using mechanisms such as [HTML-RDFa], [MICRODATA], [JSON-LD], using vocabularies such as [SCHEMA-ORG] |
Tool support | Widely supported in GIS tools; supported by some Web libraries, usually converted to GeoJSON [RFC7946]; supported by most triple stores | Widely supported in GIS tools; supported by some Web libraries, usually converted to GeoJSON [RFC7946], but not when the geometry is 3-dimensional (volumes); supported only by triple stores supporting [GeoSPARQL] | Mainly supported by Earth browsers, such as Google Earth | Supported in some GIS tools; widely supported in Web libraries and mapping APIs | Optimal for Web publication and discovery |
Web discoverability | Low | Low | Low | Low | Good |
Link support | No | Via [XLINK11] | Via [XLINK11] | No | Yes |
Geometry specification | | | | | |
CRS support | Depends on the flavor; e.g., EWKT and [GeoSPARQL]'s WKT support arbitrary CRSs, and the latter defaults to WGS 84 long/lat (CRS84) | Any, and it can be explicitly specified (via attribute @srsName) | WGS 84 long/lat (CRS84) only | WGS 84 long/lat (CRS84) only | Depends on the vocabulary used; e.g., [SCHEMA-ORG] supports WGS 84 only |
Axis order support | Any, but it cannot be explicitly specified; e.g., in [SIMPLE-FEATURES]'s WKT and EWKT it defaults to longitude/latitude, whereas in [GeoSPARQL]'s WKT it is determined by the CRS used | Determined by the CRS used | Longitude / latitude only, with optional altitude | Longitude / latitude only, with optional altitude | Depends on the vocabulary used; e.g., [SCHEMA-ORG] supports lat/long only |
3D support | No | Yes | Yes | No | Depends on the vocabulary used; e.g., [SCHEMA-ORG] does not support 3D geometries |
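The axis order row above is a frequent source of bugs when moving data between formats. As a small illustration, the sketch below converts a GeoJSON position (longitude first) into latitude and longitude properties of the kind used by [SCHEMA-ORG]. It assumes the schema.org `GeoCoordinates` type with `latitude` and `longitude` properties; the function name is illustrative.

```python
from typing import Sequence


def geojson_position_to_geo_coordinates(position: Sequence[float]) -> dict:
    """Map a GeoJSON [longitude, latitude, ...] position to named properties."""
    longitude, latitude = position[0], position[1]
    # A swapped pair often shows up as a latitude outside the valid range.
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError(f"position out of range, axis order may be swapped: {position}")
    return {"@type": "GeoCoordinates", "latitude": latitude, "longitude": longitude}


# A GeoJSON point geometry; coordinates are in longitude, latitude order.
point = {"type": "Point", "coordinates": [4.884, 52.375]}
coordinates = geojson_position_to_geo_coordinates(point["coordinates"])
```

Using named properties rather than a bare pair removes the ambiguity at the point of conversion, although the range check can only catch swaps where the longitude exceeds 90 degrees in magnitude.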
Formats such as GRIB, HDF and netCDF are used pervasively throughout the sciences to encode spatial data. However, working with data in these formats requires specialist software which is not typically available within a Web browser, thereby driving people to download the data to use offline. Technically, there is nothing that stops one from writing, say, a GRIB decoder in JavaScript, but none exist today.
All the formats listed in the table above are easy to work with directly in a Web browser, because they share characteristics such as:
The ability to work with scientific spatial data in a Web browser was the driving motivation for [COVJSON-OVERVIEW]. OGC has now adopted CoverageJSON as a formal Community Standard.
The following table compares common spatial data vocabularies and what you can do with them.
Additional vocabularies can be discovered from Linked Open Vocabularies (LOV), using search terms like 'location' and 'place', or the tags Geography, Geometry and Time.
Vocabulary | Dublin Core Terms (dcterms) | W3C Basic Geo (w3cgeo) | vCard (vcard) | GeoRSS (georss) | Schema.org (schema) | GeoSPARQL (geosparql) | Location Core Vocabulary (locn) | DCAT (dcat) |
Description | Includes terms for describing location and temporal information, such as the classes dcterms:Location, dcterms:PeriodOfTime, and the properties dcterms:spatial, dcterms:temporal, and dcterms:coverage. | A widely used vocabulary, although not an official standard, for specifying point coordinates in the WGS 84 datum. | Includes terms for describing postal addresses and 0D geometries (points). | Vocabulary defined by the W3C Geospatial Incubator Group (GeoXG) for the representation of geospatial properties of Web resources. On 28 March 2017, [GeoRSS] was proposed as a candidate OGC Community Standard. | Designed for annotating Web pages with machine-readable metadata, it supports a number of classes and properties for specifying location information, including geometries. See Best Practice 2: Make your spatial data indexable by search engines for more information. | Official OGC standard, defining a set of terms and functions for modeling and querying spatial information. Coordinates are encoded by using WKT or [GML]. | Defines a set of general terms for describing location information that can be extended based on domain-specific requirements. Covers geographical names, geometries, and postal addresses. | Reuses [DCTERMS], [OWL-TIME], and [LOCN] for describing location and temporal information, and it defines additional terms, namely dcat:centroid, dcat:bbox, dcat:startDate, dcat:endDate, dcat:spatialResolutionInMeters, dcat:temporalResolution. |
Spatial things | dcterms:Location | w3cgeo:SpatialThing | vcard:Kind, and its subclasses; vcard:Address | georss:_Feature is a placeholder for Spatial Thing | schema:Place, and its subclasses; schema:PostalAddress | geosparql:Feature | dcterms:Location, locn:Address | dcterms:Location |
Properties to associate Spatial Things with geometries | - | w3cgeo:location, w3cgeo:lat_long, w3cgeo:lat, w3cgeo:long, w3cgeo:alt | vcard:hasGeo | georss:where, georss:point, georss:line, georss:polygon, georss:box | schema:geo | geosparql:hasGeometry, geosparql:defaultGeometry | locn:geometry | locn:geometry, dcat:centroid, dcat:bbox |
Geometries | - | w3cgeo:Point (subclass of w3cgeo:SpatialThing) | Geometries are represented with the geo URI scheme [RFC5870] | Geometries are represented with a literal encoding of point coordinates | geosparql:Geometry, and its subclasses (sf:Point, sf:Polygon, etc.) | locn:Geometry (it denotes either a structured object or a literal) | ||
Geometry specification | ||||||||
CRS support | - | WGS 84 only | WGS 84 only | WGS 84 only | WGS 84 only | Any | Any (depends on how the geometry is represented) | Any (depends on how the geometry is represented) |
Axis order support | - | lat/long only | lat/long only | lat/long only | lat/long only | Determined by the CRS used | Any (depends on how the geometry is represented) | Any (depends on how the geometry is represented) |
0D support | - | lat/long coordinate pair (w3cgeo:lat_long), decimal degrees (w3cgeo:lat, w3cgeo:long), decimal meters (w3cgeo:alt) | geo URI scheme [RFC5870] | lat/long coordinate pair | lat/long coordinate pair | [GML], WKT | [GML], WKT, GeoJSON [RFC7946], geo URI scheme [RFC5870], Geohash | [GML], WKT, GeoJSON [RFC7946] |
1D and 2D support | - | - | - | lat/long coordinate pairs, separated by a comma | lat/long coordinate pairs, separated by a comma or a space | [GML], WKT | [GML], WKT, GeoJSON [RFC7946] | [GML], WKT, GeoJSON [RFC7946] |
3D support | - | - | - | - | - | [GML] | [GML] | [GML] |
This section is non-normative.
The list below describes the main benefits of applying the Spatial Data on the Web Best Practices. The benefits are identical to those defined in [DWBP]. Each benefit represents an improvement in how spatial datasets are made available on the Web.
The following table relates Best Practices and Benefits.
The figure below shows the benefits that data publishers will gain with adoption of the Best Practices.
All Best Practices
The list below illustrates how the requirements defined in [SDW-UCR] are met by a combination of the best practices defined in this document (Spatial Data Best Practices) and those defined in [DWBP] (General Data Best Practices).
The FAIR Principles are described at FAIR Principles - GO FAIR.
"The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing."
"The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well-described so that they can be replicated and/or combined in different settings."
The Research Data Alliance's FAIR Data Maturity Model. Specification and Guidelines usefully tries to define precise criteria to assess compliance with the FAIR principles.
Web Accessibility is a well-established and vital domain of activity involving Web standards implementers, Web developers, and users. The range of use cases of maps and spatial data on the Web is almost as diverse as humanity itself, and it is consequently incumbent upon providers of spatial data on the Web to facilitate consumption of spatial information by persons of all abilities. In this regard, the FAIR principles, and in particular the “A” in FAIR, are deficient. Natural Resources Canada has suggested that “FAIR+” denote an extension of FAIR with the goal of supporting human accessibility to spatial content. FAIR+ importantly includes users with disabilities, but also includes developers and users who belong to typically under-represented or disadvantaged communities.
Search engine crawlers navigate the web using structured HTML to look for human readable information to index. The more “human readable” your content is, the more likely it is to be findable, and ranked highly by web searches – for humans or machines. The Maps for HTML Community Group is following Web Accessibility Initiative standards to extend the structure of HTML to include maps and spatial information, so that not only can maps be made accessible for humans, but so that spatial things may be spatially indexed, ranked and found.
No matter how easy it is to find, access, and even use your data, it is of little use unless it is of sufficient quality for the user’s task. However, what is “good” for one task is not necessarily “good” for another.
There are a variety of approaches in use to try to match users with data that will be useful to them. These range from telling the user a lot about the quality of the data to telling them what you (and others) have successfully used it for.
See DWBP Best Practice 6: Provide data quality information. This often includes DWBP Best Practice 21: Provide data up to date.
Absolute positional accuracy: The closeness of reported coordinate values to values accepted as or being true [ISO-19159-2].
Axis order: The order in which coordinates are presented. For example, some systems use (latitude, longitude) rather than (longitude, latitude). The latter is more similar to the mathematical convention of (x,y) ordering. The order used may differ from the order used to define the coordinate system.
Coordinate Reference System (CRS): A coordinate system to locate entities of interest with respect to an object using a datum [ISO-19111]. If the entities of interest and the object and datum are in the real world, the CRS is a Spatial Reference System (SRS). If the object is the Earth, the SRS is a Geo-Spatial Reference System (GRS). A GRS may be local, regional or global in scope. An example of a CRS that is not an SRS is the wavelength of a signal in the electromagnetic spectrum.
Coverage: A coverage is a function that describes characteristics of real-world phenomena that vary over space and/or time. Typical examples are temperature, elevation and precipitation. A coverage is typically represented as a data structure containing a set of such values, each associated with one of the positions in a spatial, temporal or spatiotemporal domain. Typical spatial domains are point sets (e.g. sensor locations), curve sets (e.g. contour lines), grids (e.g. orthoimages, elevation models), etc. A property whose value varies as a function of time may be represented as a temporal coverage or time-series [ISO-19109].
Comma Separated Values (CSV): A file format for tabular data that writes each row on a separate line and separates each cell from the next with a comma; see [RFC4180]. CSV is just one variety of tabular data; for more information refer to [TABULAR-DATA-PRIMER].
Datum: Parameter or set of parameters that define the position of the origin, the scale, and the orientation of a coordinate system [ISO-19111].
Dimension (geometry): In physics and mathematics, the dimension of a mathematical space (or object) is informally defined (seeWikipedia entry) as the minimum number of coordinates needed to specify any point within it. Thus, a point has no dimension (0D) as there is no inside, whereas a line has a dimension of one (1D) because only one coordinate is needed to specify a point along it – for example, the point at 5 on a number line. A surface such as a plane or the surface of a cylinder, torus or sphere has a dimension of two (2D) because two coordinates are needed to specify a point on it – for example, both alatitude andlongitude are required to locate a point on the surface of a sphere. The inside of a cube, cylinder, torus or sphere is three-dimensional (3D) because three coordinates are needed to locate a point within these spaces. For a formal rigorous mathematical definition see the ISO definition [ISO-19107].
Discrete Global Grid System: A DGGS is a form of Earth reference that, unlike its established counterpart the coordinate reference system that represents the Earth as a continual lattice of points, represents the Earth with a tessellation of nested cells. Generally, a DGGS will exhaustively partition the globe in closely packed hierarchical tessellations, each cell representing a homogeneous value, with a unique identifier or indexing that allows for linear ordering, parent-child operations, and nearest neighbour algebraic operations.
Ellipsoid: An ellipsoid is a closed quadric surface that is a three-dimensional analogue of an ellipse. In geodesy, a reference ellipsoid is a mathematically defined surface that approximates the geoid.
Extensible Markup Language (XML): A simple, very flexible text-based markup language derived from SGML (ISO 8879). It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable [XML11].
Extent: The area covered by something. Within this document, we always imply spatial extent; e.g. size or shape that may be expressed using coordinates.
Feature: Abstraction of real world phenomena. A digital representation of a real-world entity or an abstraction of the real world. Examples of features include almost anything that can be placed in time and space, including desks, buildings, cities, trees, forest stands, ecosystems, delivery vehicles, snow removal routes, oil wells, oil pipelines, oil spill, and so on. The terms feature and object are often used synonymously [ISO-19101-1-2014].
Geocoding: Forward geocoding, often just referred to as geocoding, is the process of converting addresses into geographic coordinates. Reverse geocoding is the opposite process; converting geographic coordinates to addresses. See also the ISO definition [ISO-19133].
Geographic information (also geospatial data): Information concerning phenomena implicitly or explicitly associated with a location relative to the Earth. [ISO-19101-1-2014].
Geographic information system (GIS): An information system dealing with information concerning phenomena associated with locations relative to the Earth. [ISO-19101-1-2014].
Geohash: A specific geocoding system with a hierarchical spatial data structure which subdivides space into nested regions. Geohashes and some other geocoding systems offer properties like arbitrary precision and the possibility of repeatedly truncating characters from the end of the code to reduce its size and precision. As a consequence of the gradual precision degradation, nearby places will often (but not always) present similar prefixes. The longer a shared prefix is, the closer the two places are. Coordinate and address systems generally do not have this property. (Source: Wikipedia).
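The truncation property described above can be illustrated with a minimal Python sketch of the commonly described geohash encoding: bits that alternately halve the longitude and latitude ranges, packed five at a time into a base-32 alphabet. The function names `geohash_encode` and `shared_prefix_length` are hypothetical, and the sketch is not a reference implementation.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (omits a, i, l, o)


def geohash_encode(lat: float, lon: float, precision: int = 9) -> str:
    """Encode a latitude/longitude pair (decimal degrees) as a geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate: longitude first, then latitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the range
        else:
            bits = bits << 1
            rng[1] = mid  # keep the lower half of the range
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def shared_prefix_length(a: str, b: str) -> int:
    """Count the leading characters two geohashes have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


full = geohash_encode(57.64911, 10.40744, precision=9)
coarse = geohash_encode(57.64911, 10.40744, precision=5)
```

Because each character only refines the cell selected by the characters before it, `coarse` equals `full[:5]`: truncating a geohash yields the code of a larger enclosing cell. The converse does not hold in general, since two nearby places on either side of a cell boundary may share only a short prefix.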
Geoid: An equipotential surface where the gravitational field of the Earth has the same value at all locations. This surface is perpendicular to a plumb line at all points on the Earth's surface and is roughly equivalent to the mean sea level excluding the effects of winds and permanent currents such as the Gulf Stream.
Geometry: An ordered set of n-dimensional points in a given coordinate reference system; can be used to model the spatial extent or shape of a Spatial Thing.
Internet of Things (IoT): The network of physical objects or "things" embedded with electronics, software, sensors, and network connectivity, which enables these objects to be controlled remotely and to collect and exchange data.
JavaScript Object Notation (JSON): A lightweight, text-based, language-independent data interchange format defined in [RFC7159]. It was derived from the ECMAScript Programming Language Standard. JSON defines a small set of formatting rules for the portable representation of structured data.
Latitude: The angular distance north or south of the equator. Often abbreviated to Lat.
Link: A typed connection between two resources that are identified by Internationalized Resource Identifiers (IRIs) [RFC3987], and is comprised of: (i) a context IRI, (ii) a link relation type, (iii) a target IRI, and (iv) optionally, target attributes. Note that in the common case, the IRI will also be a URI [RFC3986], because many protocols (such as HTTP) do not support dereferencing IRIs [RFC5988].
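The four parts of a link listed above can be sketched as a small Python data structure. The class `Link`, the helper `to_link_header` and the example.org IRIs are hypothetical names used for illustration; the serialisation follows the general shape of the Link header field described in [RFC5988] (target in angle brackets, then `rel`, `anchor` and other parameters) but performs no escaping or validation.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Link:
    context_iri: str    # (i) the resource the link starts from
    relation_type: str  # (ii) the link relation type, e.g. "related"
    target_iri: str     # (iii) the resource the link points to
    target_attributes: Dict[str, str] = field(default_factory=dict)  # (iv) optional


def to_link_header(link: Link) -> str:
    """Render a link as a Link header field value (simplified, no escaping)."""
    parts = [
        f"<{link.target_iri}>",
        f'rel="{link.relation_type}"',
        f'anchor="{link.context_iri}"',
    ]
    parts += [f'{name}="{value}"' for name, value in link.target_attributes.items()]
    return "; ".join(parts)


# Hypothetical example: a Spatial Thing linked to a related resource.
example = Link(
    context_iri="http://example.org/id/lighthouse",
    relation_type="related",
    target_iri="http://example.org/doc/lighthouse-history",
    target_attributes={"title": "History"},
)
header_value = to_link_header(example)
```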
Linked data: The term ‘Linked Data’ refers to an approach to publishing data that puts linking at the heart of the notion of data, and uses the linking technologies provided by the Web to enable the weaving of a global distributed database [LDP-PRIMER].
Longitude: The angular distance east or west of the prime meridian. Often abbreviated to Long.
Map Projection: A coordinate conversion from an ellipsoidal coordinate system to a plane, e.g. Transverse Mercator.
OGC API - Features: A set of resource-oriented API building blocks for creating, modifying, and querying geographical features. There are several other consistent APIs published or under development to create a suite of geospatial ‘building blocks’.
Open-world assumption (OWA): In a formal system of logic used for knowledge representation, the open-world assumption asserts that the truth value of a statement may be true irrespective of whether or not it is known to be true. This assumption codifies the informal notion that in general no single agent or observer has complete knowledge. In essence, from the absence of a statement alone, a deductive reasoner cannot (and must not) infer that the statement is false. That is, a valid response to a logical query may be: true, false or unknown.
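The three possible answers can be made concrete with a small Python sketch. It assumes a toy knowledge base made of two sets of statements, one asserted true and one asserted false; the names `Truth`, `ask` and the example statements are hypothetical, and no inference is performed beyond a direct lookup.

```python
from enum import Enum
from typing import Set


class Truth(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


def ask(statement: str, known_true: Set[str], known_false: Set[str]) -> Truth:
    """Answer a query under the open-world assumption."""
    if statement in known_true:
        return Truth.TRUE
    if statement in known_false:
        return Truth.FALSE
    # Absence of a statement is not evidence that it is false.
    return Truth.UNKNOWN


# Hypothetical statements about a Spatial Thing.
known_true = {"lighthouse is-located-in harbour"}
known_false = {"lighthouse is-located-in desert"}

answer = ask("lighthouse is-open-to public", known_true, known_false)
```

Here `answer` is `Truth.UNKNOWN`, because nothing has been stated either way. A closed-world system would instead treat the missing statement as false.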
Projected Coordinate Reference System: A coordinate reference system derived from a two-dimensional geodetic coordinate reference system by applying a map projection.
Resource Description Framework (RDF): A directed, labeled graph data model for representing information in the Web. It may be serialized in several data formats such as N-Triples [N-TRIPLES], XML [RDF-SYNTAX-GRAMMAR], Terse RDF Triple Language (“Turtle” or TTL) [TURTLE] and [JSON-LD].
Semantic Web: The term “Semantic Web” refers to the World Wide Web Consortium's vision of the Web of linked data. Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data.
SensorThings API: An open, geospatial-enabled and unified way to interconnect the Internet of Things (IoT) devices, data, and applications over the Web. [SENSORTHINGS].
SPARQL: A query language for RDF; it can be used to express queries across diverse data sources [SPARQL11-OVERVIEW].
Spatial data: Data describing anything with spatial extent; i.e. size, shape or position. In addition to describing things that are positioned relative to the Earth (also see geospatial data), spatial data may also describe things using other coordinate systems that are not related to position on the Earth, such as the size, shape and positions of cellular and sub-cellular Spatial Things described using the 2D or 3D Cartesian coordinate system of a specific tissue sample.
Spatial Data Infrastructure (SDI): An ecosystem of geographic data, metadata, tools, applications, policies and users that are necessary to acquire, process, distribute, use, maintain, and preserve spatial data. Due to its nature (size, cost, number of interactors) an SDI is often government-related.
Spatial operator, spatial query function: Function or procedure that has at least one spatial parameter in its domain or range [ISO-19107].
Spatial relation, spatial relationship: Specifies how a Spatial Thing is located in space in relation to another Spatial Thing. Typically determined using a spatial operator.
Spatial thing: Anything with spatial extent (i.e. size, shape, or position); it is a combination of the real-world phenomenon and its abstraction (the feature). Examples are: people, places, or bowling balls.
This is different from the [ISO-19107] definition of a Spatial Object, which is a geometry or a topology object.
Triple-store (or quadstore): A triple-store or RDF store is a purpose-built database for the storage and retrieval of RDF subject-predicate-object “triples” through semantic queries. Many implementations are actually “quad-stores” as they also hold the name of the graph within which a triple is stored.
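The difference between a triple and a quad can be sketched with a toy in-memory store in Python. `Quad`, `TinyQuadStore` and the `ex:` names are hypothetical; pattern matching with wildcards stands in for the semantic queries a real store would answer (for example via SPARQL), and nothing here reflects the design of any particular product.

```python
from typing import Iterator, NamedTuple, Optional, Set


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # name of the graph that holds the triple


class TinyQuadStore:
    """A toy store: a set of quads with wildcard pattern matching."""

    def __init__(self) -> None:
        self._quads: Set[Quad] = set()

    def add(self, subject: str, predicate: str, obj: str, graph: str = "default") -> None:
        self._quads.add(Quad(subject, predicate, obj, graph))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
        graph: Optional[str] = None,
    ) -> Iterator[Quad]:
        """Yield quads that agree with every term given; None is a wildcard."""
        pattern = (subject, predicate, obj, graph)
        for quad in self._quads:
            if all(want is None or want == got for want, got in zip(pattern, quad)):
                yield quad


store = TinyQuadStore()
store.add("ex:lighthouse", "ex:locatedIn", "ex:harbour", graph="ex:surveyA")
store.add("ex:lighthouse", "ex:height", "42", graph="ex:surveyB")

from_survey_a = list(store.match(subject="ex:lighthouse", graph="ex:surveyA"))
```

Holding the graph name alongside each triple makes it possible to ask which source a statement came from, as the last line does for the hypothetical graph `ex:surveyA`.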
Universe of discourse: view of the real or hypothetical world that includes everything of interest [ISO-19101-1-2014].
Web Feature Service (WFS): A standardized HTTP interface allowing requests for geographical features across the Web using platform-independent calls. [WFS].
Web Map Tile Service (WMTS): A standardized HTTP interface for requesting tiled, geo-referenced map images from one or more distributed spatial databases. [WMTS].
Well Known Text (WKT): A text mark-up language for representing vector geometry objects on a map, spatial reference systems of spatial objects and transformations between spatial reference systems. (Sources: [ISO-19162], [SIMPLE-FEATURES], Wikipedia entry).
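To give a feel for what WKT geometry looks like, the Python sketch below writes and reads the simplest case, a two-dimensional point such as `POINT (30 10)`. The helper names `to_wkt_point` and `parse_wkt_point` are hypothetical. The parser accepts only plain decimal numbers (no exponent notation, no Z or M values, no `EMPTY`), and the meaning of the two coordinates depends on the coordinate reference system in use, which WKT geometry text does not itself state.

```python
import re
from typing import Tuple

_NUMBER = r"-?\d+(?:\.\d+)?"
_POINT = re.compile(
    rf"^\s*POINT\s*\(\s*({_NUMBER})\s+({_NUMBER})\s*\)\s*$", re.IGNORECASE
)


def to_wkt_point(x: float, y: float) -> str:
    """Serialise two coordinates as a WKT point."""
    return f"POINT ({x!r} {y!r})"


def parse_wkt_point(text: str) -> Tuple[float, float]:
    """Parse a two-dimensional WKT point; raise ValueError for anything else."""
    found = _POINT.match(text)
    if found is None:
        raise ValueError(f"not a 2D WKT point: {text!r}")
    return float(found.group(1)), float(found.group(2))


coordinates = parse_wkt_point("POINT (30 10)")
round_trip = parse_wkt_point(to_wkt_point(4.895168, 52.370216))
```

A production system would normally rely on an established geometry library rather than a regular expression, since WKT also covers line strings, polygons, multi-geometries and collections.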
The editors gratefully acknowledge the contributions made to this document by all members of the working group; especially, the contributions received from those listed in the Contributors list.
This document would not have been possible without the tremendous efforts of the Data on the Web Working Group; their [DWBP] provides the essential underpinnings for our own work. Special thanks are due to Newton Calegari, Riccardo Albertoni, Annette Grainer, Antoine Isaac, and Eric Stephan.
The editors are also grateful for comments received from Ig Ibert Bittencourt, Marco Brattinga, Martin Desruisseaux, Neil McNaughton, Simeon Nedkov, James Passmore, Stefan Proell, Maik Riechert and Erik Wilde.
The editors also gratefully acknowledge the chairs of this Working Group: Ed Parsons and Kerry Taylor — and staff contacts Phil Archer and François Daoust.
A full change-log is available on GitHub.
The document has been updated to take into account further support for spatial and temporal aspects added in the new version of the W3C Data Catalog Vocabulary (DCAT) [VOCAB-DCAT-2] and GeoDCAT-AP [GeoDCAT-AP-20201223].
In particular:
has been replaced with the more specific property dcat:bbox, defined in [VOCAB-DCAT-2]. Moreover, the original GeoJSON datatype URI used in the example (corresponding to the GeoJSON IANA Media Type URL) has been replaced with geosparql:geoJSONLiteral, included in the draft of the new version of [GeoSPARQL] (see issues opengeospatial/ogc-geosparql/issues/1 and opengeospatial/ogc-geosparql/issues/48), and already adopted in [GeoDCAT-AP-20201223].
dcat:centroid) and bounding boxes (dcat:bbox
, which are available from the reference [VOCAB-DQV] examples in the relevant section.
Additional changes concern editorial fixes (typos, broken links, and styling). These included a fix to issue #1037, to ensure that all examples are numbered. As a result, the numbering of examples changed.
No major changes have been introduced since publication on 11 May 2017. Main updates were made in response to public reviews to clarify that the best practices do not cover advanced scenarios, e.g. involving critical decision making, in section 3.1 Spatial data, and to note the absence of scientific formats to encode spatial data on the Web in section A. Applicability of common formats to implementation of best practices.
The most obvious change to readers is that the best practices have been reordered with the intent to improve the readability of the document, and the empty stubs of best practices removed in the previous release are now gone. The fragment-identifiers for the best practices remain unchanged, but the numbers are different. The mapping (from old number to new) is as follows:
Two new sections and two new best practices were added:
Section 11. How to use these best practices (link to previous WD version) has been removed.
Significant updates to the following best practices:
Most of the other best practices received minor additions and improvements, without significant change to their contents.
Content was added to the How to test and Benefits sections of all Best Practices.
The Conclusions section was renamed "Gaps in current practice" and content added.
Section A. Applicability of common formats to implementation of best practices was updated; it now has one table listing spatial data formats and one listing spatial data vocabularies.
Section C. Cross reference of use case requirements against best practices was expanded to include cross-reference from both this document and [DWBP].
The Glossary was updated.
Plus minor, mostly editorial changes.
Significant updates to the following best practices:
The following best practices have been removed or merged into other best practices:
Section 14. Narrative — the Nieuwhaven flooding (link to previous WD version) has been removed.
Appendix B: Authoritative sources of geographic identifiers has been merged into Best Practice 14: Publish links between Spatial Things and related resources.
Significant updates to:
Significant updates to:
(further updates to these best practices are expected in the next WD release, circa end January 2017)
Plus minor changes that include adding a list of the most important best practices for data publishers that start from an existing SDI to section 9, and changing a few best practice titles to include the word spatial.
The document has undergone substantial changes since the first public working draft. Below are some of the changes made: