Working with geospatial data

Geospatial analytics let you analyze geographic data inBigQuery. Geographic data is also known asgeospatial data.

Common types of objects when working with geospatial data include the following:

  • Ageometry represents a surface area on the Earth. It is often describedusing points, lines, polygons, or a collection of points, lines, and polygons.Ageometry collection is a geometry that represents the spatial union of allshapes in the collection.
  • Aspatial feature represents a logical spatial object. It combines ageometry with additional attributes that are application-specific.
  • Aspatial feature collection is a set of spatial features.

In BigQuery, theGEOGRAPHYdata type represents a geometry value or geometry collection. To representspatial features, create a table with aGEOGRAPHY column for the geometry plusadditional columns for the attributes. Each row of the table is a spatialfeature, and the entire table represents a spatial feature collection.

TheGEOGRAPHY data type describes apoint set on the Earth's surface. Apoint set is a set of points, lines, and polygons on theWGS84 reference spheroid, with geodesic edges. You can use theGEOGRAPHY data typeby calling one of the GoogleSQLgeography functions.

Loading geospatial data

Single points on Earth can be described by just a longitude, latitude pair.For example, you can load a CSV file that contains longitude and latitude valuesand then use theST_GEOGPOINTfunction to convert them intoGEOGRAPHY values.

For more complex geographies, you can load the following geospatial data formatsinto aGEOGRAPHY column:

  • Well-known text (WKT)
  • Well-known binary (WKB)
  • GeoJSON
  • GeoParquet

Loading WKT or WKB data

WKT is atext format for describing individual geometry shapes using points, lines,polygons with optional holes, or a collection of points, lines, or polygons. WKBis the binary version of the WKT format. WKB can be hex encoded for formats thatdon't support binary data, like JSON.

For example, the following defines a point in WKT:

POINT(-121 41)

To describe a spatial feature, WKT is usually embedded in a container fileformat, such as a CSV file, or in a database table. A file row or a table rowusually corresponds to the spatial feature. The whole file or the whole tablecorresponds to the feature collection. To load WKT data intoBigQuery, provide aschema thatspecifies aGEOGRAPHY column for the geospatial data.

Note: You can't use schema auto-detection to load WKT data as aGEOGRAPHYvalue. If auto-detection is enabled, then BigQuery loads the dataas aSTRING value.

For example, you might have a CSV file that contains the following data:

"POLYGON((-124.49 47.35,-124.49 40.73,-116.49 40.73,-116.49 47.35,-124.49 47.35))",poly1"POLYGON((-85.6 31.66,-85.6 24.29,-78.22 24.29,-78.22 31.66,-85.6 31.66))",poly2"POINT(1 2)",point1

You can load this file by running the bq command-line toolload command:

bqload--source_format=CSV\--schema="geography:GEOGRAPHY,name:STRING"\mydataset.mytablefilename1.csv

For more information about loading data in BigQuery, seeIntroduction to loading data.

To stream WKT data to an existing BigQuery table with aGEOGRAPHY column, serialize the data as a string in the API request.

bq

Run the bq command-line toolinsert command:

echo '{"geo": "LINESTRING (-118.4085 33.9416, -73.7781 40.6413)"}' \    | bq insert my_dataset.geo_table

Python

Before trying this sample, follow thePython setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQueryPython API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

fromgoogle.cloudimportbigqueryimportshapely.geometryimportshapely.wktbigquery_client=bigquery.Client()# This example uses a table containing a column named "geo" with the# GEOGRAPHY data type.table_id="my-project.my_dataset.my_table"# Use the Shapely library to generate WKT of a line from LAX to# JFK airports. Alternatively, you may define WKT data directly.my_geography=shapely.geometry.LineString([(-118.4085,33.9416),(-73.7781,40.6413)])rows=[# Convert data into a WKT string.{"geo":shapely.wkt.dumps(my_geography)},]#  table already exists and has a column# named "geo" with data type GEOGRAPHY.errors=bigquery_client.insert_rows_json(table_id,rows)iferrors:raiseRuntimeError(f"row insert failed:{errors}")else:print(f"wrote 1 row to{table_id}")

For more information about streaming data in BigQuery, seeStreaming data into BigQuery.

You can also convert a WKT text string into aGEOGRAPHY value by using theST_GEOGFROMTEXTfunction.

Loading GeoJSON data

GeoJSON is a JSON-based format for geometries and spatial features. For example,the following defines a point in GeoJSON:

{"type":"Point","coordinates":[-121,41]}

GeoJSON data can contain any of the following object types:

  • Geometry objects. A geometry object is a spatial shape, described as a unionof points, lines, and polygons with optional holes.
  • Feature objects. A feature object contains a geometry plus additionalname/value pairs, whose meaning is application-specific.
  • Feature collections. A feature collection is a set of feature objects.

There are two ways to load GeoJSON data into BigQuery:

Loading newline-delimited GeoJSON files

A newline-delimited GeoJSON file contains a list of GeoJSON feature objects, oneper line in the file. A GeoJSON feature object is a JSON object with thefollowing members:

  • type. For feature objects, the value must beFeature.BigQuery validates the value but does not include it in thetable schema.

  • geometry. The value is a GeoJSONGeometry object ornull.BigQuery converts this member into aGEOGRAPHY value.

  • properties. The value is any JSON object or null. If the value isn'tnull,then BigQuery loads each member of the JSON object as aseparate table column. For more information about how BigQueryparses JSON data types, seeDetails of loading JSON data.

  • id. Optional. If present, the value is either a string or a number.BigQuery loads this value into a column namedid.

If the feature object contains other members that are not listed here, thenBigQuery converts those members directly into table columns.

You can load a newline-delimited GeoJSON file by using the bq command-line tool'sbqload command, as follows:

bq load \ --source_format=NEWLINE_DELIMITED_JSON \ --json_extension=GEOJSON \ --autodetect \DATASET.TABLE \FILE_PATH_OR_URI

Replace the following:

  • DATASET is the name of your dataset.
  • TABLE is the name of the destination table.
  • FILE_PATH_OR_URI is a path to a local file or aCloud Storage URI.

The previous example enablesschema auto-detection. For more control overhow BigQuery converts the values inside theproperties object,you can provide an explicit schema instead. For more information, seeSpecify schemas.If you provide an explicit schema, then don't include a top-leveltype columnin the schema definition. For each member of theproperties member, defineseparate columns, not a single nested column.

As defined byRFC 7946,a complete GeoJSON data structure is a single JSON object. Many systems exportGeoJSON data as a singleFeatureCollection object that contains all of thegeometries. To load this format into BigQuery, you must convertthe file by removing the root-levelFeatureCollection object and splitting theindividual feature objects into separate lines. For example, the followingcommand uses thejq command-line tool to split a GeoJSON file into newline-delimited format:

cat ~/file1.json | jq -c '.features[]' > converted.json

Creating an external table from a newline-delimited GeoJSON file

You can query a newline-delimited GeoJSON file stored in Cloud Storage bycreating anexternal table. To create theexternal table, use theCREATE EXTERNAL TABLEDDL statement. In theOPTIONS clause, set theformatoption toNEWLINE_DELIMITED_JSON and thejson_extension option toGEOJSON.

Example:

CREATEEXTERNALTABLEmydataset.table1OPTIONS(format="NEWLINE_DELIMITED_JSON",json_extension='GEOJSON',uris=['gs://mybucket/geofile.json']);

Loading GeoJSON geometry data

Geospatial analytics supports loading individual GeoJSON geometry objects that areembedded as text strings in other file types. For example, you can load a CSVfile where one of the columns contains a GeoJSON geometry object.

To load this type of GeoJSON data into BigQuery, provide aschema that specifies aGEOGRAPHY column for theGeoJSON data. You must manually provide the schema. Otherwise, if auto-detectionis enabled, then BigQuery loads the data as aSTRING value.

Geospatial analytics does not support loading GeoJSON feature objects or featurecollections using this approach. If you need to load feature objects, thenconsider using newline-delimited GeoJSON files.

To stream GeoJSON data to an existing BigQuery table with aGEOGRAPHY column, serialize the data as a string in the API request.

bq

Run the bq command-line toolinsert command:

echo'{"geo":"{\"type\": \"LineString\", \"coordinates\": [[-118.4085, 33.9416], [-73.7781, 40.6413]]}"}'\|bqinsertmy_dataset.geo_table

Python

Before trying this sample, follow thePython setup instructions in theBigQuery quickstart using client libraries. For more information, see theBigQueryPython API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, seeSet up authentication for client libraries.

importgeojsonfromgoogle.cloudimportbigquerybigquery_client=bigquery.Client()# This example uses a table containing a column named "geo" with the# GEOGRAPHY data type.table_id="my-project.my_dataset.my_table"# Use the python-geojson library to generate GeoJSON of a line from LAX to# JFK airports. Alternatively, you may define GeoJSON data directly, but it# must be converted to a string before loading it into BigQuery.my_geography=geojson.LineString([(-118.4085,33.9416),(-73.7781,40.6413)])rows=[# Convert GeoJSON data into a string.{"geo":geojson.dumps(my_geography)}]#  table already exists and has a column# named "geo" with data type GEOGRAPHY.errors=bigquery_client.insert_rows_json(table_id,rows)iferrors:raiseRuntimeError(f"row insert failed:{errors}")else:print(f"wrote 1 row to{table_id}")

You can also convert a GeoJSON geometry object into aGEOGRAPHY value by usingtheST_GEOGFROMGEOJSONfunction. For example, you can store the geometries asSTRING values and thenrun a query that callsST_GEOGFROMGEOJSON.

Loading GeoParquet files

GeoParquet is a specificationthat adds geospatial types to theParquet fileformat. GeoParquet includes metadata that provides definite semantics to thecontained geospatial data, avoiding theinterpretationissues that occur with other geospatial dataformats.

When loading Parquet files, BigQuery checks for GeoParquetmetadata. If GeoParquet metadata exists, BigQuery loads all ofthe columns it describes into a correspondingGEOGRAPHY column by default.For more information about loading Parquet files, seeLoading Parquet data.

Note: GeoParquet support is disabled for a few projects to avoidbreaking existing workflows. If your GeoParquet files aren't loadeddirectly toGEOGRAPHY columns,contact support.

Creating an external table from GeoParquet data

External tablesthat reference GeoParquet files map relevant columns to theGEOGRAPHYtype.

Statistics available in the GeoParquet file (bbox,covering) aren'tused to accelerate queries over an external table.

Coordinate systems and edges

In geospatial analytics, points are positions on the surface of a WGS84 spheroid,expressed as longitude and geodetic latitude. An edge is a spherical geodesicbetween two endpoints. (That is, edges are the shortest path on the surface ofa sphere.)

The WKT format does not provide a coordinate system. When loading WKT data,geospatial analytics assumes the data uses WGS84 coordinates with spherical edges.Make sure your source data matches that coordinate system, unless thegeographies are small enough that the difference between spherical and planaredges can be ignored.

GeoJSON explicitly uses WGS84 coordinates with planar edges. When loadingGeoJSON data, geospatial analytics converts planar edges to spherical edges.Geospatial analytics adds additional points to the line as necessary, so that theconverted sequence of edges remains within 10 meters of the original line. Thisprocess is known astessellation ornon-uniform densification. You cannotdirectly control the tessellation process.

To load geographies with spherical edges, use WKT. To load geographies withplanar edges, often calledgeometries, it's simplest to use GeoJSON. However,if your geometry data is already in WKT format, another option is to load thedata as aSTRING type and then use theST_GEOGFROMTEXTfunction to convert toGEOGRAPHY values. Set theplanar parameter toTRUEto interpret the data as planar.

GeoParquet files include metadata about the coordinate system and edges thatwere used to create the data. When reading GeoParquet files with planar edges,geospatial analytics converts planar edges to spherical edges. GeoParquet fileswith coordinate systems other than WGS84 are rejected.

When choosing an interchange format, be sure to understand the coordinate systemused by your source data. Most systems either explicitly support parsinggeography (as opposed to geometry) from WKT, or else they assume planar edges.

Your coordinates should be longitude first, latitude second. If the geographyhas any long segments or edges then they must be tessellated, becausegeospatial analytics interprets them as spherical geodesics, which may notcorrespond to the coordinate system where your data originated.

Polygon orientation

On a sphere, every polygon has a complementary polygon. For example, a polygonthat describes the Earth's continents would have a complementary polygonthat describes the Earth's oceans. Because the two polygons are described by thesame boundary rings, rules are required to resolve the ambiguity around whichof the two polygons is described by a given WKT string.

When you load WKT and WKB strings from files or by using streaming ingestion,geospatial analytics assumes the polygons in the input are oriented as follows:If you traverse the boundary of the polygon in the order of the inputvertices, the interior of the polygon is on the left. Geospatial analytics usesthe same rule when exporting geography objects to WKT and WKB strings.

If you use theST_GEOGFROMTEXTfunction to convert a WKT string to aGEOGRAPHY value, theorientedparameter specifies how the function determines the polygon:

  • FALSE: Interpret the input as the polygon with the smaller area. This is thedefault behavior.

  • TRUE: Use the left-hand orientation rule described previously. This optionallows you to load polygons with an area larger than a hemisphere.

Because GeoJSON strings are defined on a planar map, the orientation can bedetermined without ambiguity, even if the input does not follow the orientationrule defined in the GeoJSON format specification,RFC 7946.

Handling improperly formatted spatial data

When you load spatial data from other tools into BigQuery, youmight encounter conversion errors due to invalid WKT or GeoJSON data. Forexample, an error such asEdge K has duplicate vertex with edge N indicatesthat the polygon has duplicate vertices (besides the first and last).

To avoid formatting issues, you can use a function that generatesstandards-compliant output. For example, when you export data from PostGIS, youcan use the PostGISST_MakeValid function to standardize the output.Alternatively, import your data as text and then convert it by callingST_GEOGFROMTEXTorST_GEOGFROMGEOJSONwith themake_valid parameter. Whenmake_valid isTRUE, these functionsattempt to repair invalid polygons.

To find or to ignore the improperly formatted data, use theSAFE functionprefix to output the problematic data. For example, the following queryuses theSAFE prefix to retrieve improperly formatted spatial data.

SELECTgeojsonASbad_geojsonFROMmytableWHEREgeojsonISNOTNULLANDSAFE.ST_GEOGFROMGEOJSON(geojson)ISNULL

Constraints

Geospatial analytics does not support the following features in geospatialformats:

  • Three-dimensional geometries. This includes the "Z" suffix in the WKTformat, and the altitude coordinate in the GeoJSON format.
  • Linear reference systems. This includes the "M" suffix in WKT format.
  • WKT geometry objects other than geometry primitives or multipart geometries.In particular, geospatial analytics supports only Point, MultiPoint,LineString, MultiLineString, Polygon, MultiPolygon, and GeometryCollection.

SeeST_GEOGFROMGEOJSONandST_GEOGFROMTEXTfor constraints specific to GeoJson and WKT input formats.

Integrate geospatial raster data with Google Earth Engine

Geospatial insights are often presented as grid-based, orraster, data. Rasterdata organizes regionally continuous data, such as satellite imagery, weatherforecasts, and land cover, into a grid of pixels. AlthoughBigQuery primarily specializes in tabular vector data,representing features with boundaries and points, you can integrate raster datainto your BigQuery analyses by using theST_REGIONSTATS function.This function uses Earth Engine, Google's raster analysisplatform, to perform computations and aggregations on raster datafor enhanced geospatial analysis. For more information, seeWork with raster data.

For information about exporting Earth Engine data toBigQuery, seeExporting to BigQuery.For more information about integrations between Earth Engine andBigQuery, seeBigQuery integrationin the Earth Engine documentation.

Transforming geospatial data

If your table contains separate columns for longitude and latitude, you cantransform the values into geographies by using GoogleSQLgeography functionssuch asST_GEOGPOINT.For example, if you have twoDOUBLE columns for longitude and latitude, youcan create a geography column with the following query:

SELECT*,ST_GEOGPOINT(longitude,latitude)ASgFROMmytable

BigQuery can convert WKT and GeoJSON strings to geography types.If your data is in another format such as Shapefiles, use an external tool toconvert the data to a supported input file format, such as a CSV file, withGEOGRAPHY columns encoded as WKT or GeoJSON strings.

Partitioning and clustering geospatial data

You canpartition andcluster tables that containGEOGRAPHYcolumns. You can use aGEOGRAPHY column as a clustering column, but you cannotuse aGEOGRAPHY column as a partitioning column.

If you storeGEOGRAPHY data in a table and your queries filter data by using aspatial predicate, ensure that the table is clustered by theGEOGRAPHY column.This typically improves query performance and might reduce cost. A spatialpredicate calls a boolean geography function and has aGEOGRAPHY column as oneof the arguments. The following sample shows a spatial predicate that uses theST_DWITHIN function:

WHEREST_DWITHIN(geo,ST_GeogPoint(longitude,latitude),100)

Using JOINs with spatial data

Spatial JOINs are joins of two tables with a predicate geographic function intheWHERE clause. For example:

-- how many stations within 1 mile range of each zip code?SELECTzip_codeASzip,ANY_VALUE(zip_code_geom)ASpolygon,COUNT(*)ASbike_stationsFROM`bigquery-public-data.new_york.citibike_stations`ASbike_stations,`bigquery-public-data.geo_us_boundaries.zip_codes`ASzip_codesWHEREST_DWITHIN(zip_codes.zip_code_geom,ST_GEOGPOINT(bike_stations.longitude,bike_stations.latitude),1609.34)GROUPBYzipORDERBYbike_stationsDESC

Spatial joins perform better when your geography data is persisted. The exampleabove creates the geography values in the query. It is more performant to storethe geography values in a BigQuery table.

For example, the following query retrieves longitude, latitude pairs andconverts them to geographic points. When you run this query, you specify a newdestination table to store the query results:

SELECT*,ST_GEOGPOINT(pLongitude,pLatitude)ASpFROMmytable

BigQuery implements optimized spatial JOINs for INNER JOIN andCROSS JOIN operators with the following GoogleSQL predicate functions:

Spatial joins are not optimized:

  • ForLEFT,RIGHT orFULL OUTER joins
  • In cases involving anti-joins
  • When the spatial predicate is negated

AJOIN that uses theST_DWITHIN predicate is optimized only whenthe distance parameter is a constant expression.

Exporting spatial data

When you export spatial data from BigQuery,GEOGRAPHY columnvalues are always formatted as WKT strings. To export data in GeoJSON format,use theST_ASGEOJSONfunction.

If the tools you're using to analyze the exported data don't understand theGEOGRAPHY data type, you can convert the column values to strings using ageographic function such asST_ASTEXTorST_ASGEOJSON.Geospatial analytics adds additional points to the line where necessary so thatthe converted sequence of edges remains within 10 meters of the originalgeodesic line.

For example, the following query usesST_ASGEOJSON to convert GeoJSON valuesto strings.

SELECTST_ASGEOJSON(ST_MAKELINE(ST_GEOGPOINT(1,1),ST_GEOGPOINT(3,2)))

The resulting data would look like the following:

{"type":"LineString","coordinates":[[1,1],[1.99977145571783,1.50022838764041],[2.49981908082299,1.75018082434274],[3,2]]}

The GeoJSON line has two additional points. Geospatial analytics addsthese points so that the GeoJSON line closely follows the same path on theground as the original line.

What's next

Except as otherwise noted, the content of this page is licensed under theCreative Commons Attribution 4.0 License, and code samples are licensed under theApache 2.0 License. For details, see theGoogle Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2025-12-15 UTC.