dggridR.R
You want to do spatial statistics, and it’s going to involvebinning.
Binning with a rectangular grid introduces messy distortions. At themacro-scale using a rectangular grid does things like making Greenlandbigger than the United States and Antarctica the largest continent.
But this kind of distortion is present no matter what the resolutionis.
What you want are bins of equal size, regardless of where they are onthe globe, regardless of their resolution.
dggridR solves this problem.
dggridR builds discrete global grids which partition the surface ofthe Earth into hexagonal, triangular, or diamond cells,all ofwhich have the same size. (There are some minor details whichare detailed in theCaveats section below.)
This package includes everything you need to make spatial binninggreat again.
Many details are included in the vignette.
The following grids are available:
Unless you are using cells with very large areas (significantfractions of Earth’s hemispheres), I recommend the ISEA3H be yourdefault grid.
This grid, along with the other Icosahedral grids ensures that allcells are of equal area, with a notable exception. At every resolution,the Icosahedral grids contain 12 pentagonal cells which each have anarea exactly 5/6 that of the hexagonal cells. But you don’t need toworry about this too much for two reasons. (1) As the table below shows,these cells are a small, small minority of the total number of cells.(2) The grids are orientated so that these cells are in out-of-the-wayplaces. Future versions of this package will allow you to reorient thegrids, if need be. (TODO)
For more complex applications than simple spatial binning, it isnecessary to consider trade-offs between the different grids. Goodreferences for understanding these include(Kimerling et al. 1999; Gregory et al.2008).
Users attempting multi-scale analyses should be aware that in thehexagonal grids cells from one resolution level are partially containedby the cells of other levels.
At present, there is no convenient way to convert grid cell ids atone resolution level to another. In the future, I hope to add thiscapability to the package. (TODO)
The following table shows the number of cells, their area, andstatistics regarding the spacing of their center nodes for the ISEA3Hgrid type.
| Res | Number of Cells | Cell Area (km^2) | Min | Max | Mean | Std |
|---|---|---|---|---|---|---|
| 0 | 12 | 51,006,562.17241 | ||||
| 1 | 32 | 17,002,187.39080 | 4,156.18000 | 4,649.10000 | 4,320.49000 | 233.01400 |
| 2 | 92 | 5,667,395.79693 | 2,324.81000 | 2,692.72000 | 2,539.69000 | 139.33400 |
| 3 | 272 | 1,889,131.93231 | 1,363.56000 | 1,652.27000 | 1,480.02000 | 89.39030 |
| 4 | 812 | 629,710.64410 | 756.96100 | 914.27200 | 855.41900 | 52.14810 |
| 5 | 2,432 | 209,903.54803 | 453.74800 | 559.23900 | 494.95900 | 29.81910 |
| 6 | 7,292 | 69,967.84934 | 248.80400 | 310.69300 | 285.65200 | 17.84470 |
| 7 | 21,872 | 23,322.61645 | 151.22100 | 187.55000 | 165.05800 | 9.98178 |
| 8 | 65,612 | 7,774.20548 | 82.31100 | 104.47000 | 95.26360 | 6.00035 |
| 9 | 196,832 | 2,591.40183 | 50.40600 | 63.00970 | 55.02260 | 3.33072 |
| 10 | 590,492 | 863.80061 | 27.33230 | 35.01970 | 31.75960 | 2.00618 |
| 11 | 1,771,472 | 287.93354 | 16.80190 | 21.09020 | 18.34100 | 1.11045 |
| 12 | 5,314,412 | 95.97785 | 9.09368 | 11.70610 | 10.58710 | 0.66942 |
| 13 | 15,943,232 | 31.99262 | 5.60065 | 7.04462 | 6.11367 | 0.37016 |
| 14 | 47,829,692 | 10.66421 | 3.02847 | 3.90742 | 3.52911 | 0.22322 |
| 15 | 143,489,072 | 3.55473 | 1.86688 | 2.35058 | 2.03789 | 0.12339 |
| 16 | 430,467,212 | 1.18491 | 1.00904 | 1.30335 | 1.17638 | 0.07442 |
| 17 | 1,291,401,632 | 0.39497 | 0.62229 | 0.78391 | 0.67930 | 0.04113 |
| 18 | 3,874,204,892 | 0.13166 | 0.33628 | 0.43459 | 0.39213 | 0.02481 |
| 19 | 11,622,614,672 | 0.04389 | 0.20743 | 0.26137 | 0.22643 | 0.01371 |
| 20 | 34,867,844,012 | 0.01463 | 0.11208 | 0.14489 | 0.13071 | 0.00827 |
Construct a discrete global grid system (dggs) object usingdgconstruct()
Get information about your dggs object using:
dggetres()dginfo()dgmaxcell()Get the grid cells of some lat-long points with:
dgGEO_to_SEQNUM()Get the boundaries of the associated grid cells for use inplotting with:
dgcellstogrid()dgearthgrid()dgrectgrid()dgshptogrid()Check that your dggs object is valid (if you’ve mucked with it)using:
dgverify()The following example demonstrates converting lat-long locations (theepicenters of earthquakes) to discrete global grid locations (cellnumbers), binning based on these numbers, and plotting the result.Additionally, the example demonstrates how to get the center coordinatesof the cells.
#Include librarieslibrary(dggridR)library(collapse)#Construct a global grid with cells approximately 1000 miles acrossdggs<-dgconstruct(spacing=1000,metric=FALSE,resround='down')#Load included test data setdata(dgquakes)#Get the corresponding grid cells for each earthquake epicenter (lat-long pair)dgquakes$cell<-dgGEO_to_SEQNUM(dggs,dgquakes$lon,dgquakes$lat)$seqnum#Converting SEQNUM to GEO gives the center coordinates of the cellscellcenters<-dgSEQNUM_to_GEO(dggs,dgquakes$cell)#Get the number of earthquakes in each cellquakecounts<- dgquakes|>fcount(cell,name ="count")#Get the grid cell boundaries for cells which had quakesgrid<-dgcellstogrid(dggs,quakecounts$cell)#Update the grid cells' properties to include the number of earthquakes#in each cellgrid<-merge(grid,quakecounts,by.x="seqnum",by.y="cell")#Make adjustments so the output is more visually interestinggrid$count<-log(grid$count)cutoff<-fquantile(grid$count,0.9)grid<- grid|>fmutate(count =ifelse(count>cutoff,cutoff,count))#Get polygons for each country of the worldcountries<-map_data("world")dggridR.R
Okay, let’s draw the plot. Notice how the hexagons appear to be alldifferent sizes. Really, though, they’re not: that’s just the effect oftrying to plot a sphere on a flat surface! And that’s what would happento your data if you didn’t use this package :-)
#Plot everything on a flat map# Handle cells that cross 180 degreeswrapped_grid=st_wrap_dateline(grid,options =c("WRAPDATELINE=YES","DATELINEOFFSET=180"),quiet =TRUE)## Warning in CPL_wrap_dateline(st_geometry(x), options, quiet): GDAL Error 1:## IllegalArgumentException: Points of LinearRing do not form a closed linestring## Warning in CPL_wrap_dateline(st_geometry(x), options, quiet): GDAL Error 1:## IllegalArgumentException: Points of LinearRing do not form a closed linestringggplot()+geom_polygon(data=countries,aes(x=long,y=lat,group=group),fill=NA,color="black")+geom_sf (data=wrapped_grid,aes(fill=count),color=alpha("white",0.4))+geom_point (aes(x=cellcenters$lon_deg,y=cellcenters$lat_deg))+scale_fill_gradient(low="blue",high="red")dggridR.R
dggridR.R
–>
You can also write out a KML file with your data included fordisplaying in, say, Google Earth:
library(sf)#Get the grid cell boundaries for the whole Earth using this dggs in a form#suitable for printing to a KML filegrid<-dgearthgrid(dggs)#Update the grid cells' properties to include the number of earthquakes#in each cellgrid$count<-merge(grid, quakecounts,by.x="seqnum",by.y="cell",all.x=TRUE)#Write out the gridst_write(grid,"quakes_per_cell.kml",layer="quakes",driver="KML")dggridR.R
Say you want to sampleN areas of equal size uniformlydistributed on the Earth. dggridR provides two possible ways toaccomplish this. The conceptually simplest is to chooseNuniformly distributed lat-long pairs and retrieve their associated gridcells:
#Include librarieslibrary(dggridR)N<-100#How many cells to sample#Distribute the points uniformly on a sphere using equations from#http://mathworld.wolfram.com/SpherePointPicking.htmlu<-runif(N)v<-runif(N)theta<-2*pi*u*180/piphi<-acos(2*v-1)*180/pilon<- theta-180lat<- phi-90df<-data.frame(lat=lat,lon=lon)#Construct a global grid in which every hexagonal cell has an area of#100,000 miles^2. You could, of course, choose a much smaller value, but these#will show up when I map them later.#Note: Cells can only have certain areas, the `dgconstruct()` function below#will tell you which area is closest to the one you want. You can also round#up or down.#Note: 12 cells are actually pentagons with an area 5/6 that of the hexagons#But, with millions and millions of hexes, you are unlikely to choose one#Future versions of the package will make it easier to reject the pentagonsdggs<-dgconstruct(area=100000,metric=FALSE,resround='nearest')#Get the corresponding grid cells for each randomly chosen lat-longdf$cell<-dgGEO_to_SEQNUM(dggs,df$lon,df$lat)$seqnum#Get the hexes for each of these cellsgridfilename<-dgcellstogrid(dggs,df$cell)dggridR.R
The resulting distribution of cells appears as follows:
#Get the grid in a more convenient formatgrid<-dgcellstogrid(dggs,df$cell)grid<-st_wrap_dateline(grid,options =c("WRAPDATELINE=YES","DATELINEOFFSET=180"),quiet =TRUE)#Get polygons for each country of the worldcountries<-map_data("world")#Plot everything on a flat mapp<-ggplot()+geom_polygon(data=countries,aes(x=long,y=lat,group=group),fill=NA,color="black")+geom_sf(data=grid,fill=alpha("green",alpha=0.4),color=alpha("white",alpha=0.4))pdggridR.R
Say you want to sampleN areas of equal size uniformlydistributed on the Earth. dggridR provides two possible ways toaccomplish this. The easiest way to do this is to note that grid cellsare labeled from 1 toM, whereM is thelargest cell id at the resolution in question. Therefore, we can samplecell ids and generate a grid accordingly.
#Include librarieslibrary(dggridR)N<-100#How many cells to sample#Construct a global grid in which every hexagonal cell has an area of#100,000 miles^2. You could, of course, choose a much smaller value, but these#will show up when I map them later.#Note: Cells can only have certain areas, the `dgconstruct()` function below#will tell you which area is closest to the one you want. You can also round#up or down.#Note: 12 cells are actually pentagons with an area 5/6 that of the hexagons#But, with millions and millions of hexes, you are unlikely to choose one#Future versions of the package will make it easier to reject the pentagonsdggs<-dgconstruct(area=100000,metric=FALSE,resround='nearest')maxcell<-dgmaxcell(dggs)#Get maximum cell idcells<-sample(1:maxcell, N,replace=FALSE)#Choose random cellsgrid<-dgcellstogrid(dggs,cells)#Get griddggridR.R
The resulting distribution of cells appears as follows:
#Get the grid in a more convenient formatgrid<-dgcellstogrid(dggs,df$cell)grid<-st_wrap_dateline(grid,options =c("WRAPDATELINE=YES","DATELINEOFFSET=180"),quiet =TRUE)#Get polygons for each country of the worldcountries<-map_data("world")#Plot everything on a flat mapp<-ggplot()+geom_polygon(data=countries,aes(x=long,y=lat,group=group),fill=NA,color="black")+geom_sf(data=grid,fill=alpha("green",0.4),color=alpha("white",0.4))pdggridR.R
Sometimes you want to use a grid in software other than R. Tofacilitate this, the grid generation commands include thesavegrid argument, as demonstrated below.
library(dggridR)#Generate a global grid whose cells are ~100,000 miles^2dggs<-dgconstruct(area=100000,metric=FALSE,resround='nearest')#Save the cells to a KML file for use in other softwaregridfilename<-dgearthgrid(dggs,savegrid=tempfile())dggridR.R
library(dggridR)#Generate a dggs specifying an intercell spacing of ~25 milesdggs<-dgconstruct(spacing=100,metric=FALSE,resround='nearest')#Read in the South Africa's borders from the shapefilesa_border<-st_read(dg_shpfname_south_africa(),layer="ZAF_adm0")st_crs(sa_border)=4326#Get a grid covering South Africasa_grid<-dgshptogrid(dggs,dg_shpfname_south_africa())#Plot South Africa's borders and the associated gridp<-ggplot()+geom_sf(data=sa_border,fill=NA,color="black")+geom_sf(data=sa_grid,fill=alpha("blue",0.4),color=alpha("white",0.4))pdggridR.R
At every resolution, the Icosahedral grids contain 12 pentagonalcells which each have an area exactly 5/6 that of the hexagonal cells.In the standard orientation, these are located as follows (scaled to asize corresponding to the grid resolution):
dggridR.R
Method to convert between grid cell ids at differentresolutions
In the future, I plan to switch the package from using KevinSahr’s dggrid software to the discrete global grid system standardscurrently being developed by OpenGeospatial. Those standards are beingdeveloped by asoftware workinggroup right now, but will one day be released. At that point, Iexpect that GDAL/OGR/PROJ4 will incorporate the new standards makingwider interoperability possible. Until that time, Sahr’s dggrid is thebest option I’ve found.
This R package was developed by Richard Barnes (https://rbarnes.org/).
The dggrid conversion software was developed predominantly by KevinSahr (https://discreteglobal.wpengine.com/), withcontributions from a few others.
Large portions of the above documentation are drawn from the DGGRIDversion 6.2b User Documentation, which is available in its entiretyhere.
This packageshould operate in the manner described here, inthe package’s main documentation, and in Kevin Sahr’s dggriddocumentation. Unfortunately, none of us are paid enough to makeabsolutely, doggone certain that that’s the case. Use at your owndiscretion. That said, if you find bugs or are seeking enhancements, wewant to hear about them.
Please cite this package as:
Richard Barnes and Kevin Sahr (2017). dggridR: Discrete Global Gridsfor R. R package version 2.0.4. “https://github.com/r-barnes/dggridR/”