NotificationsYou must be signed in to change notification settings
Fork4
Star18

Commit7adaacc

committed

Add zonal-statistics v0.1

1 parentbaa175a commit7adaaccCopy full SHA for 7adaacc

File tree

6 files changed

+279

-0

lines changed

inst/reports/national/zonal-statistics

6 files changed

+279

-0

lines changed

`‎inst/reports/national/zonal-statistics/NEWS.md‎`

Lines changed: 2 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,2 @@`
	`1`	`+#zonal-statistics 0.1 (2025-08-05)`
	`2`	`+- A new report for calculating zonal statistics with exactextract using raster data and polygons as zones.`

`‎inst/reports/national/zonal-statistics/README.md‎`

Lines changed: 65 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,65 @@`
	`1`	`+#Zonal Statistics Report`
	`2`	`+`
	`3`	+This report calculates zonal statistics for specified polygons using raster data. The script leverages the`terra` and`exactextractr` packages to perform spatial operations and extract statistics.
	`4`	`+`
	`5`	`+##Prerequisites`
	`6`	`+`
	`7`	`+Ensure you have the following R packages installed:`
	`8`	`+`
	`9`	+-`terra`
	`10`	`+`
	`11`	+-`exactextractr`
	`12`	`+`
	`13`	`+You can install them using:`
	`14`	+```r
	`15`	`+reportSetup("national/zonal-statistics")`
	`16`	+```
	`17`	`+`
	`18`	`+##Script Overview`
	`19`	`+`
	`20`	`+###Configuration`
	`21`	`+`
	`22`	`+-mu.dsn: Path to the parent folder of the shapefile (SHP), without a trailing forward slash.`
	`23`	`+-mu.layer: Name of the SHP file, without the file extension.`
	`24`	`+-mu.col: Column name in the SHP file to be used for grouping (e.g., 'MUKEY', 'MUSYM').`
	`25`	`+-mu.set: Vector of symbols of interest to subset the polygons.`
	`26`	`+-raster.list: Nested list of raster data categorized as continuous, categorical, or circular.`
	`27`	`+-FUN: Aggregation method to be used (e.g., "quantile").`
	`28`	`+-ARGS: Additional arguments for the aggregation function.`
	`29`	+-BY_POLYGON: Boolean indicating whether to apply aggregation to each polygon (`TRUE`) or each group (`FALSE`).
	`30`	`+`
	`31`	`+###Script Steps`
	`32`	`+`
	`33`	`+1.Load and Subset Polygons:`
	`34`	`+- Load the polygons from the specified vector data file (e.g. Shapefile)`
	`35`	`+- Subset the polygons based on the symbols of interest`
	`36`	`+`
	`37`	`+2.Create SpatRasters:`
	`38`	+- Convert the raster file paths into`SpatRaster` objects
	`39`	`+`
	`40`	`+3.Aggregate Polygons:`
	`41`	`+- Aggregate the polygons if group statistics are required`
	`42`	`+`
	`43`	`+4.Prepare Spatial Object and Attributes:`
	`44`	+- Convert the polygons to an`sf` object
	`45`	`+- Calculate the area of each polygon in acres`
	`46`	`+`
	`47`	`+5.Extract and Apply Aggregation Function:`
	`48`	`+- Extract statistics from the raster data using the specified aggregation function`
	`49`	`+- Combine the results with the polygon attributes`
	`50`	`+`
	`51`	`+###Output`
	`52`	`+`
	`53`	+The script outputs a data frame with the extracted statistics for each polygon or group, depending on the`BY_POLYGON` parameter.
	`54`	`+`
	`55`	`+##Usage`
	`56`	`+`
	`57`	+1. Run`soilReports::reportInit("national/zonal-statistics", outputDir = "zonal-test")` to install a report instance in`outputDir`. Specify`overwrite` argument as needed. Use`reportUpdate` if an existing older report instance is being updated and you want to preserve_config.R_ contents.
	`58`	`+`
	`59`	+2. Navigate to your report instance in`outputDir` (e.g.`"zonal-test"` directory) and inspect folder contents.
	`60`	`+`
	`61`	`+3. Open_config.R_ and set the paths to data sources and other configuration`
	`62`	`+`
	`63`	+4. Open_report.Rmd_ in RStudio and click "Knit" button, or`render()` with {rmarkdown} manually.
	`64`	`+`
	`65`	+5. View HTML report output, and tabular data files generated in`"output"` subfolder

`‎inst/reports/national/zonal-statistics/config.R‎`

Lines changed: 33 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,33 @@`
	`1`	`+# config.R`
	`2`	`+`
	`3`	`+# path to polygon source`
	`4`	`+# could be e.g. parent folder of SHP, or path to file geodatabase or geopackage`
	`5`	`+MU_DSN<-'D:/RGISS/S26_SWFS_LTSD.gdb'`
	`6`	`+`
	`7`	`+# layer name (without file extension)`
	`8`	`+MU_LAYER<-'MUPOLYGON'`
	`9`	`+`
	`10`	`+# could be 'MUKEY', 'MUSYM', or any column name in MU_LAYER`
	`11`	`+MU_COL<-'MUSYM'`
	`12`	`+`
	`13`	`+# symbols of interest (used to subset MU_LAYER using MU_COL)`
	`14`	`+MU_SET<- c("201","205","212","213")`
	`15`	`+`
	`16`	`+# raster data (path to CSV file that contains columns: variable, path and digits)`
	`17`	`+RASTER_DATA_FILE<-"raster-data.csv"`
	`18`	`+`
	`19`	`+# aggregation methods (see ?exactextractr::exact_extract for details)`
	`20`	`+# FUN <- c("mean", "stdev")`
	`21`	`+# FUN <- "quantile"`
	`22`	`+FUN<- c("mean","stdev","min","quantile","max")`
	`23`	`+`
	`24`	`+# aggregation additional arguments`
	`25`	`+ARGS<-list(quantiles= c(0.1,0.5,0.9))`
	`26`	`+`
	`27`	`+# apply aggregation to each polygon (TRUE) or each mu.col group (FALSE)?`
	`28`	`+BY_POLYGON<-FALSE`
	`29`	`+`
	`30`	`+# default output directory and file name`
	`31`	`+OUTPUT_DIR<-"output"`
	`32`	`+OUTPUT_BASENAME<- paste0("zonal-stats-",MU_LAYER,"-",MU_COL,"-", as.numeric(Sys.time()))`
	`33`	`+OUTPUT_REPORT_NAME<- paste0(OUTPUT_BASENAME,".html")`

`‎inst/reports/national/zonal-statistics/raster-data.csv‎`

Lines changed: 8 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,8 @@`
	`1`	`+"variable","path","digits"`
	`2`	`+"Mean Annual Air Temperature (degrees C)","D:/Geodata/project_data/MUSum_PRISM/final_MAAT_800m.tif",1`
	`3`	`+"Mean Annual Precipitation (mm)","D:/Geodata/project_data/MUSum_PRISM/final_MAP_mm_800m.tif",0`
	`4`	`+"Effective Precipitation (mm)","D:/Geodata/project_data/MUSum_PRISM/effective_precipitation_800m.tif",0`
	`5`	`+"Frost-Free Days","D:/Geodata/project_data/MUSum_PRISM/ffd_50_pct_800m.tif",0`
	`6`	`+"Growing Degree Days (degrees C)","D:/Geodata/project_data/MUSum_PRISM/gdd_mean_800m.tif",0`
	`7`	`+"Elevation (m)","D:/workspace/S26-SWFS-C5/raster/covariates/genelev_2_30m.tif",0`
	`8`	`+"Slope Gradient (%)","D:/workspace/S26-SWFS-C5/raster/covariates/slope_2_30m.tif",1`

`‎inst/reports/national/zonal-statistics/report.Rmd‎`

Lines changed: 143 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,143 @@`
	`1`	`+---`
	`2`	`+title:"Zonal Statistics Report"`
	`3`	`+output:html_document`
	`4`	`+knit:(function(inputFile, encoding) {`
	`5`	`+source("config.R");`
	`6`	`+rmarkdown::render(`
	`7`	`+inputFile,`
	`8`	`+encoding = encoding,`
	`9`	`+output_file = OUTPUT_REPORT_NAME`
	`10`	`+)})`
	`11`	`+---`
	`12`	`+`
	`13`	+```{r setup, include = FALSE}
	`14`	`+knitr::opts_chunk$set(echo = FALSE)`
	`15`	`+`
	`16`	`+library(terra)`
	`17`	`+library(exactextractr)`
	`18`	`+`
	`19`	`+source("config.R")`
	`20`	`+`
	`21`	`+if (!file.exists(RASTER_DATA_FILE)) {`
	`22`	`+ stop("File '", RASTER_DATA_FILE, "' does not exist! This should be a table with three columns describing the (formatted) variable name, the data source name of the file, and the number of digits to display after the decimal point in aggregate results.")`
	`23`	`+}`
	`24`	+```
	`25`	`+`
	`26`	`+This report calculates user-specified[percentiles](https://en.wikipedia.org/wiki/Percentile) for raster data within polygons.`
	`27`	`+`
	`28`	+Percentiles are an excellent way to describe range and central tendency for a dataset. Spatially-weighted percentiles, via the`exactextractr::exact_extract()` method, allow us to account for fractional overlap of polygons and individual raster cells.
	`29`	`+`
	`30`	+```{r calculate-statistics}
	`31`	`+# load and subset polygons`
	`32`	`+mupolygon <- terra::vect(MU_DSN, MU_LAYER)`
	`33`	`+mupolygon.sub <- mupolygon[which(mupolygon[[MU_COL]][[1]] %in% as.character(MU_SET)), ]`
	`34`	`+mupolygon.sub$Acres <- terra::expanse(mupolygon.sub) / 4086.46`
	`35`	`+`
	`36`	`+raster.data <- read.csv(RASTER_DATA_FILE)`
	`37`	`+`
	`38`	`+if (!"path" %in% colnames(raster.data)) {`
	`39`	`+ stop("File '", RASTER_DATA_FILE, "' must specify a column called 'path' containing the file path or GDAL data source specification for a raster source")`
	`40`	`+}`
	`41`	`+`
	`42`	`+if (!"variable" %in% colnames(raster.data)) {`
	`43`	`+ stop("File '", RASTER_DATA_FILE, "' must specify a column called 'variable' containing formatted names for each raster source")`
	`44`	`+}`
	`45`	`+`
	`46`	`+if (!"digits" %in% colnames(raster.data)) {`
	`47`	`+ stop("File '", RASTER_DATA_FILE, "' must specify a column called 'digits' containing the number of digits after the decimal place to use for rounding aggregate outputs")`
	`48`	`+}`
	`49`	`+`
	`50`	`+# create spatrasters`
	`51`	`+spatraster.list <- lapply(raster.data$path, terra::rast)`
	`52`	`+`
	`53`	`+# this aggregation is only needed if you want group statistics`
	`54`	`+if (!BY_POLYGON) {`
	`55`	`+ mupolygon.group <- terra::aggregate(mupolygon.sub, MU_COL, fun = "sum")`
	`56`	`+} else {`
	`57`	`+ mupolygon.group <- mupolygon.sub`
	`58`	`+}`
	`59`	`+`
	`60`	`+# prepare sf object and final attributes`
	`61`	`+mupolygon.sf <- sf::st_as_sf(mupolygon.group)`
	`62`	`+mupolygon.attr <- as.data.frame(mupolygon.sf)[MU_COL]`
	`63`	`+`
	`64`	`+# name cleaning function (TODO: cover more edge cases)`
	`65`	`+makeName <- function(x) {`
	`66`	`+ gsub("[^a-z0-9 ]", "_", make.names(tolower(x), unique = TRUE))`
	`67`	`+}`
	`68`	`+`
	`69`	`+# extract and apply aggregation function`
	`70`	`+sampling.time <- system.time({`
	`71`	`+ res <- lapply(seq_len(length(spatraster.list)), function(i) {`
	`72`	`+ r <- spatraster.list[[i]]`
	`73`	`+ ressub <- do.call(exactextractr::exact_extract, c(list(`
	`74`	`+ x = r,`
	`75`	`+ y = sf::st_transform(mupolygon.sf, terra::crs(r)),`
	`76`	`+ fun = FUN,`
	`77`	`+ progress = FALSE,`
	`78`	`+ force_df = TRUE,`
	`79`	`+ weights = 'area' # TODO: allow custom weight raster?`
	`80`	`+ ), ARGS))`
	`81`	`+ d <- raster.data[i, 'digits']`
	`82`	`+ if (!is.na(d)) {`
	`83`	`+ ressub <- round(ressub, d)`
	`84`	`+ }`
	`85`	`+ colnames(ressub) <- paste0(makeName(names(r)), "_", names(ressub))`
	`86`	`+ cbind(mupolygon.attr, ressub)`
	`87`	`+ })`
	`88`	`+})`
	`89`	`+names(res) <- raster.data$variable`
	`90`	`+`
	`91`	+mupolygon.attr$`Polygon Count` <- aggregate(seq_len(nrow(mupolygon.sub)), by = list(mupolygon.sub[[MU_COL]][[1]]), length)$x
	`92`	+mupolygon.attr$`Total Acres` <- round(mupolygon.group$sum_Acres)
	`93`	`+`
	`94`	`+poly_acres <- aggregate(mupolygon.sub$Acres, by = list(mupolygon.sub[[MU_COL]][[1]]),`
	`95`	`+ function(x) setNames(round(quantile(x)), paste0("Q", c(0, 25, 50, 75, 100))))`
	`96`	`+colnames(poly_acres)[[1]] <- MU_COL`
	`97`	`+colnames(poly_acres[[2]]) <- paste("Acres", colnames(poly_acres[[2]]))`
	`98`	`+poly_summary <- cbind(mupolygon.attr, poly_acres[[2]])`
	`99`	+```
	`100`	`+`
	`101`	`+##Input Data Summary`
	`102`	`+`
	`103`	+Input polygon boundary groups from '`r MU_DSN`' '`r MU_LAYER`' are as follows:
	`104`	`+`
	`105`	+```{r polygon-table}
	`106`	`+knitr::kable(poly_summary, caption = "Polygon Area Summary")`
	`107`	+```
	`108`	`+`
	`109`	`+Rasters were summarized from the following data sources:`
	`110`	`+`
	`111`	+```{r raster-table}
	`112`	`+knitr::kable(raster.data, caption = "Raster Data Sources")`
	`113`	+```
	`114`	`+`
	`115`	+Completed raster extraction from`r sum(poly_summary[["Polygon Count"]])` polygons grouped by by "`r MU_COL`" in`r structure(sampling.time[3], class = "difftime", units = "secs")` seconds.
	`116`	`+`
	`117`	`+##Results`
	`118`	`+`
	`119`	+```{r display-html-tables, results='asis'}
	`120`	`+for (n in names(res))`
	`121`	`+ print(knitr::kable(res[[n]], caption = paste("Zonal Statistics of", n, "by", MU_COL)))`
	`122`	+```
	`123`	`+`
	`124`	+```{r write-wide-output, results='hide'}
	`125`	`+# create wide format output`
	`126`	`+.merge <- \(x, y) merge(x, y, by = c("MUSYM", "Acres"))`
	`127`	`+res2 <- Reduce(merge, res)`
	`128`	`+`
	`129`	`+## too wide for html format`
	`130`	`+# knitr::kable(res2, caption = paste("Zonal Statistics by", MU_COL))`
	`131`	`+`
	`132`	`+dir.create(OUTPUT_DIR, showWarnings = FALSE, recursive = TRUE)`
	`133`	`+outfile <- normalizePath(file.path(OUTPUT_DIR, paste0(OUTPUT_BASENAME)),`
	`134`	`+ winslash = "/",`
	`135`	`+ mustWork = FALSE)`
	`136`	`+foo <- write.csv(res2, paste0(outfile, ".csv"), row.names = FALSE)`
	`137`	`+vct <- terra::writeVector(cbind(mupolygon.group, res2), paste0(outfile, ".gpkg"))`
	`138`	+```
	`139`	`+`
	`140`	`+Wide-format output written to file:`
	`141`	`+`
	`142`	+-`r paste0(outfile, ".csv")`
	`143`	+-`r paste0(outfile, ".gpkg")`

`‎inst/reports/national/zonal-statistics/setup.R‎`

Lines changed: 28 additions & 0 deletions

Original file line number	Diff line number	Diff line change
`@@ -0,0 +1,28 @@`
	`1`	`+##`
	`2`	`+## Internal setup file for report --`
	`3`	+## change values as needed but ensure all `.variable.names` are defined for each report
	`4`	`+##`
	`5`	`+`
	`6`	`+## packages from CRAN`
	`7`	`+.packages.to.get<- c('terra','sf','exactextractr')`
	`8`	`+`
	`9`	`+## packages from GitHub, installed with no dependencies`
	`10`	`+.gh.packages.to.get<- c()`
	`11`	`+`
	`12`	+# name of report (matches parent folder name; e.g. `inst/reports/templates/minimal`)
	`13`	`+.report.name<-'zonal-statistics'`
	`14`	`+`
	`15`	`+# version of report`
	`16`	`+.report.version<-'0.1'`
	`17`	`+`
	`18`	+# brief description for `soilReports::listReports()`
	`19`	`+.report.description<-'Calculate zonal statistics for raster data, using polygons and unique symbols as zones'`
	`20`	`+`
	`21`	`+# these are the files to copy on initial installation with copyReport/reportInit`
	`22`	`+.paths.to.copy<- c('report.Rmd','config.R','raster-data.csv','NEWS.md')`
	`23`	`+`
	`24`	`+# these are the files to copy on reportUpdate`
	`25`	`+.update.paths.to.copy<- c('report.Rmd','NEWS.md')`
	`26`	`+`
	`27`	+# this is a flag to denote whether `shiny.Rmd` is main entrypoint to report
	`28`	`+.has.shiny.interface<-FALSE`

0 commit comments

Comments

(0)

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Commit7adaacc

File tree

6 files changed

6 files changed

`‎inst/reports/national/zonal-statistics/NEWS.md‎`

`‎inst/reports/national/zonal-statistics/README.md‎`

`‎inst/reports/national/zonal-statistics/config.R‎`

`‎inst/reports/national/zonal-statistics/raster-data.csv‎`

`‎inst/reports/national/zonal-statistics/report.Rmd‎`

`‎inst/reports/national/zonal-statistics/setup.R‎`

0 commit comments