| Title: | Air Quality Evaluation |
| Version: | 0.6.2 |
| Date: | 2025-07-12 |
| Description: | Developed for use by those tasked with the routine detection, characterisation and quantification of discrete changes in air quality time-series, such as identifying the impacts of air quality policy interventions. The main functions use signal isolation then break-point/segment (BP/S) methods based on 'strucchange' and 'segmented' methods to detect and quantify change events (Ropkins & Tate, 2021, <doi:10.1016/j.scitotenv.2020.142374>). |
| Maintainer: | Karl Ropkins <k.ropkins@its.leeds.ac.uk> |
| License: | GPL (≥ 3) |
| Encoding: | UTF-8 |
| LazyData: | true |
| Depends: | R (≥ 3.5.0) |
| Imports: | openair, dplyr, loa, ggplot2, strucchange, segmented, mgcv, tidyr, lubridate, purrr, ggtext, stats, data.table |
| RoxygenNote: | 7.3.2 |
| URL: | https://github.com/karlropkins/AQEval,https://karlropkins.github.io/AQEval/ |
| BugReports: | https://github.com/karlropkins/AQEval/issues |
| Suggests: | tinytest |
| NeedsCompilation: | no |
| Packaged: | 2025-07-12 14:20:02 UTC; trakradmin |
| Author: | Karl Ropkins |
| Repository: | CRAN |
| Date/Publication: | 2025-07-14 19:10:18 UTC |
Air Quality Evaluation
Description
R AQEval: R code for the analysis of discrete change in Air Quality time-series.
AQEval
AQEval was developed for use by those tasked with the routine detection, characterisation and quantification of discrete changes in air quality time-series.
The main functions, quantBreakPoints and quantBreakSegments, use break-point/segment (BP/S) methods based on the consecutive use of methods in the strucchange and segmented R packages to first detect (as break-points) and then characterise and quantify (as segments) discrete changes in air-quality time-series.
AQEval functions adopt an openair-friendly approach, using function and data structures that many in the air quality research community are already familiar with. Most notably, most functions expect supplied data to be time-series, to be supplied as a single data.frame (or similar R object), and for time-series to be identified by column names. The main functions are typically structured to expect first the data.frame, then the name of the pollutant to be used, then other arguments:
function(data, "pollutant.name", ...)
output <- function(data, "pollutant.name", ...)
Author(s)
Karl Ropkins
References
Ropkins, K. and Tate, J.E., 2021. Early observations on the impact of the COVID-19 lockdown on air quality trends across the UK. Science of the Total Environment, 754, p.142374. https://doi.org/10.1016/j.scitotenv.2020.142374
Ropkins, K., Tate, J.E., Walker, A. and Clark, T., 2022. Measuring the impact of air quality related interventions. Environmental Science: Atmospheres, 2(3), pp.500-516. https://doi.org/10.1039/d1ea00073j
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AQEval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722
Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
For more about data structure and an example data set, see AQEval.data
For more about the main functions, see quantBreakPoints and quantBreakSegments
AQEval Example data
Description
Data packaged with AQEval for use with example code.
Usage
aq.data
Format
(26280x6) 'tbl_df' objects
- date
Time-series of POSIX class date and time records.
- no2
Time-series of nitrogen dioxide measurements from local site.
- bg.no2
Time-series of nitrogen dioxide measurements from nearby background site.
- ws
Time-series of local wind speed measurements.
- wd
Time-series of local wind direction measurements.
- air_temp
Time-series of local air temperature measurements.
Details
Most of the functions in AQEval adopt the openair convention of assuming supplied data is a single data.frame or similar. The data frame was initially adopted for two reasons:
Firstly, air quality data is collected and archived in numerous formats, and keeping the import requirements simple minimises the frustrations associated with data importation.
Secondly, restricting the user to work with a single data format greatly simplifies data management for those less familiar with programming environments.
As part of this work several openair coding conventions were adopted, most importantly that data sets should include a column named date of POSIX class date-and-time-stamps (DateTimeClasses). This and other conventions, such as the use of ws and wd for numeric wind speed and direction data-series, and site and code for character or factor monitoring site name and identifier code, are now commonplace for many working with R in the air quality research community, and many air quality archives provide data in (or support import functions that convert their own data structures to) this openair-friendly structure.
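The conventions described above can be illustrated with a small hand-built data set. This is only a sketch: the values are simulated, and the site name and code are hypothetical placeholders rather than anything taken from a real archive.

```r
# Illustrative data only: values are simulated, not real measurements.
set.seed(42)
n <- 48
my.data <- data.frame(
  # 'date' should be a POSIX class date-and-time-stamp
  date = seq(as.POSIXct("2020-01-01 00:00", tz = "UTC"),
             by = "hour", length.out = n),
  no2  = round(runif(n, 10, 60), 1),   # pollutant time-series
  ws   = round(runif(n, 0.5, 12), 1),  # numeric wind speed
  wd   = round(runif(n, 0, 360)),      # numeric wind direction
  site = "Example Site",               # hypothetical site name
  code = "EX1",                        # hypothetical site code
  stringsAsFactors = FALSE
)

# quick structure checks before passing the data to an analysis function
stopifnot(inherits(my.data$date, "POSIXct"),
          is.numeric(my.data$ws), is.numeric(my.data$wd))
str(my.data)
```

A data set shaped like this should be usable wherever a function expects a single data.frame with time-series identified by column name.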
Source
Air quality and meteorological data packaged for use with AQEval Examples.
Time-series sources:
- date
Date-and-time-stamp of POSIX class (DateTimeClasses).
- no2
Nitrogen dioxide downloaded from King's College London Archive using importKCL function in openair.
- bg.no2
Nitrogen dioxide downloaded from the Automatic Urban and Rural Network Archive using importAURN function in openair.
- ws, wd, air_temp
Wind speed, wind direction and air temperature downloaded from NOAA's Integrated Surface Database using importNOAA function in worldmet.
References
Regarding openair and openair-friendly data structuring, see:
Carslaw, D. C. and K. Ropkins (2012), openair: an R package for air quality data analysis. Environmental Modelling & Software. Volume 27-28, 52-61, DOI doi:10.1016/j.envsoft.2011.09.008
Ropkins, K. and D.C. Carslaw (2012), openair - Data Analysis Tools for the Air Quality Community. R Journal, 4(1). URL https://journal.r-project.org/archive/2012/RJ-2012-003/RJ-2012-003.pdf
Regarding worldmet, see:
David Carslaw (2021), worldmet: Import Surface Meteorological Data from NOAA Integrated Surface Database (ISD). R package version 0.9.5. URL https://CRAN.R-project.org/package=worldmet
See Also
openair: functions importAURN and importKCL
worldmet: function importNOAA (See References)
Examples
#data set used in AQEval Examples
dim(aq.data)
head(aq.data)
with(aq.data, plot(date, no2, type="l"))
aqeval.generics
Description
Generic functions for use with aqe object class for AQEval outputs.
Usage
## S3 method for class 'aqe'
print(x, ...)
## S3 method for class 'aqe'
plot(x, ...)
Arguments
x | the |
... | additional arguments, typically passed on to next method or ignored. |
Some functions to calculate statistics
Description
Calculate data set statistics for selected time intervals.
Usage
calcDateRangeStat(
  data,
  from = NULL,
  to = NULL,
  stat = NULL,
  pollutant = NULL,
  ...,
  method = 2
)
calcRollingDateRangeStat(
  data,
  range = "year",
  res = "day",
  stat = NULL,
  pollutant = NULL,
  from = NULL,
  to = NULL,
  ...,
  method = 2
)
Arguments
data | (data.frame, tibble, etc) Data set containing data statistic to be calculated for, and |
from | (various) Start date(s) to subsample from when calculating statistic, by default end of supplied |
to | (various) End date(s) to subsample to when calculating statistic, by default end of supplied |
stat | (function) Statistic to be applied to selected data, by default |
pollutant | (character) The name(s) of data-series to analyse in |
... | extra arguments. |
method | (numeric) Method to use when calculating statistic. Currently 1 (using base R), 2 (using dplyr), 3 (using data.table), and 4 (using dplyr and purrr) |
range | (character) For |
res | (character) For |
Value
These functions return data.frames of function outputs.
Note
These functions are in development and likely to change significantly in future versions; please handle with care.
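As a rough picture of what a date-range statistic involves, the sketch below subsets a data set between two dates and applies a statistic to one column, using base R only. The helper name dateRangeStat, its defaults (full date range, mean) and its output columns are assumptions made for illustration; this is not the package implementation.

```r
# Hypothetical helper: NOT the AQEval implementation, just the idea in base R.
dateRangeStat <- function(data, from = NULL, to = NULL,
                          stat = mean, pollutant = "no2", ...) {
  # assumed defaults: use the full date range of the supplied data
  if (is.null(from)) from <- min(data$date, na.rm = TRUE)
  if (is.null(to))   to   <- max(data$date, na.rm = TRUE)
  from <- as.POSIXct(from, tz = "UTC")
  to   <- as.POSIXct(to, tz = "UTC")
  # keep records inside the requested (inclusive) date range
  keep <- !is.na(data$date) & data$date >= from & data$date <= to
  data.frame(from = from, to = to, pollutant = pollutant,
             value = stat(data[[pollutant]][keep], ...))
}

# simulated hourly data: ten days, values 1 to 240
demo <- data.frame(
  date = seq(as.POSIXct("2020-01-01", tz = "UTC"), by = "hour",
             length.out = 240),
  no2 = 1:240
)
full.range <- dateRangeStat(demo, pollutant = "no2")
one.day    <- dateRangeStat(demo, from = "2020-01-02", to = "2020-01-03",
                            pollutant = "no2", na.rm = TRUE)
one.day
```

A rolling version could call the same helper repeatedly over a sequence of from/to pairs and row-bind the results.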
find and test break-points
Description
Finding and testing break-points in conventionally formatted air quality data sets.
Usage
findBreakPoints(data, pollutant, h = 0.15, ...)
testBreakPoints(data, pollutant, breaks, ...)
Arguments
data | Data source, typically a |
pollutant | Name of time-series, assumed to be a column in |
h | ( |
... | other parameters |
breaks | ( |
Details
findBreakPoints uses methods from the strucchange package (see references), with modifications suggested by the main author of strucchange to handle missing cases, to find potential break-points in a supplied time-series.
testBreakPoints tests and identifies the most likely break-points using methods proposed for use with quantBreakPoints and quantBreakSegments and conventionally formatted air quality data sets.
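To give a feel for break-point detection, here is a deliberately simplified stand-in: a least-squares search for a single break in the mean, written in base R. It is not the strucchange algorithm (which handles multiple breaks, regression models and confidence intervals), and the function name findOneBreak is hypothetical. The argument h is treated as the minimum segment size expressed as a fraction of the series length, which is how strucchange defines it; missing values are dropped before fitting and the result is mapped back to the original index.

```r
# Simplified stand-in for illustration: one break in the mean, found by
# least squares. 'strucchange' handles multiple breaks and much more.
findOneBreak <- function(y, h = 0.15) {
  ok <- which(!is.na(y))            # drop missing cases, keep original index
  z  <- y[ok]
  n  <- length(z)
  m  <- max(2, floor(h * n))        # minimum segment size
  cand <- m:(n - m)                 # candidate end points of first segment
  rss <- vapply(cand, function(k) {
    sum((z[1:k] - mean(z[1:k]))^2) +
      sum((z[(k + 1):n] - mean(z[(k + 1):n]))^2)
  }, numeric(1))
  ok[cand[which.min(rss)]]          # position in the original series
}

# simulated series with a step change after observation 100
set.seed(1)
y <- c(rnorm(100, mean = 40, sd = 3), rnorm(100, mean = 30, sd = 3))
y[c(20, 150)] <- NA                 # a couple of missing values
brk <- findOneBreak(y, h = 0.15)
brk
```

With a step this large relative to the noise, the reported position should fall close to the true change at observation 100.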
Value
findBreakPoints returns a data.frame of found break-points.
testBreakPoints returns a likely break-point/segment report.
References
Regarding strucchange methods see breakpoints, and:
Achim Zeileis, Friedrich Leisch, Kurt Hornik and Christian Kleiber (2002). strucchange: An R Package for Testing for Structural Change in Linear Regression Models. Journal of Statistical Software, 7(2), 1-38. URL https://www.jstatsoft.org/v07/i02/.
Achim Zeileis, Christian Kleiber, Walter Kraemer and Kurt Hornik (2003). Testing and Dating of Structural Changes in Practice. Computational Statistics & Data Analysis, 44, 109-123.
Regarding missing data handling, see:
URL: https://stackoverflow.com/questions/43243548/strucchange-not-reporting-breakdates.
Regarding testBreakPoints, see:
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AQEval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722
Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
find nearby sites
Description
Function to find nearest locations in a reference by latitude and longitude.
Usage
findNearLatLon(lat, lon = NULL, nmax = 10, ..., ref = NULL, units = "m")
findNearSites(
  lat,
  lon,
  pollutant = "no2",
  site.type = "rural background",
  nmax = 10,
  ...,
  ref = NULL,
  units = "m"
)
Arguments
lat,lon | (numeric) The supplied latitude and longitude. |
nmax | (numeric) The maximum number of nearest sites to report, by default 10. |
... | Other parameters, mostly ignored. |
ref | ( |
units | (character) The units to use when reporting distances to near locations; current options m. |
pollutant | (character) For |
site.type | (character) For |
Details
If investigating air quality in a particular location, for example a UK Clean Air Zone (https://www.gov.uk/guidance/driving-in-a-clean-air-zone), you may wish to locate an appropriate rural background air quality monitoring station. findNearSites locates air quality monitoring sites with openly available data such as that available from the UK AURN network (https://uk-air.defra.gov.uk/networks/network-info?view=aurn).
Value
find.near returns a data.frame of near site metadata.
Note
This function uses the haversine formula to account for the Earth's surface curvature, and uses 6371 km as the radius of the Earth.
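The haversine calculation mentioned in the Note can be sketched in a few lines of base R. The function name haversine and the small ref table of site coordinates are hypothetical; only the formula and the 6371 km radius come from the text above.

```r
# Great-circle distance (km) between points given in decimal degrees.
haversine <- function(lat1, lon1, lat2, lon2, r = 6371) {
  to.rad <- pi / 180
  dlat <- (lat2 - lat1) * to.rad
  dlon <- (lon2 - lon1) * to.rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to.rad) * cos(lat2 * to.rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))   # pmin guards against rounding above 1
}

# hypothetical reference table of site locations
ref <- data.frame(site = c("A", "B", "C"),
                  lat  = c(50.1, 51.5, 53.8),
                  lon  = c(-1.2, -0.1, -1.5))

# distances in metres from latitude 50, longitude -1
ref$dist.m <- haversine(50, -1, ref$lat, ref$lon) * 1000

# report the nmax nearest locations
nmax <- 2
nearest <- ref[order(ref$dist.m), ][seq_len(min(nmax, nrow(ref))), ]
nearest
```

Sorting by the computed distance and keeping the first nmax rows mirrors the "nearest sites" idea described for these functions.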
Examples
#find rural background NO2 monitoring sites
#near latitude = 50, longitude = -1
#not run: requires internet
## Not run: 
findNearSites(lat = 50, lon = -1)
## End(Not run)
isolateContribution
Description
Environmental time-series signal processing: Contribution isolation based on background subtraction, deseasonalisation and/or deweathering.
Usage
isolateContribution(
  data,
  pollutant,
  background = NULL,
  deseason = TRUE,
  deweather = TRUE,
  method = 2,
  add.term = NULL,
  formula = NULL,
  use.bam = FALSE,
  output = "mean",
  ...
)
Arguments
data | Data source, typically |
pollutant | The column name of the |
background | (optional) if supplied, the background time-series to use as a background correction. See below. |
deseason | logical or character vector, if |
deweather | logical or character vector, if |
method | numeric, contribution isolation method (default 2). See Note. |
add.term | extra terms to add to the contribution isolation model; ignore for now (in development). |
formula | (optional) Signal isolation model formula; this allows user to set the signal isolation model formula directly, but means function arguments |
use.bam | (logical) If TRUE, the |
output | output options; currently, |
... | other arguments; ignore for now (in development) |
Details
isolateContribution estimates and subtracts pollutant variance associated with factors that may hinder break-point/segment analysis:
Background Correction: If applied, this fits the supplied background time-series as a spline term: s(background).
Seasonality: If applied, this fits regular frequency terms, e.g. day.hour, year.day, as spline terms; the default TRUE is equivalent to s(day.hour) and s(year.day). All terms are calculated from the date column in data.
Weather: If applied, this fits time-series of identified meteorological measurements, e.g. wind speed and direction (ws and wd in data). If both ws and wd are present these are fitted as a tensor term te(ws, wd). Other deweathering terms, if included, are fitted as spline terms, s(term). The default TRUE is equivalent to te(ws, wd).
Using the supplied arguments, it builds a signal isolation (mgcv) GAM model, and calculates and returns the mean-centred residuals as an estimate of the isolated local contribution.
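The model structure described above can be sketched directly with mgcv on simulated data. This is an illustration under stated assumptions, not the function's internals: the data are synthetic, the derived columns day.hour and year.day are built here by hand, and re-centring the residuals on the original mean is one reading of "mean-centred residuals".

```r
library(mgcv)

# simulated hourly data (illustrative only, not real measurements)
set.seed(1)
n <- 2000
d <- data.frame(
  date   = seq(as.POSIXct("2020-01-01", tz = "UTC"), by = "hour",
               length.out = n),
  ws     = runif(n, 0.5, 10),
  wd     = runif(n, 0, 360),
  bg.no2 = rgamma(n, shape = 5, rate = 0.25)
)
# regular frequency terms derived from the date column
d$day.hour <- as.numeric(format(d$date, "%H"))
d$year.day <- as.numeric(format(d$date, "%j"))
# pollutant = background + diurnal cycle + wind speed effect + noise
d$no2 <- 20 + 0.8 * d$bg.no2 + 5 * sin(2 * pi * d$day.hour / 24) -
  1.2 * d$ws + rnorm(n, sd = 2)

# background as a spline, seasonality as splines, weather as a tensor term
mod <- gam(no2 ~ s(bg.no2) + s(day.hour) + s(year.day) + te(ws, wd),
           data = d)

# assumption: residuals re-centred on the mean of the original series
d$iso.no2 <- residuals(mod, type = "response") + mean(d$no2)
summary(d[, c("no2", "iso.no2")])
```

The isolated series keeps the original mean but has much of the background, seasonal and weather-related variance removed, which is the property that should make later break-point/segment analysis easier.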
Value
isolateContribution returns a vector of predictions of the pollutant time-series after the requested signal isolation.
Note
method was included as part of method development and testing work, and is retained for now. Please ignore for now.
Author(s)
Karl Ropkins
References
Regarding mgcv GAM fitting methods, see Wood (2017) for general introduction and package documentation regarding coding (mgcv):
Wood, S.N. (2017) Generalized Additive Models: an introduction with R (2nd edition), Chapman and Hall/CRC.
Regarding isolateContribution, see:
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AQEval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722
Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
Regarding seasonal terms and frequency analysis, see also stl and spectralFrequency.
Examples
#fitting a simple deseasonalisation, deweathering
#and background correction (dswb) model to no2:
aq.data$dswb.no2 <- isolateContribution(aq.data, "no2", background="bg.no2")
#compare at 14 day resolution:
temp <- openair::timeAverage(aq.data, "14 day")
#without dswb
quantBreakPoints(temp, "no2", test=FALSE, h=0.1)
#with dswb
quantBreakPoints(temp, "dswb.no2", test=FALSE, h=0.1)
Other Air Quality Models
Description
Other packaged Air Quality Models.
Usage
fitNearSiteModel(data, pollutant = "no2", y, x = "rest", elements = NULL, ...)
Arguments
data |
|
pollutant | The name of the |
y | The name of the monitor site to be modelled, assumed to be one of several names in the |
x | The other sites to use when building the model, the default 'rest' uses all supplied sites except 'y'. |
elements | The number of inputs to use in the site models, can be any number up to length of x or combination thereof; by default this is set as |
... | extra arguments. |
Details
fitNearSiteModel builds an air quality model for one location using air quality data from nearby sites.
Value
data with model output added as an additional column.
quantify break-point/segments
Description
Quantify changes in pollutant time-series using either break-point or break-segment methods.
Usage
quantBreakPoints(
  data,
  pollutant,
  breaks,
  ylab = NULL,
  xlab = NULL,
  pt.col = c("lightgrey", "darkgrey"),
  line.col = "red",
  break.col = "blue",
  event = NULL,
  show = c("plot", "report"),
  ...
)
quantBreakSegments(
  data,
  pollutant,
  breaks,
  ylab = NULL,
  xlab = NULL,
  pt.col = c("lightgrey", "darkgrey"),
  line.col = "red",
  break.col = "blue",
  event = NULL,
  seg.method = 2,
  seg.seed = 12345,
  show = c("plot", "report"),
  ...
)
Arguments
data | Data source, typically a data.frame or similar, containing data-series to model and a paired time-stamp data-series, named date. |
pollutant | The name of the data-series to break-point or break-segment model. |
breaks | (Optional) The break-points and confidence intervals to use when building either break-point or break-segment models. If not supplied these are built using |
ylab | Y-label term, by default pollutant. |
xlab | X-label term, by default date. |
pt.col | Point fill and line colours for plot, defaults lightgrey and darkgrey. |
line.col | Line colour for plot, default red. |
break.col | Break-point/segment colour for plot, default blue. |
event | An optional list of plot terms for an event marker, applied to a vertical line and text label. List items include: |
show | What to show before returning the break-point quantification model, by default plot and report. |
... | other parameters |
seg.method | ( |
seg.seed | ( |
Details
quantBreakPoints and quantBreakSegments both use strucchange methods to identify potential break-points in time-series, and then quantify these as conventional break-points or break-segments, respectively:
Finding Break-points: Using the strucchange methods of Zeileis and colleagues and an independent change detection model, the functions apply a rolling-window approach, assuming the first window (or data subset) is without change, building a statistical model of that, advancing the window, building a second model and comparing these, and so on, to identify the most likely points of change in a larger data-series. See also findBreakPoints.
Quantifying Break-points: Using the supplied break-points to build a break-point model.
Quantifying Break-segments: Using the confidence regions for the supplied break-points as the starting points to build a break-segment model.
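One simple way to picture the "Quantifying Break-points" step is as a step change in the mean: once break positions are known, each segment is summarised and consecutive segments are compared. The base R sketch below does only that. The helper name quantSegments, the convention that each break marks the last observation before a change, and the reported columns are all assumptions for illustration; the package's own break-point and break-segment models may differ.

```r
# Hypothetical helper: summarise segment means for known break positions.
quantSegments <- function(y, breaks) {
  # assumption: each break is the index of the last observation
  # before a change, so segment 2 starts at breaks[1] + 1, and so on
  seg <- findInterval(seq_along(y), breaks + 1) + 1
  means <- as.numeric(tapply(y, seg, mean, na.rm = TRUE))
  data.frame(
    segment = seq_along(means),
    mean    = means,
    change  = c(NA, diff(means)),                          # absolute change
    percent = c(NA, 100 * diff(means) / head(means, -1))   # relative change
  )
}

# noise-free example: a drop from 40 to 30 after observation 100
y <- c(rep(40, 100), rep(30, 100))
seg.report <- quantSegments(y, breaks = 100)
seg.report
```

A break-segment analogue would replace the abrupt step with a sloped transition region between the two levels, starting from the confidence region around each break.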
Value
Both functions use the show argument to control which elements of the function outputs are shown, but also invisibly return a list of all outputs, which can be caught using, e.g.:
brk.mod <- quantBreakPoints(data, pollutant)
Note
AQEval function quantBreakSegments is currently running segmented v.1.3-4 while we evaluate the latest version, v.1.4-0.
Author(s)
Karl Ropkins
References
Regarding strucchange methods see in-package documentation, e.g. breakpoints, and:
Achim Zeileis, Friedrich Leisch, Kurt Hornik and Christian Kleiber (2002). strucchange: An R Package for Testing for Structural Change in Linear Regression Models. Journal of Statistical Software, 7(2), 1-38. URL https://www.jstatsoft.org/v07/i02/.
Achim Zeileis, Christian Kleiber, Walter Kraemer and Kurt Hornik (2003). Testing and Dating of Structural Changes in Practice. Computational Statistics & Data Analysis, 44, 109-123. DOI doi:10.1016/S0167-9473(03)00030-6.
Regarding segmented methods see in-package documentation, e.g. segmented, and:
Vito M. R. Muggeo (2003). Estimating regression models with unknown break-points. Statistics in Medicine, 22, 3055-3071. DOI 10.1002/sim.1545.
Vito M. R. Muggeo (2008). segmented: an R Package to Fit Regression Models with Broken-Line Relationships. R News, 8/1, 20-25. URL https://cran.r-project.org/doc/Rnews/.
Vito M. R. Muggeo (2016). Testing with a nuisance parameter present only under the alternative: a score-based approach with application to segmented modelling. J of Statistical Computation and Simulation, 86, 3059-3067. DOI 10.1080/00949655.2016.1149855.
Vito M. R. Muggeo (2017). Interval estimation for the breakpoint in segmented regression: a smoothed score-based approach. Australian & New Zealand Journal of Statistics, 59, 311-322. DOI 10.1111/anzs.12200.
Regarding break-points/segment methods, see:
Ropkins, K., Walker, A., Philips, I., Rushton, C., Clark, T. and Tate, J., Change Detection of Air Quality Time-Series Using the R Package AQEval. Available at SSRN 4267722. https://ssrn.com/abstract=4267722 or http://dx.doi.org/10.2139/ssrn.4267722
Also at: https://karlropkins.github.io/AQEval/articles/AQEval_Intro_Preprint.pdf
See Also
timeAverage in openair, breakpoints in strucchange, and segmented in segmented.
Examples
#using openair timeAverage to convert 1-hour data to 1-day averages
temp <- openair::timeAverage(aq.data, "1 day")
#break-points
quantBreakPoints(temp, "no2", h=0.3)
#break-segments
quantBreakSegments(temp, "no2", h=0.3)
#additional examples (not run)
## Not run: 
#in-call plot modification
#removing x axis label
#recolouring break line and
#adding an event marker
quantBreakPoints(temp, "no2", h=0.3, xlab="", break.col = "red",
                 event=list(label="Event expected here",
                            x="2002-08-01", col="grey"))
## End(Not run)
Spectral Analysis
Description
Time-series spectral frequency analysis.
Usage
spectralFrequency(data, pollutant, ...)
Arguments
data |
|
pollutant | The name of the time-series, typically pollutant measurements, to be analysed. |
... | extra arguments. |
Details
spectralFrequency produces a time frequency analysis of the requested pollutant.
Value
spectralFrequency uses the show argument to control which elements of the function outputs are shown, but also invisibly returns a list of all outputs, which can be caught using, e.g.:
sfa.mod <- spectralFrequency(data, pollutant)
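For readers unfamiliar with spectral frequency analysis, the general idea can be sketched with the periodogram function in base R's stats package. This is not how spectralFrequency itself is implemented (the source does not say); it only shows how a regular cycle in an hourly series appears as a peak in the spectrum. The data are simulated with a known 24-hour cycle.

```r
# simulated hourly series: 60 days with a 24-hour cycle plus noise
set.seed(7)
n <- 24 * 60
hour <- seq_len(n) - 1
x <- 30 + 8 * sin(2 * pi * hour / 24) + rnorm(n, sd = 3)

# raw periodogram; frequencies are in cycles per observation (per hour)
sp <- stats::spec.pgram(x, taper = 0.1, detrend = TRUE, plot = FALSE)

# convert the dominant frequency to a period in hours
peak.period <- 1 / sp$freq[which.max(sp$spec)]
peak.period
```

Dominant periods found this way (for example daily or annual cycles) are the kind of regular frequency terms that deseasonalisation aims to remove.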
Examples
spectralFrequency(aq.data, "no2")