| Type: | Package |
| Title: | Exploring Water Quality Monitoring Data |
| Encoding: | UTF-8 |
| Version: | 1.0.3 |
| Maintainer: | Jemma Stachelek <jemma.stachelek@gmail.com> |
| URL: | https://github.com/jsta/wql |
| BugReports: | https://github.com/jsta/wql/issues |
| Description: | Functions to assist in the processing and exploration of data from environmental monitoring programs. The package name stands for "water quality" and reflects the original focus on time series data for physical and chemical properties of water, as well as the biota. Intended for programs that sample approximately monthly, quarterly or annually at discrete stations, a feature of many legacy data sets. Most of the functions should be useful for analysis of similar-frequency time series regardless of the subject matter. |
| Depends: | R (≥ 3.0.0) |
| Imports: | graphics, grDevices, methods, stats, ggplot2 (≥ 1.0),reshape2, zoo |
| LazyData: | yes |
| License: | GPL-2 |
| VignetteBuilder: | knitr |
| NeedsCompilation: | no |
| RoxygenNote: | 7.3.2 |
| Suggests: | knitr, rmarkdown |
| Packaged: | 2025-09-02 12:32:30 UTC; jsta |
| Author: | Alan Jassby [aut], James Cloern [ctb], Jemma Stachelek [ctb, cre] |
| Repository: | CRAN |
| Date/Publication: | 2025-09-02 13:10:02 UTC |
The main purpose ofwql is to explore seasonal time series throughplots and nonparametric trend tests. It was created originally to examinewater quality data sets (hence, “wql”) but is suitable as a moregeneral purpose set of tools for looking at annual or seasonal time series.
Description
One of the more tedious tasks in exploring environmental data sets iscreating usable time series from the original complex data sets, especiallywhen you want many series at will that group data in different ways. Sowql also provides a way of transforming data sets to a common formatthat then allows a diversity of time series to be created quickly. A fewfunctions are specific to the fields of limnology and oceanography.
Details
The plots are designed for easy use, not for publication-quality graphs.Nonetheless, extensive customization is possible by passing options through...{}, adding annotations in the case of base graphics, and addinglayers in the case ofggplot2 objects.
Two functions are used mainly for preparing the times series:
a function that transforms incoming data to a common datastructure in the form of the
WqDataclassa function thateasily prepares time series objects from this class
TheWqData class can be easily adapted to non-aquatic data.Obviously, thedepth field can be used for elevation in atmosphericstudies. But more generally, thesite anddepth fields can beused for many two-way classifications and don't need to refer to spatiallocation.
Some of the time series functions include:
a variety of plots to examine changes in seasonal patterns
nonparametric trend tests
time series interpolation and relatedmanipulations
a simple decomposition of a series into different timescales
phenological analyses
the use of empirical orthogonalfunctions to detect multiple independent mechanisms underlying temporalchange
A few functions are specialized for the aquatic sciences:
converting between oxygen concentrations and percentsaturation
converting between salinity and conductivity
The capabilities ofwql are more fully explained in the accompanyingvignette: “wql: Exploring environmental monitoring data”.
Author(s)
Maintainer: Jemma Stachelekjemma.stachelek@gmail.com [contributor]
Authors:
Alan Jassby
Other contributors:
James Cloern [contributor]
See Also
Useful links:
Class "DateTime"
Description
A class union of"Date" and"POSIXct" classes.
Objects from the Class
A virtual Class: No objects may be createdfrom it.
See Also
Examples
showClass("DateTime")R2pss
Description
R2pss
Usage
R2pss(R, t, p = 0)Arguments
R | conductivity ratio, dimensionless |
t | temperature, Celsius |
p | gauge pressure, decibar |
Author(s)
Alan Jassby, James Cloern
Class WqData
Description
A simple extension or subclass of the"data.frame" class for typical“discrete” water quality monitoring programs that examine phenomenaon a time scale of days or longer. It requires water quality data to be in aspecific “long” format, although a generating functionwqData can be used for different forms of data.
Objects from the Class
Objects can be created by calls of the formnew("WqData", d), whered is a data.frame.d shouldhave columns namedtime, site, depth, variable, value of class"DateTime", "factor", "numeric", "factor", "numeric", respectively.
See Also
DateTime-class,tsMake,WqData-method,wqData
Examples
showClass("WqData")# Construct the WqData object sfb as shown in the wqData examples.sfb <- wqData(sfbay, c(1, 3, 4), 5:12, site.order = TRUE, type = "wide", time.format = "%m/%d/%Y")# Summarize the datasummary(sfb)# Create boxplot summary of dataplot(sfb, vars = c("chl", "dox", "spm"), num.col = 2)# Extract some of the data as a WqData objectsfb[1:10, ] # first 10 observationssfb[sfb$depth == 20, ] # all observations at 20 mdate2decyear
Description
date2decyear
Usage
date2decyear(w)Arguments
w | date |
Author(s)
Alan Jassby, James Cloern
Decompose a time series
Description
The function decomposes a time series into a long-term mean, annual,seasonal and "events" component. The decomposition can be multiplicative oradditive, and based on median or mean centering.
Usage
decompTs( x, event = TRUE, type = c("mult", "add"), center = c("median", "mean"))Arguments
x | a monthly time series vector |
event | whether or not an "events" component should be determined |
type | the type of decomposition, either multiplicative ("mult") oradditive ("add") |
center | the method of centering, either median or mean |
Details
The rationale for this simple approach to decomposing a time series, withexamples of its application, is given by Cloern and Jassby (2010). It ismotivated by the observation that many important events for estuaries (e.g.,persistent dry periods, species invasions) start or stop suddenly. Smoothingto extract the annualized term, which can disguise the timing of theseevents and make analysis of them unnecessarily difficult, is not used.
A multiplicative decomposition will typically be useful for a biologicalcommunity- or population-related variable (e.g., chlorophyll-a) thatexperiences exponential changes in time and is approximately lognormal,whereas an additive decomposition is more suitable for a normal variable.The default centering method is the median, especially appropriate forseries that have large, infrequent events.
Ifevent = TRUE, the seasonal component represents a recurringmonthly pattern and the events component a residual series. Otherwise, theseasonal component becomes the residual series. The latter is appropriatewhen seasonal patterns change systematically over time. You can useplotSeason andseasonTrend to investigate theway seasonality changes.
Value
A monthly time series matrix with the following individual timeseries:
original | original time series |
annual | annual meanseries |
seasonal | repeating seasonal component |
events | optionally, the residual or "events" series |
Author(s)
Alan Jassby, James Cloern
References
Cloern, J.E. and Jassby, A.D. (2010) Patterns and scales ofphytoplankton variability in estuarine-coastal ecosystems.Estuariesand Coasts33, 230–241.
See Also
Examples
# Apply the function to a single series (Station 27) and plot it:y <- decompTs(sfbayChla[, 's27'])yplot(y, nc=1, main="")decyear2date
Description
decyear2date
Usage
decyear2date(x)Arguments
x | date |
Author(s)
Alan Jassby, James Cloern
Convert conductivity to salinity
Description
Electrical conductivity data are converted to salinity using the PracticalSalinity Scale and an extension for salinities below 2.
Usage
ec2pss(ec, t, p = 0)Arguments
ec | conductivity, mS/cm |
t | temperature, Celsius |
p | gauge pressure, decibar |
Details
ec2pss converts electrical conductivity data to salinity using thePractical Salinity Scale 1978 in the range of 2-42 (Fofonoff and Millard1983). Salinities below 2 are calculated using the extension of thePractical Salinity Scale (Hill et al. 1986).
R2pss is the same function, except that conductivity ratios ratherthan conductivities are used as input.
Value
ec2pss andR2pss both return salinity values on thePractical Salinity Scale.
Note
Input pressures are not absolute pressures but rather gauge pressures.Gauge pressures are measured relative to 1 standard atmosphere, so the gaugepressure at the surface is 0.
Author(s)
Alan Jassby, James Cloern
References
Fofonoff N.P. and Millard Jr R.C. (1983)Algorithms forComputation of Fundamental Properties of Seawater. UNESCO Technical Papersin Marine Science 44. UNESCO, Paris, 53 p.
Hill K.D., Dauphinee T.M. and Woods D.J. (1986) The extension of thePractical Salinity Scale 1978 to low salinities.IEEE Journal ofOceanic Engineering11, 109-112.
Examples
# Check values from Fofonoff and Millard (1983):R = c(1, 1.2, 0.65) t = c(15, 20, 5)p = c(0, 2000, 1500)R2pss(R, t, p) # 35.000 37.246 27.995# Repeat calculation with equivalent conductivity values by setting # ec <- R * C(35, 15, 0):ec = c(1, 1.2, 0.65) * 42.9140ec2pss(ec, t, p) # same resultsEmpirical orthogonal function analysis
Description
Finds and rotates empirical orthogonal functions (EOFs).
Usage
eof(x, n, scale. = TRUE)Arguments
x | a data frame or matrix, with no missing values |
n | number of EOFs to retain for rotation |
scale. | logical indicating whether the (centered) variables should bescaled to have unit variance |
Details
EOF analysis is used to study patterns of variability (“modes”) in amatrix time series and how these patterns change with time(“amplitude time series”). Hannachi et al. (2007) give a detaileddiscussion of this exploratory approach with emphasis on meteorologicaldata. In oceanography and climatology, the time series representobservations at different spatial locations (columns) over time (rows). Butcolumns can also be seasons of the year (Jassby et al. 1999) or even acombination of seasons and depth layers (Jassby et al. 1990). EOF analysisuses the same techniques as principal component analysis, but the timeseries are observations of the same variable in the same units. Scaling thedata is optional, but it is the default here.
Eigenvectors (unscaled EOFs) and corresponding eigenvalues (amount ofexplained variance) are found by singular value decomposition of thecentered and (optionally) scaled data matrix usingprcomp. Inorder to facilitate a physical interpretation of the variability modes, asubset consisting of then most important EOFs is rotated (Richman1986).eofNum can be used to help choosen. Hannachi etal. (2007) recommend orthogonal rotation of EOFs scaled by the square rootof the corresponding eigenvalues to avoid possible computation problems andreduce sensitivity to the choice ofn. We follow this recommendationhere, using thevarimax method for the orthogonal rotation.
Note that the signs of the EOFs are arbitrary.
Value
A list with the following members:
REOF | a matrix with rotatedEOFs |
amplitude | a matrix with amplitude time series ofREOFs |
eigen.pct | all eigenvalues of correlation matrix aspercent of total variance |
variance | variance explained by retainedEOFs |
Author(s)
Alan Jassby, James Cloern
References
Hannachi, A., Jolliffe, I.T., and Stephenson, D.B. (2007)Empirical orthogonal functions and related techniques in atmosphericscience: A review.International Journal of Climatology27,1119–1152.
Jassby, A.D., Powell, T.M., and Goldman, C.R. (1990) Interannualfluctuations in primary production: Direct physical effects and the trophiccascade at Castle Lake, California (USA).Limnology and Oceanography35, 1021–1038.
Jassby, A.D., Goldman, C.R., Reuter, J.E., and Richards, R.C. (1999) Originsand scale dependence of temporal variability in the transparency of LakeTahoe, California-Nevada.Limnology and Oceanography44,282–294.
Richman, M. (1986) Rotation of principal components.Journal ofClimatology6, 293–335.
See Also
Examples
# Create an annual matrix time serieschla1 <- aggregate(sfbayChla, 1, mean, na.rm = TRUE)chla1 <- chla1[, 1:12] # remove stations with missing years# eofNum (see examples) suggests n = 1eof(chla1, 1)Plot EOF percent variance
Description
Plots the variances associated with empirical orthogonal functions (EOF).Useful for deciding how many EOFs to retain for rotation.
Usage
eofNum(x, n = nrow(x), scale. = TRUE)Arguments
x | a data frame or matrix, with no missing values |
n | effective sample size |
scale. | logical indicating whether the (centered) variables should bescaled to have unit variance |
Details
Calculates the eigenvalues from an EOF analysis, as described ineof. The eigenvalues are plotted against eigenvalue number(sometimes called a “scree plot”), and the cumulative variance as %of total is plotted over each eigenvalue. The approximate 0.95 confidencelimits are depicted for each eigenvalue using North et al.'s (1982)rule-of-thumb, which ignores any autocorrelation in the data. If theautocorrelation structure is assessed separately and can be expressed interms of effective sample size (e.g., Thiebaux and Zwiers 1984), thenn can be set equal to this number.
There is no universal rule for deciding how many of the EOFs should beretained for rotation (Hannachi et al. 2007). In practice, the number ischosen by requiring a minimum cumulative variance, looking for a sharp breakin the spectrum, requiring that confidence limits not overlap, various MonteCarlo methods, or many other techniques. The plot produced here enables thefirst three methods.
Value
A plot of the eigenvectors.
Author(s)
Alan Jassby, James Cloern
References
Hannachi, A., Jolliffe, I.T., and Stephenson, D.B. (2007)Empirical orthogonal functions and related techniques in atmosphericscience: A review.International Journal of Climatology27,1119–1152.
North, G., Bell, T., Cahalan, R., and Moeng, F. (1982) Sampling errors inthe estimation of empirical orthogonal functions.Monthly WeatherReview110, 699–706.
Thiebaux H.J. and Zwiers F.W. (1984) The interpretation and estimation ofeffective sample sizes.Journal of Climate and Applied Meteorology23, 800–811.
See Also
Examples
# Create an annual time series data matrix from sfbay chlorophyll data# Average over each yearchla1 <- aggregate(sfbayChla, 1, mean, na.rm = TRUE) chla1 <- chla1[, 1:12] # remove stations with missing yearseofNum(chla1)# These stations appear to act as one with respect to chlorophyll# variability on the annual scale because there's one dominant EOF.Plot EOF analysis results
Description
Plots the rotated empirical orthogonal functions or amplitude time seriesresulting fromeof.
Usage
eofPlot(x, type = c("coef", "amp"), rev = FALSE, ord = FALSE)Arguments
x | result of the function |
type | whether the EOF coefficients or amplitudes should be plotted |
rev | logical indicating whether coefficients and amplitudes should bemultiplied by |
ord | logical indicating whether coefficients should be ordered by size |
Details
When the columns of the original data have a natural order, such as stationsalong a transect or months of the year, there may be no need to reorder theEOF coefficients. But if there is no natural order, such as when columnsrepresents disparate sites around the world, the plot can be moreinformative if coefficients are ordered by size (ord = TRUE).
Coefficients and amplitudes for a given EOF may be more easily interpretedifrev = TRUE, because the sign of the first coefficient isarbitrarily determined and all the other signs follow from that choice.
Value
A plot of the EOF coefficients or amplitudes.
Author(s)
Alan Jassby, James Cloern
See Also
Examples
# Create an annual matrix time serieschla1 <- aggregate(sfbayChla, 1, mean, na.rm = TRUE)chla1 <- chla1[, 1:12] # remove stations with missing years# eofNum (see examples) suggests n = 1e1 <- eof(chla1, n = 1)eofPlot(e1, type = 'coef')eofPlot(e1, type = 'amp')Interpolate or substitute missing time series values
Description
Imterpolates or substitutes missing data in a time series for gaps up to aspecified size.
Usage
interpTs( x, type = c("linear", "series.median", "series.mean", "cycle.median", "cycle.mean"), gap = NULL)Arguments
x | object of class |
type | method of interpolation or substitution |
gap | maximum gap to be replaced |
Details
Whentype = "linear", the function performs linear interpolation ofanyNA runs of length smaller than or equal togap. Whengap = NULL, gaps of any size will be replaced. Does not changeleading or trailingNA runs. This interpolation approach is best forperiods of low biological activity when sampling is routinely suspended.
Whentype = "series.median" or"series.mean", missing valuesare replaced by the overall median or mean, respectively. This may bedesirable when missing values are not allowed but one wants, for example, toavoid spurious enhancement of trends.
Whentype = "cycle.median" ortype = "cycle.mean", missingvalues are replaced by the median or mean, respectively, for the same cycleposition (i.e., same month, quarter, etc., depending on the frequency). Thismay give more realistic series than using the overall mean or median.
Intended for time series but first three types will work with any vector ormatrix. Matrices will be interpolated by column.
Value
The time series with some or all missing values replaced.
Author(s)
Alan Jassby, James Cloern
See Also
Examples
### Interpolate a vector time series and highlight the imputed datachl27 <- sfbayChla[, 's27']x1 <- interpTs(chl27, gap = 3)plot(x1, col = 'red')lines(chl27, col = 'blue')x2 <- interpTs(chl27, type = "series.median", gap = 3)plot(x2, col = 'red')lines(chl27, col = 'blue')### Interpolate a matrix time series and plot resultsx3 <- interpTs(sfbayChla, type = "cycle.mean", gap = 1)plot(x3[, 1:10], main = "SF Bay Chl-a\n(gaps of 1 month replaced)")layerMean
Description
Acts on a matrix or data frame with depth in the firstcolumn and observations for different variables (or different sites, ordifferent times) in each of the remaining columns. The trapezoidal mean overthe given depths is calculated for each of the variables. Replicate depthsare averaged, and missing values or data with only one unique depth arehandled. Data are not extrapolated to cover missing values at the top orbottom of the layer. The result can differ markedly from the simple meaneven for equal spacing of depths, because the top and bottom values areweighted by 0.5 in a trapezoidal mean.
Usage
layerMean(d)Arguments
d | data.frame |
Author(s)
Alan Jassby, James Cloern
leapYear
Description
TRUE ifx is a leap year,FALSEotherwise.
Usage
leapYear(x)Arguments
x | integer year |
Author(s)
Alan Jassby, James Cloern
Mann-Kendall trend test and the Sen slope
Description
Applies Kendall's tau test for the significance of a monotonic time seriestrend (Mann 1945). Also calculates the Sen slope as an estimate of thistrend.
Usage
mannKen( x, plot = FALSE, type = c("slope", "relative"), order = FALSE, pval = 0.05, pchs = c(19, 21), ...)Arguments
x | A numeric vector, matrix or data frame |
plot | Should the trends be plotted when x is a matrix or data frame? |
type | Type of trend to be plotted, actual or relative |
order | Should the plotted trends be ordered by size? |
pval | p-value for significance |
pchs | Plot symbols for significant and not significant trendestimates, respectively |
... | Other arguments to pass to plotting function |
Details
The Sen slope (alternately, Theil or Theil-Sen slope)—the median slopejoining all pairs of observations—is expressed by quantity per unit time.The fraction of missing slopes involving the first and last fifths of thedata are provided so that the appropriateness of the slope estimate can beassessed and results flagged. Schertz et al. [1991] discuss this and relateddecisions about missing data. Other results are used for further analysis byother functions. Serial correlation is ignored, so the interval betweenpoints should be long enough to avoid strong serial correlation.
For the relative slope, the slope joining each pair of observations isdivided by the first of the pair before the overall median is taken. Therelative slope makes sense only as long as the measurement scale isnon-negative (not, e.g., temperature on the Celsius scale). Comparingrelative slopes is useful when the variables inx have differentunits.
Ifplot = TRUE, then either the Sen slope (type = "slope") orthe relative Sen slope (type = "relative") are plotted. The plotsymbols indicate, respectively, that the trend is significant or notsignificant. The plot can be customized by passing any arguments used bydotchart such asxlab orxlim, as well asgraphical parameters described inpar.
Value
A list of the following ifx is a vector:
sen.slope | Sen slope |
sen.slope.rel | Relative Sen slope |
p.value | Significance of slope |
S | Kendall's S |
varS | Variance of S |
miss | Fraction of missing slopes connectingfirst and last fifths of |
or a matrix with corresponding columns ifx is a matrix or data frame.
Note
Approximate p-values with corrections for ties and continuity are usedifn > 10 or if there are any ties. Otherwise, exact p-values based onTable B8 of Helsel and Hirsch (2002) are used. In the latter case,p =0.0001 should be interpreted asp < 0.0002.
Author(s)
Alan Jassby, James Cloern
References
Mann, H.B. (1945) Nonparametric tests against trend.Econometrica13, 245–259.
Helsel, D.R. and Hirsch, R.M. (2002)Statistical methods in waterresources. Techniques of Water Resources Investigations, Book 4, chapterA3. U.S. Geological Survey. 522 pages.doi:10.3133/twri04A3
Schertz, T.L., Alexander, R.B., and Ohe, D.J. (1991)The computerprogram EStimate TREND (ESTREND), a system for the detection of trends inwater-quality data. Water-Resources Investigations Report 91-4040, U.S.Geological Survey.
See Also
Examples
tsp(Nile) # an annual time seriesmannKen(Nile)y <- sfbayChlay1 <- interpTs(y, gap=1) # interpolate single-month gaps onlyy2 <- aggregate(y1, 1, mean, na.rm=FALSE)mannKen(y2)mannKen(y2, plot=TRUE) # missing data means missing trend estimatesmannKen(y2, plot=TRUE, xlim = c(0.1, 0.25))mannKen(y2, plot=TRUE, type='relative', order = TRUE, pval = .001, xlab = "Relative trend")legend("topleft", legend = "p < 0.001", pch = 19, bty="n")meanSub
Description
meanSub
Usage
meanSub(x, sub, na.rm = FALSE)Arguments
x | numeric vector |
sub | integer index |
na.rm | logical |
Author(s)
Alan Jassby, James Cloern
monthCor
Description
monthCor
Usage
monthCor(x)Arguments
x | ts |
Author(s)
Alan Jassby, James Cloern
monthNum
Description
Converts dates to the corresponding numeric month.
Usage
monthNum(y)Arguments
y | date |
Author(s)
Alan Jassby, James Cloern
Converts matrix to vector time series for various analyses
Description
First aggregates multivariate matrix time series by year. Then converts to avector time series in which “seasons” correspond to these annualizedvalues for the original variables.
Usage
mts2ts(x, seas = 1:frequency(x), na.rm = FALSE)Arguments
x | An object of class "mts" |
seas | Numeric vector of seasons to aggregate in original time series. |
na.rm | Should missing data be ignored when aggregating? |
Details
Theseas parameter enables focusing the subsequent analysis onseasons of special interest, or to ignore seasons where there are too manymissing data. The function can be used in conjunction withseaKen toconduct a Regional Kendall trend analysis. Sometimes just plotting theresulting function can be useful for exploring a spatial transect over time.
Value
A vector time series
Author(s)
Alan Jassby, James Cloern
See Also
Examples
## Quick plot a spatial transect of chlorophyll a during the ## spring bloom period (Feb-Apr) for each year.y <- mts2ts(sfbayChla, seas = 2:4)plot(y, type = 'n')abline(v = 1978:2010, col = 'lightgrey')lines(y, type = 'h')Dissolved oxygen at saturation
Description
Finds dissolved oxygen concentration in equilibrium with water-saturatedair.
Usage
oxySol(t, S, P = NULL)Arguments
t | temperature, degrees C |
S | salinity, on the Practical Salinity Scale |
P | pressure, atm |
Details
Calculations are based on the approach of Benson and Krause (1984), usingGreen and Carritt's (1967) equation for dependence of water vapor partialpressure ont andS. Equations are valid for temperature inthe range 0-40 C and salinity in the range 0-40.
Value
Dissolved oxygen concentration in mg/L at 100% saturation. IfP = NULL, saturation values at 1 atm are calculated.
Author(s)
Alan Jassby, James Cloern
References
Benson, B.B. and Krause, D. (1984) The concentration andisotopic fractionation of oxygen dissolved in fresh-water and seawater inequilibrium with the atmosphere.Limnology and Oceanography29, 620-632.
Green, E.J. and Carritt, D.E. (1967) New tables for oxygen saturation ofseawater.Journal of Marine Research25, 140-147.
Examples
# Convert DO into % saturation for 1-m depth at Station 32.# Use convention of expressing saturation at 1 atm.sfb1 <- subset(sfbay, depth == 1 & stn == 32)dox.pct <- with(sfb1, 100 * dox/oxySol(temp, sal))summary(dox.pct)Nonparametric Change-Point Detection
Description
Locates a single change-point in an annual series based on the Pettitt test.
Usage
pett(x, plot = FALSE, order = FALSE, pval = 0.05, pchs = c(19, 21), ...)Arguments
x | a numeric vector, matrix or data frame with no missing interiorvalues |
plot | Should the trends be plotted when x is a matrix? |
order | Should the plotted trends be ordered by size? |
pval | p-value for significance |
pchs | Plot symbols for significant and not significant trendestimates, respectively |
... | Other arguments to pass to plotting function |
Details
Pettitt's (1979) method is a rank-based nonparametric test for abruptchanges in a time series. It uses the Mann-Whitney statistic for testingthat two samples (before and after the change-point) come from the samedistribution, choosing the change-point that maximizes the statistic. Thep-value is approximate but accurate to 0.01 forp \le 0.5.Serial correlation is ignored, so the interval between points should be longenough to avoid strong serial correlation. The size of the change isestimated as the median difference between all pairs of observations inwhich the first one is after the change-point and the second is up to thechange-point.
Missing values are allowed at the beginning or end of each variable butinterior missing values will produce an NA. Otherwise the change-point mightnot be meaningful.
Ifplot = TRUE, a dot plot ofchange.times is shown. Ifsort = TRUE, the dots are sorted bychange.time. The plotsymbols indicate, respectively, that the trend is significant or notsignificant. The plot can be customized by passing any arguments used bydotchart such asxlab, as well as graphical parametersdescribed inpar.
Value
A list of the following ifx is a vector:
pettitt.K | Pettitt's statistic |
p.value | significanceprobability for statistic |
change.point | last position precedingchange to new level |
change.time | if available, time of change.pointposition |
change.size | median of all differences between points afterand up to change.point |
or a matrix with corresponding columns ifxis a matrix or data frame.
Note
Thechange.point returned by these functions is the lastposition before the series actually changes, for consistency with theoriginal Pettitt test. But for reporting purposes, the following positionmight be more appropriate to call the “change-point”.
The Pettitt test produces a supposed change-point, even when the trend issmooth, or when the abrupt change is smaller than the long-term smoothchange. Remove any smooth, long-term trend before applying this test.
Author(s)
Alan Jassby, James Cloern
References
Pettitt, A. N. (1979) A non-parametric approach to thechange-point problem.Journal of the Royal Statistical Society. SeriesC (Applied Statistics)28(2), 126–135.
Examples
# data from Pettitt (1979, Table 1):y <- c(-1.05, 0.96, 1.22, 0.58, -0.98, -0.03, -1.54, -0.71, -0.35, 0.66, 0.44, 0.91, -0.02, -1.42, 1.26, -1.02, -0.81, 1.66, 1.05, 0.97, 2.14, 1.22, -0.24, 1.60, 0.72, -0.12, 0.44, 0.03, 0.66, 0.56, 1.37, 1.66, 0.10, 0.80, 1.29, 0.49, -0.07, 1.18, 3.29, 1.84)pett(y) # K=232, p=0.0146, change-point=17, the same results as Pettitt# identify the year of a change-point in an annual time series:pett(Nile)# apply to a matrix time series:y <- ts.intersect(Nile, LakeHuron)pett(y)pett(y, plot = TRUE, xlab = "Change-point")legend("topleft", legend = "p < 0.05", pch = 19, bty="n")# note how a smooth trend can disguise a change-point:# smooth trend with change-point at 75y <- 1:100 + c(rep(0, 75), rep(10, 25)) pett(y) # gives 50, erroneouslypett(residuals(lm(y~I(1:100)))) # removing trend gives 75, correctlyPhenological amplitude
Description
Finds various measures of the amplitude of the annual cycle, or of somespecified season range.
Arguments
x | A seasonal time series, or a class |
season.range | A vector of two numbers specifying the season range tobe considered. |
Details
phenoAmp gives three measures of the amplitude of a seasonal cycle:the range, the variance, and the median absolute deviation, along with themean and median to allow calculation of other statistics as well.
These measures can be restricted to a subset of the year by giving thedesired range of season numbers. This can be useful for isolating measuresof, say, the spring and autumn phytoplankton blooms in temperate waters. Inthe case of a monthly time series, for example, a non-missing value isrequired for every month or the result will beNA, so using a periodshorter than one year can also help avoid any months that are typically notcovered by the sampling program. Similarly, in the case of datedobservations, a shorter period can help avoid times of sparse data. Themethod for time series allows for other than monthly frequencies, butseason.range is always interpreted as months forzoo objects.
Note that the amplitude is sensitive to the number of samples for smallnumbers. This could be a problem forzoo objects if the sample numberis changing greatly from year to year, depending on the amplitude measureand the underlying data distribution. So usets objects or make surethat the sample number stays more or less the same over time.
tsMake can be used to producets andzoo objectssuitable as arguments to this function.
Value
A matrix of classts orzoo with individual series forthe range, variance, median absolute deviation, mean, median and – in thecase ofzoo objects – number of samples.
References
Cloern, J.E. and Jassby, A.D. (2008) Complex seasonal patternsof primary producers at the land-sea interface.Ecology Letters11, 1294–1303.
See Also
Examples
y <- sfbayChla[, "s27"]phenoAmp(y) # entire year# i.e., Jan-Jun only, which yields results for more yearsphenoAmp(y, c(1, 6))Methods for Function phenoAmp
Description
Finds various measures of the amplitude of the annual cycle.
Methods
- list("signature(x = \"ts\")")
- list("signature(x = \"zoo\")")
Phenological phase
Description
Finds various measures of the phase of the annual cycle, or of somespecified month range.
Arguments
x | A seasonal time series, or a class |
season.range | A vector of two numbers specifying the season range tobe considered. |
out | The form of the output. |
... | Additional arguments to be passed for changing integrationdefaults. |
Details
phenoPhase gives three measures of the phasing of a seasonal cycle:the time of the maximum (Cloern and Jassby 2008), thefulcrum orcenter of gravity, and the weighted mean season (Colebrook 1979). The latterhas sometimes been referred to in the literature as “centre ofgravity”, but it is not actually the same. These measures differ in theirsensitivity to changes in the seasonal pattern, and therefore also in theirsusceptibility to sampling variability. The time of maximum is the mostsensitive, the weighted mean the least.
These measures can be restricted to a subset of the year by giving thedesired range of seasons. This can be useful for isolating measures of, say,the spring and autumn phytoplankton blooms in temperate waters. In the caseof a seasonal time series, a non-missing value is required for every seasonor the result will beNA, so using a period shorter than one year canalso help avoid any seasons that are typically not covered by the samplingprogram. Similarly, in the case of dated observations, a shorter period canhelp avoid times of sparse data. The method for time series allows for otherthan monthly frequencies, butseason.range is always interpreted asmonths forzoo objects. The method for time series requires data forall seasons inseason.range. The method forzoo objects willprovide a result regardless of number of sampling days, so make sure thatdata are sufficient for a meaningful result.
The measures are annum-centric, i.e., they reflect the use of calendar yearas the annum, which may not be appropriate for cases in which importantfeatures occur in winter and span two calendar years. Such cases can behandled by lagging the time series by an appropriate number of months, or bysubtracting an appropriate number of days from the individual dates.
tsMake can be used to producets andzoo objectssuitable as arguments to this function.
The default parameters used for theintegrate function inphenoPhase may fail for certain datasets. Try increasing the numberof subdivisions above its default of 100 by adding, for example,subdivisions = 1000 to the arguments ofphenoPhase.
Value
A data frame with columns year, time of the maximum, fulcrum,weighted mean time and – in the case ofzoo objects – number ofobservations. In the case of seasonal time series, the results are all givenas decimal seasons of the year. In the case of dated observations, theresults can be dates, day of the year, or julian day with an origin of1970-01-01, depending on the optionout.
References
Cloern, J.E. and Jassby, A.D. (2008) Complex seasonal patternsof primary producers at the land-sea interface.Ecology Letters11, 1294–1303.
Colebrook, J.M. (1979) Continuous plankton records - seasonal cycles ofphytoplankton and copepods in the North Atlantic ocean and the North Sea.Marine Biology51, 23–32.
See Also
Examples
# ts exampley <- sfbayChla[, "s27"]p1 <- phenoPhase(y)p1apply(p1, 2, sd, na.rm = TRUE) # max.time > fulcrum > mean.wtphenoPhase(y, c(3, 10))# zoo examplesfb <- wqData(sfbay, c(1, 3, 4), 5:12, site.order = TRUE, type = "wide", time.format = "%m/%d/%Y")y <- tsMake(sfb, focus = "chl", layer = c(0, 5), type = "zoo")phenoPhase(y[, "s27"])Methods for Function phenoPhase
Description
Finds various measures of the phase of the annual cycle.
Methods
- list("signature(x = \"ts\")")
- list("signature(x =\"zoo\")")
Plots seasonal patterns for a time series
Description
Divides the time range for a monthly time series into different eras andplots composites of seasonal pattern. Can also plot each month separatelyfor the entire record.
Usage
plotSeason( x, type = c("by.era", "by.month"), num.era = 4, same.plot = TRUE, ylab = NULL, num.col = 3)Arguments
x | Monthly time series |
type | Plot seasonal pattern by era, or each month for the entirerecord |
num.era | Integer number of eras, or vector of era year breaks |
same.plot | Should eras be plotted by month? |
ylab | Optional character string label for y-axis |
num.col | Number of columns when plotted |
Details
Ifnum.era is an integer, the time range is divided into that manyequal eras; otherwise, the time range is divided into eras determined by thenum.era vector of years. When plotted"by.era" andsame.plot = FALSE, the composite patterns are plotted in a horizontalrow for easier comparison, which limits the number of periods that can beexamined. Boxes based on fewer than half of the maximum possible yearsavailable are outlined in red. Ifsame.plot = TRUE, a single plot isproduced with era boxplots arranged by month. When plotted"by.month", values for each month are first converted to standardizedanomalies, i.e., by subtraction of long-term mean and division by standarddeviation. As always, and especially with these plots, experiment with thedevice aspect ratio and size to get the clearest information.
Value
A plot (and the corresponding object of class"ggplot").
Author(s)
Alan Jassby, James Cloern
See Also
Examples
chl27 <- sfbayChla[, "s27"]plotSeason(chl27, num.era = c(1978, 1988, 1998, 2008), ylab = "Stn 27 Chl-a")## Not run: plotSeason(chl27, num.era = 3, same.plot = FALSE, ylab = "Stn 27 Chl-a")plotSeason(chl27, "by.month", ylab = "Stn 27 Chl-a")## End(Not run)Time series plot
Description
Creates line plot of vector or matrix time series, including any datasurrounded by NAs as additional points.
Usage
plotTs( x, dot.size = 1, xlab = NULL, ylab = NULL, strip.labels = colnames(x), ...)Arguments
x | matrix or vector time series |
dot.size | size of dots representing isolated data points |
xlab | optional x-axis label |
ylab | optional y-axis label |
strip.labels | labels for individual time series plots |
... | additional options |
Details
The basic time series line plot ignores data points that are adjacent tomissing data, i.e., not directly connected to other observations. This canlead to an uninformative plot when there are many missing data. If oneincludes both a point and line plot, the resulting graph can be clutteredand difficult to decipher.plotTs plots only isolated points as wellas lines joining adjacent observations.
Options are passed to the underlyingfacet_wrap function inggplot2. The main ones of interest arencol for setting thenumber of plotting columns andscales = "free_y" for allowing the yscales of the different plots to be independent.
Value
A plot or plots and corresponding object of class “ggplot”.
Author(s)
Alan Jassby, James Cloern
See Also
Examples
# Chlorophyll at 4 stations in SF Baychl <- sfbayChla[, 1:4]plotTs(chl, dot.size = 1.5, ylab = 'Chl-a', strip.labels = paste('Station', substring(colnames(chl), 2, 3)), ncol = 1, scales = "free_y")Anomaly plot of time series
Description
Series are illustrated by vertical lines extending from individual datavalues to the long-term mean. The axes are not scaled in any way. Anomalyplots are useful for visualizing shifts in time series levels.
Usage
plotTsAnom(x, xlab = NULL, ylab = NULL, strip.labels = colnames(x), ...)Arguments
x | matrix or vector time series |
xlab | optional x-axis label |
ylab | optional y-axis label |
strip.labels | labels for individual time series plots |
... | additional options |
Details
Options are passed to the underlyingfacet_wrap function inggplot2. The main ones of interest arencol for setting thenumber of plotting columns andscales = "free_y" for allowing the yscales of the different plots to be independent.
Value
A plot and corresponding object of class “ggplot”.
Author(s)
Alan Jassby, James Cloern
See Also
Examples
# Spring bloom size for 6 stations in SF Baybloom <- aggregate(sfbayChla[, 1:6], 1, meanSub, sub=3:5)plotTsAnom(bloom, ylab = 'Chl-a', strip.labels = paste('Station', substring(colnames(bloom), 2, 3)), ncol = 2, scales = "free_y")Image plot of monthly time series
Description
Monthly values are transformed into deciles or other bins, and correspondingcolors are plotted in a month by year matrix.
Usage
plotTsTile( x, plot.title = NULL, legend.title = NULL, four = TRUE, loganom = TRUE, square = TRUE, legend = TRUE, trim = TRUE, overall = TRUE, stat = c("median", "mean"))Arguments
x | monthly time series. |
plot.title | plot title. |
legend.title | legend title. |
four | logical indicating if data should be binned into 4 specialgroups or into deciles. |
loganom | logical indicating if data should be transformed intolog-anomalies. |
square | logical indicating if tiles should be square. |
legend | logical indicating if a legend should be included. |
trim | logical indicating if leading and trailing NA values should beremoved. |
overall | determines whether anomalies are calculated with respect tooverall mean or to long-term mean for the same month. |
stat | determines whether anomalies are calculated and binned usingmean or median. |
Details
Iffour = TRUE, thenx is first divided into a positive andnegative bin. Each bin is then further divided into two bins by its mean,yielding a total of four bins. Iffour=FALSE, thenx is simplydivided into deciles. In either case, each bin has its own assigned color,with colors ranging from dark blue (smallest numbers) through light blue andpink to red.
Althoughfour = TRUE can be useful for any data in which 0 representsa value with special significance, it is especially so for data convertedinto log-anomalies, i.e.,log10(x/xbar) wherexbar = mean(x,na.rm=TRUE). The mean month then has value 0, and a value of -1, forexample, indicates original data equal to one-tenth the mean. Log-anomalytransforms can be particularly appropriate for biological populations, inwhich variability is often approximately proportional to the mean.
Whenloganom = TRUE, the anomalies are calculated with respect to theoverall mean month. This differs from, for example, the log-anomalyzooplankton plot of O'Brien et al. (2008), in which a monthly anomaly iscalculated with respect to the mean value of the same month. To get thelatter behavior, setoverall = FALSE. A further option is to setstat = "median" rather than the defaultstat = "mean", inwhich casexbar = median(x, na.rm = TRUE), and the positive andnegative bins are each divided into two bins by their median instead ofmean. Using combinations of these different options can reveal complementaryinformation.
You may want to setsquare = FALSE and then adjust the plot windowmanually if you plan to use the plot in a subsequent layout or if there istoo much white space.
Value
An image plot of monthly values classified into either deciles orinto four bins as described above (and corresponding object of class“ggplot”).
Author(s)
Alan Jassby, James Cloern
References
O'Brien T., Lopez-Urrutia A., Wiebe P.H., Hay S. (editors)(2008)ICES Zooplankton Status Report 2006/2007. ICES CooperativeResearch Report 292, International Council for the Exploration of the Sea,Copenhagen, 168 p.
Examples
# plot log-anomalies in four binschl27 = sfbayChla[, 's27']plotTsTile(chl27, legend.title = 'Chl log-anomaly')# plot decilesplotTsTile(chl27, plot.title = 'SF Bay station 27', legend.title ='chlorophyll', four = FALSE, loganom = FALSE, square = FALSE)Seasonal and Regional Kendall trend test
Description
Calculates the Seasonal or Regional Kendall test of trend significance,including an estimate of the Sen slope.
Usage
seaKen( x, plot = FALSE, type = c("slope", "relative"), order = FALSE, pval = 0.05, mval = 0.5, pchs = c(19, 21), ...)Arguments
x | A numeric vector, matrix or data frame made up of seasonal timeseries |
plot | Should the trends be plotted when x is a matrix or data frame? |
type | Type of trend to be plotted, actual or relative to series median |
order | Should the plotted trends be ordered by size? |
pval | p-value for significance |
mval | Minimum fraction of seasons needed with non-missing slopeestimates |
pchs | Plot symbols for significant and not significant trendestimates, respectively |
... | Other arguments to pass to plotting function |
Details
The Seasonal Kendall test (Hirsch et al. 1982) is based on the Mann-Kendalltests for the individual seasons (seemannKen for additionaldetails).p-values provided here are not corrected for serialcorrelation among seasons.
Ifplot = TRUE, then either the Sen slope in units per year(type = "slope") or the relative slope in fraction per year(type = "relative") is plotted. The relative slope is definedidentically to the Sen slope except that each slope is divided by the firstof the two values that describe the slope. Plotting the relative slope isuseful when the variables inx are always positive and have differentunits.
The plot symbols indicate, respectively, that the trend is statisticallysignificant or not. The plot can be customized by passing any arguments usedbydotchart such asxlab, as well as graphicalparameters described inpar.
Ifmval or more of the seasonal slope estimates are missing, thenthat trend is considered to be missing. The seasonal slope estimate(mannKen), in turn, is missing if half or more of the possiblecomparisons between the first and last 20% of the years are missing.
The function can be used in conjunction withmts2ts to calculate aRegional Kendall test of significance for annualized data, along with aregional estimate of trend (Helsel and Frans 2006). See the examples below.
Value
A list of the following ifx is a vector:seaKenreturns a list with the following members:
sen.slope | Sen slope |
sen.slope.pct | Sen slope as percent of mean |
p.value | significance of slope |
miss | for each season, thefraction missing of slopes connecting first and last 20% of the years |
or amatrix with corresponding columns ifx is a matrix or data frame.
Author(s)
Alan Jassby, James Cloern
References
Helsel, D.R. and Frans, L. (2006) Regional Kendall test fortrend.Environmental Science and Technology40(13), 4066-4073.
Hirsch, R.M., Slack, J.R., and Smith, R.A. (1982) Techniques of trendanalysis for monthly water quality data.Water Resources Research18, 107-121.
See Also
Examples
# Seasonal Kendall test:chl <- sfbayChla # monthly chlorophyll at 16 stations in San Francisco BayseaKen(sfbayChla[, 's27']) # results for a single series at station 27seaKen(sfbayChla) # results for all stationsseaKen(sfbayChla, plot=TRUE, type="relative", order=TRUE)# Regional Kendall test:# Use mts2ts to change 16 series into a single series with 16 "seasons"seaKen(mts2ts(chl)) # too many missing data# better when just Feb-Apr, spring bloom period,# but last 4 stations still missing too much.seaKen(mts2ts(chl, seas = 2:4)) seaKen(mts2ts(chl[, 1:12], 2:4)) # more reliable resultRolling Seasonal Kendall trend test
Description
Calculates the Seasonal Kendall test of significance, including an estimateof the Sen slope, for rolling windows over a time series.
Usage
seaRoll( x, w = 10, plot = FALSE, pval = 0.05, mval = 0.5, pchs = c(19, 21), xlab = NULL, ylab = NULL, ...)Arguments
x | A seasonal time series vector. |
w | The window width for “rolling” estimates of slope. |
plot | Indicates if a plot should be drawn |
pval | p-value for significance |
mval | Minimum fraction of seasons needed with non-missing slopeestimates |
pchs | Plot symbols for significant and not significant trendestimates, respectively |
xlab | Optional label for x-axis |
ylab | Optional label for y-axis |
... | Other arguments to pass to plotting function |
Details
The functionseaRoll appliesseaKen to rolling time windows ofwidthw. A minimumw of five years is required. For anywindow, a season is considered missing if half or more of the possiblecomparisons between the first and last 20% of the years is missing. Ifmval or more of the seasons are missing, then that windowed trend isconsidered to be missing.
Ifplot = TRUE, a point plot will be drawn with the Sen slope plottedat the leading year of the trend window. The plot symbols indicate,respectively, that the trend is significant or not significant. The plot canbe customized by passing any arguments used byplot.default,as well as graphical parameters described inpar.
Value
seaRoll returns a matrix with one row per time windowcontaining the Sen slope, the relative Sen slope, and thep-value.Rows are labelled with the leading year of the window.
Author(s)
Alan Jassby, James Cloern
See Also
Examples
chl27 <- sfbayChla[, 's27']seaRoll(chl27)seaRoll(chl27, plot = TRUE)Determine seasonal trends
Description
Finds the trend for each season and each variable in a time series.
Usage
seasonTrend(x, plot = FALSE, type = c("slope", "relative"), pval = 0.05, ...)Arguments
x | Time series vector, or time series matrix with column names |
plot | Should the results be plotted? |
type | Type of trend to be plotted, actual or relative to series median |
pval | p-value for significance |
... | Further options to pass to plotting function |
Details
The Mann-Kendall test is applied for each season and series (in the case ofa matrix). The actual and relative Sen slope (actual divided by median forthat specific season and series); the p-value for the trend; and thefraction of missing slopes involving the first and last fifths of the dataare calculated (seemannKen).
Ifplot = TRUE, each season for each series is represented by a barshowing the trend. The fill colour indicates whetherp < 0.05 or not.If the fraction of missing slopes is 0.5 or more, the corresponding trendsare omitted.
Parameters can be passed to the plotting function, in particular, tofacet_wrap inggplot2. The most useful parameters here arencol (ornrow), which determines the number of columns (orrows) of plots, andscales, which can be set to"free_y" toallow the y-axis to change for each time series. Like allggplot2objects, the plot output can also be customized extensively by modifying andadding layers.
Value
A data frame with the following fields:
series | series names |
season | season number |
sen.slope | Sen slope in original unitsper year |
sen.slope.rel | Sen slope divided by median for that specificseason and series |
p | p-value for the trend according to theMann-Kendall test. |
missing | Proportion of slopes joining first andlast fifths of the data that are missing |
Author(s)
Alan Jassby, James Cloern
See Also
Examples
x <- sfbayChlaseasonTrend(x)seasonTrend(x, plot = TRUE, ncol = 4)San Francisco Bay water quality data
Description
Selected observations and variables from U.S. Geological Survey waterquality stations in south San Francisco Bay. Data includeCTD andnutrient measurements.
Format
sfbay is a data frame with 23207 observations (rows) of 12variables (columns):
[, 1] | date | date |
[, 2] | time | time |
[, 3] | stn | station code |
[, 4] | depth | measurement depth |
[, 5] | chl | chlorophylla |
[, 6] | dox.pct | dissolved oxygen |
[, 7] | spm | suspendedparticulate matter |
[, 8] | ext | extinctioncoefficient |
[, 9] | sal | salinity |
[, 10] | temp | water temperature |
[, 11] | nox | nitrate + nitrite |
[, 12] | nhx | ammonium |
sfbayStns is a data frame with 16 observations of 6 variables:
[, 1] | site | station code |
[,2] | description | station description |
[, 3] | lat | latitude |
[, 4] | long | longitude |
[, 5] | depthMax | maximum depth, in m |
[, 6] | distFrom36 | distance from station 36, in km |
sfbayVars is a data frame with 7 observations of 3 variables:
[, 1] | variable | water quality variablecode |
[, 2] | description | description |
[,3] | units | measurement units |
sfbayChla is a time series matrix (380 monthsx 16 stations)of average 0-5 m chlorophylla concentrations calculated from thedata insfbay.
Details
The original downloaded dataset was modified by taking a subset of sixwell-sampled stations and the period 1985–2004. Variable names were alsosimplified. The data framessfbayStns andsfbayVars describethe stations and water quality variables in more detail; they were createdfrom information at the same web site. Note that the station numbers insfbayStns have been prefixed withs to make station codes intolegal variable names.sfbayChla was constructed from the entiredownloaded sfbay dataset and encompasses the period 1969–2009.
Source
Downloaded from USGS on 2009-11-17.
Examples
data(sfbay)str(sfbay)str(sfbayStns)str(sfbayVars)plot(sfbayChla[, 1:10], main = "SF Bay Chl-a")Trend homogeneity test
Description
Tests for homogeneity of seasonal trends using method proposed by van Belleand Hughes (1984). Seasons with insufficient data as defined inmannKen are ignored.
Usage
trendHomog(x)Arguments
x | A vector time series with frequency > 1 |
Value
chisq.trend | "Trend" chi-square. |
chisq.homog | "Homogeneous" chi-square. |
p.value | For nullhypothesis that trends are homogeneous. |
n | Number of seasons used. |
Author(s)
Alan Jassby, James Cloern
References
van Belle, G. and Hughes, J.P. (1984) Nonparametric tests fortrend in water quality.Water Resources Research20, 127-136.
See Also
Examples
## Apply to a monthly vector time series to test homogeneity## of seasonal trends.x <- sfbayChla[, 's27']trendHomog(x)Convert time series to data frame
Description
Convert monthly time series vector to a yearx month data frame forseveral possible subsequent analyses. Leading and trailing empty rows areremoved.
Usage
ts2df(x, mon1 = 1, addYr = FALSE, omit = FALSE)Arguments
x | monthly time series vector |
mon1 | starting month number, i.e., first column of the data frame |
addYr | rows are normally labelled with the year of the starting month,but |
omit | if |
Details
Our main use ofts2df is to convert a single monthly time series intoa yearx month data frame for EOF analysis of interannualvariability.
monthCor finds the month-to-month correlations in a monthly timeseriesx. It is useful for deciding where to start the 12-monthperiod for anEOF analysis (mon1 ints2df), namely, ata time of low serial correlation inx.
Value
Ann x 12 data frame, wheren is the number of years.
Author(s)
Alan Jassby, James Cloern
References
Craddock, J. (1965) A meteorological application of principalcomponent analysis.Statistician15, 143–156.
See Also
Examples
# San Francisco Bay station 27 chlorophyll has the lowest serial # correlation in Oct-Nov, with Sep-Oct a close secondchl27 <- sfbayChla[, 's27']monthCor(chl27)# Convert to a data frame with October, the first month of the # local "water year", in the first columntsp(chl27)chl27 <- round(chl27, 1)ts2df(chl27, mon1 = 10, addYr = TRUE)ts2df(chl27, mon1 = 10, addYr = TRUE, omit = TRUE)Create time series from water quality data
Description
Creates a matrix time series object from an object of class"WqData",either all variables for a single site or all sites for a single variable.
Arguments
object | Object of class |
focus | Name of a site or water quality variable. |
layer | Number specifying a single depth; a numeric vector of length 2specifying top and bottom depths of layer; a list specifying multiple depthsand/or layers; or just the string |
type |
|
qprob | quantile probability, a number between 0 and 1. |
Details
Whenqprob = NULL, the function averages all included depths for eachday, the implicit assumption being that the layer is well-mixed and/or thesamples are evenly distributed with depth in the layer. Iflayer ="max.depths", then only the value at the maximum depth for each time, siteand variable combination will be used. If no layer is specified, all depthswill be used.
The function produces a matrix time series of all variables for thespecified site or all sites for the specified variable. Iftype ="ts.mon", available daily data are averaged to produce a monthly timeseries, from which a quarterly or annual series can be created if needed. Ifyou want values for the actual dates of observation, then settype ="zoo".
Whenqprob is a number from 0 to 1, it is interpreted as aprobability and the corresponding quantile is used to aggregate observationswithin the specified layer. So to get the maximum, for example, use qprob =1. Iftype = "ts.mon", the same quantile is used to aggregate all theavailable daily values.
Value
A matrix of class"mts" or"zoo".
Note
The layer list is allowed to include negative numbers, which may havebeen used in theWqData object to denote variables that apply to thewater column as a whole, such as, say, -1 for light attenuation coefficient.This enablesfocus = 's27' andlayer = list(-1, c(0, 5)) toproduce a time series matrix for station 27 that includes both attenuationcoefficient and chlorophyll averaged over the top 5 m. Negative numbers mayalso have been used in theWqData object to identify qualitativedepths such as “near bottom”, which is not uncommon in historicaldata sets. So data from such depths can be aggregated easily with other datato make these time series.
See Also
Examples
# Create new WqData objectsfb <- wqData(sfbay, c(1, 3:4), 5:12, site.order = TRUE, time.format = "%m/%d/%Y", type = "wide")# Find means in the 0-10 m layery <- tsMake(sfb, focus = "s27", layer = c(0, 10))plot(y, main = "Station 27")# Or select medians in the same layery1 <- tsMake(sfb, focus = "s27", layer = c(0, 10), qprob = 0.5)plot(y1, main = "Station 27")# Compare means:mediansapply(y / y1, 2, mean, na.rm = TRUE)# Combine a layer with a single additional depthy <- tsMake(sfb, focus = "chl", layer = list(c(0, 2), 5))plot(y, main = "Chlorophyll a, ug/L")# Use values from the deepest samplesy <- tsMake(sfb, focus = "dox", layer = "max.depths", type = "zoo")head(y)plot(y, type = "h", main = "'Bottom' DO, mg/L")Methods for Function tsMake
Description
Creates a matrix of observations indexed by time.
Methods
- list("signature(x = \"WqData\")")
tsSub
Description
tsSub
Usage
tsSub(x1, seas = 1:frequency(x1))Arguments
x1 | ts |
seas | numeric |
Author(s)
Alan Jassby, James Cloern
Construct an object of class "WqData"
Description
wqData is a constructor for the"WqData" class that is oftenmore convenient to use thannew. It converts a data.frame containingwater quality data in “long” or “wide” format to a"WqData" object. In “long” format, observations are all in onecolumn and a second column is used to designate the variable being observed.In “wide” format, observations for each variable are in a separatecolumn.
Usage
wqData( data, locus, wqdata, site.order, time.format = "%Y-%m-%d", type = c("long", "wide"))Arguments
data | Data frame containing water quality data. |
locus | Character or numeric vector designating column names ornumbers, respectively, in |
wqdata | In the case of “long” data, character or numeric vectordesignating column names or numbers, respectively, in |
site.order | If |
time.format | Conversion specification for |
type | Either “long” or “wide” |
Details
If the data are already in long format, the function has little to do butrename the data fields. If in wide format, thereshape2 package iscalled tomelt the data. The function also removesNAobservations, convertssite to (possibly ordered) factors with validvariable names, and convertstime to class"Date" or"POSIXct" and ISO 8601 format, depending ontime.format.
Value
An object of class"WqData".
Author(s)
Alan Jassby, James Cloern
References
International Organization for Standardization (2004) ISO 8601.Data elements and interchange formats - Information interchange -Representation of dates and times.
See Also
Examples
## Not run: # Create new WqData object from sfbay data. First combine date and time# into a single string after making sure that all times have 4 digits.sfb <- within(sfbay, time <- substring(10000 + time, 2, 5))sfb <- within(sfb, time <- paste(date, time, sep = ' '))sfb <- wqData(sfb, 2:4, 5:12, site.order = TRUE, type = "wide", time.format = "%m/%d/%Y %H%M")head(sfb)tail(sfb)# If time of day were not required, then the following would suffice:sfb <- wqData(sfbay, c(1,3,4), 5:12, site.order = TRUE, type = "wide", time.format = "%m/%d/%Y")## End(Not run)Miscellaneous utility functions
Description
A variety of small utilities used in other functions.
Arguments
d | A numeric matrix or data frame with depth in the first column andobservations for some variable in each of the remaining columns. |
na.rm | Should missing data be removed? |
seas | An integer vector of seasons to be retained. |
sub | An integer vector. |
w | A vector of class |
x | A numeric vector. |
x1 | A matrix or vector time series. |
y | A vector of class |
Details
date2decyear: Converts object of class"Date" to decimal yearassuming time of day is noon.
decyear2date: Converts decimal year to object of class"Date".
layerMean: Acts on a matrix or data frame with depth in the firstcolumn and observations for different variables (or different sites, ordifferent times) in each of the remaining columns. The trapezoidal mean overthe given depths is calculated for each of the variables. Replicate depthsare averaged, and missing values or data with only one unique depth arehandled. Data are not extrapolated to cover missing values at the top orbottom of the layer. The result can differ markedly from the simple meaneven for equal spacing of depths, because the top and bottom values areweighted by 0.5 in a trapezoidal mean.
leapYear:TRUE ifx is a leap year,FALSEotherwise.
meanSub: Mean of a subset of a vector.
monthNum: Converts dates to the corresponding numeric month.
tsSub: Drops seasons from a matrix or vector time series.
years: Converts dates to the corresponding numeric years.
Examples
dates <- as.Date(c("1996-01-01", "1999-12-31", "2004-02-29", "2005-03-01"))date2decyear(dates)decyear2date(c(1996.0014, 1999.9986, 2004.1626, 2005.1630))z <- c(1, 2, 3, 5, 10) # 5 depthsx <- matrix(rnorm(30), nrow = 5) # 6 variables at 5 depthslayerMean(cbind(z, x))leapYear(seq(1500, 2000, 100))leapYear(c(1996.9, 1997))## Aggregate monthly time series over Feb-Apr only.aggregate(sfbayChla, 1, meanSub, sub = 2:4)monthNum(as.Date(c("2007-03-17", "2003-06-01")))## Ignore certain seasons in a Seasonal Kendall test.c27 <- sfbayChla[, "s27"]seaKen(tsSub(c27)) # Aug and Dec missing the most key dataseaKen(tsSub(c27, seas = c(1:7, 9:11)))y <- Sys.time()years(y)Class "zoo"
Description
Registration of S3 class"zoo" as a formally defined class. Used hereto allow the"zoo" class to appear in method signatures.
Objects from the Class
A virtual Class: No objects may be createdfrom it.
See Also
Examples
showClass("zoo")