| Type: | Package |
| Title: | Empirical Dynamic Modeling ('EDM') |
| Version: | 1.15.4 |
| Date: | 2024-04-05 |
| Maintainer: | Joseph Park <JosephPark@IEEE.org> |
| Description: | An implementation of 'EDM' algorithms based on research software developed for internal use at the Sugihara Lab ('UCSD/SIO'). The package is implemented with 'Rcpp' wrappers around the 'cppEDM' library. It implements the 'simplex' projection method from Sugihara & May (1990) <doi:10.1038/344734a0>, the 'S-map' algorithm from Sugihara (1994) <doi:10.1098/rsta.1994.0106>, convergent cross mapping described in Sugihara et al. (2012) <doi:10.1126/science.1227079>, and, 'multiview embedding' described in Ye & Sugihara (2016) <doi:10.1126/science.aag0863>. |
| License: | BSD_2_clause + file LICENSE |
| LazyData: | true |
| LazyLoad: | yes |
| Imports: | methods, Rcpp (≥ 1.0.1) |
| LinkingTo: | Rcpp, RcppThread |
| Suggests: | knitr, rmarkdown, formatR |
| VignetteBuilder: | knitr |
| NeedsCompilation: | yes |
| Packaged: | 2024-04-05 17:24:24 UTC; jpark |
| Author: | Joseph Park |
| Repository: | CRAN |
| Date/Publication: | 2024-04-06 10:30:03 UTC |
Empirical dynamic modeling
Description
rEDM provides tools for data-driven time series analyses. It is based on reconstructing multivariate state space representations from univariate or multivariate time series, then projecting state changes using various metrics applied to nearest neighbors.
rEDM is an Rcpp interface to the cppEDM library of Empirical Dynamic Modeling tools. Functionality includes:
Simplex projection (Sugihara and May 1990)
Sequential Locally Weighted Global Linear Maps (S-map) (Sugihara 1994)
Multivariate embeddings (Dixon et al. 1999)
Convergent cross mapping (Sugihara et al. 2012)
Multiview embedding (Ye and Sugihara 2016)
Details
Main Functions:
Simplex - simplex projection
SMap - S-map projection
CCM - convergent cross mapping
Multiview - multiview forecasting
Helper Functions:
Embed - time delay embedding
ComputeError - forecast skill metrics
EmbedDimension - optimal embedding dimension
PredictInterval - optimal prediction interval
PredictNonlinear - evaluate nonlinearity
Author(s)
Maintainer: Joseph Park
Authors: Joseph Park, Cameron Smith, Ethan Deyle, Erik Saberski, George Sugihara
References
Sugihara G. and May R. 1990. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344:734-741.
Sugihara G. 1994. Nonlinear forecasting for the classification of natural time series. Philosophical Transactions: Physical Sciences and Engineering, 348 (1688):477-495.
Dixon, P. A., M. Milicich, and G. Sugihara, 1999. Episodic fluctuations in larval supply. Science 283:1528-1530.
Sugihara G., May R., Ye H., Hsieh C., Deyle E., Fogarty M., Munch S., 2012. Detecting Causality in Complex Ecosystems. Science 338:496-500.
Ye H., and G. Sugihara, 2016. Information leverage in interconnected ecosystems: Overcoming the curse of dimensionality. Science 353:922-925.
Convergent cross mapping using simplex projection
Description
The state-space of a multivariate dynamical system (not a purely stochastic one) encodes coherent phase-space variable trajectories. If enough information is available, one can infer the presence or absence of cross-variable interactions associated with causal links between variables. CCM measures the extent to which states of variable Y can reliably estimate states of variable X. This can happen if X is causally influencing Y.
If cross-variable state predictability converges as more state-space information is provided, this indicates a causal link. CCM performs this cross-variable mapping using Simplex, with convergence assessed across a range of observational library sizes as described in Sugihara et al. 2012.
Usage
CCM(pathIn = "./", dataFile = "", dataFrame = NULL, E = 0, Tp = 0, knn = 0, tau = -1, exclusionRadius = 0, columns = "", target = "", libSizes = "", sample = 0, random = TRUE, seed = 0, embedded = FALSE, includeData = FALSE, parameterList = FALSE, verbose = FALSE, showPlot = FALSE, noTime = FALSE)
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
E | embedding dimension. |
Tp | prediction horizon (number of time column rows). |
knn | number of nearest neighbors. If knn=0, knn is set to E+1. |
tau | lag of time delay embedding specified as number of time column rows. |
exclusionRadius | excludes vectors from the search space of nearest neighbors if their relative time index is within exclusionRadius. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
libSizes | string of 3 whitespace separated integer values specifying the initial library size, the final library size, and the library size increment. Can also be a list of strictly increasing library sizes. |
sample | integer specifying the number of random samples to draw at each library size evaluation. |
random | logical to specify random ( |
seed | integer specifying the random sampler seed. If |
embedded | logical specifying if the input data are embedded. |
includeData | logical to include statistics and predictions for every prediction in the ensemble. |
parameterList | logical to add list of invoked parameters. |
verbose | logical to produce additional console reporting. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
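As a rough illustration of the three-value string form of libSizes, the values can be read as start, stop and increment. The base R sketch below expands such a string into the implied sequence of library sizes. The helper name expand_lib_sizes is hypothetical and not part of rEDM, and the package's internal parsing may differ.

```r
# Hypothetical helper (not part of rEDM): expand a libSizes string of the
# form "start stop increment" into the implied sequence of library sizes.
expand_lib_sizes <- function(libSizes) {
  vals <- as.integer(strsplit(trimws(libSizes), "\\s+")[[1]])
  stopifnot(length(vals) == 3, vals[3] > 0, vals[1] <= vals[2])
  seq(from = vals[1], to = vals[2], by = vals[3])
}

# The string used in the CCM example: sizes from 10 to 70 in steps of 10.
sizes <- expand_lib_sizes("10 70 10")
print(sizes)
```

Under this reading, sample random libraries would be drawn at each of the listed sizes.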
Details
CCM computes the X:Y and Y:X cross-mappings in parallel using threads.
Value
A data.frame with 3 columns. The first column is LibSize specifying the subsampled library size. Columns 2 and 3 report Pearson correlation coefficients for the prediction of X from Y, and Y from X.
If includeData = TRUE, a named list with the following data.frames (data.frame Combo_rho columns):
| LibMeans | CCM mean correlations for each library size |
| CCM1_PredictStat | Forward cross map prediction statistics |
| CCM1_Predictions | Forward cross map prediction values |
| CCM2_PredictStat | Reverse cross map prediction statistics |
| CCM2_Predictions | Reverse cross map prediction values |
If includeData = TRUE and parameterList = TRUE, a named list "parameters" is added.
References
Sugihara G., May R., Ye H., Hsieh C., Deyle E., Fogarty M., Munch S., 2012. Detecting Causality in Complex Ecosystems. Science 338:496-500.
Examples
data(sardine_anchovy_sst)
df = CCM( dataFrame = sardine_anchovy_sst, E = 3, Tp = 0, columns = "anchovy",
target = "np_sst", libSizes = "10 70 10", sample = 100 )
Compute error
Description
ComputeError evaluates the Pearson correlation coefficient, mean absolute error and root mean square error between two numeric vectors.
Usage
ComputeError(obs, pred)
Arguments
obs | vector of observations. |
pred | vector of predictions. |
Value
A named list with components:
| rho | Pearson correlation |
| MAE | mean absolute error |
| RMSE | root mean square error |
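The three statistics can be reproduced in base R for a quick sanity check. This is a minimal sketch with made-up illustrative vectors, not the package implementation.

```r
# Base R sketch of the three statistics reported by ComputeError.
# The obs and pred vectors are illustrative values only.
obs  <- c(1.0, 2.0, 3.0, 4.0, 5.0)
pred <- c(1.5, 1.5, 3.5, 3.5, 5.5)

err <- list(
  rho  = cor(obs, pred),              # Pearson correlation
  MAE  = mean(abs(obs - pred)),       # mean absolute error
  RMSE = sqrt(mean((obs - pred)^2))   # root mean square error
)
str(err)
```

All three statistics are symmetric in their two arguments, so swapping obs and pred does not change the result.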
Examples
data(block_3sp)
smplx <- Simplex( dataFrame = block_3sp, lib = "1 99", pred = "105 190", E = 3, columns = "x_t" )
err <- ComputeError( smplx$Observations, smplx$Predictions )
Embed data with time lags
Description
Embed performs Takens time-delay embedding on columns.
Usage
Embed(path = "./", dataFile = "", dataFrame = NULL, E = 0, tau = -1, columns = "", verbose = FALSE)
Arguments
path | path to |
dataFile | .csv format data file name. The first column must be a time index or time values. The first row must be column names. One of |
dataFrame | input data.frame. The first column must be a time index or time values. The columns must be named. One of |
E | embedding dimension. |
tau | integer time delay embedding lag specified as number of time column rows. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
verbose | logical to produce additional console reporting. |
Details
Each columns item will have E-1 time-lagged vectors created. The column name is appended with (t-n). For example, data columns X, Y, with E = 2 will have columns named X(t-0) X(t-1) Y(t-0) Y(t-1).
The returned data.frame does not have a time column. The returned data.frame is truncated by tau * (E-1) rows to remove state vectors with partial data (NaN elements).
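The naming and truncation rules can be illustrated with a small base R sketch. The helper embed_sketch is hypothetical and not part of rEDM. It assumes a negative tau means a lag of |tau| rows and that the (t-n) suffix uses n = k * |tau|, which matches the documented names when tau = -1.

```r
# Hypothetical helper (not part of rEDM): time-delay embed the named columns
# of a data.frame using lags 0, |tau|, ..., (E-1)*|tau|.
embed_sketch <- function(df, E, tau = -1, columns) {
  lag  <- abs(tau)
  n    <- nrow(df)
  drop <- lag * (E - 1)            # leading rows that lack a full set of lags
  stopifnot(E >= 1, n > drop)
  out <- list()
  for (col in columns) {
    for (k in 0:(E - 1)) {
      rows <- (drop + 1 - k * lag):(n - k * lag)
      # Assumed naming: column name followed by (t-n)
      out[[paste0(col, "(t-", k * lag, ")")]] <- df[[col]][rows]
    }
  }
  res <- as.data.frame(out)
  names(res) <- names(out)         # restore names that conversion may sanitize
  res
}

df  <- data.frame(Time = 1:5, x = c(10, 20, 30, 40, 50), y = 1:5)
emb <- embed_sketch(df, E = 2, tau = -1, columns = c("x", "y"))
print(emb)
```

With 5 input rows, E = 2 and tau = -1, the result has 4 rows and no time column, consistent with the truncation by |tau| * (E-1) rows.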
Value
A data.frame with lagged columns. E columns for each variable specified in columns.
Examples
data(circle)
embed <- Embed( dataFrame = circle, E = 2, tau = -1, columns = "x y" )
Optimal embedding dimension
Description
EmbedDimension uses Simplex to evaluate prediction accuracy as a function of embedding dimension.
Usage
EmbedDimension(pathIn = "./", dataFile = "", dataFrame = NULL, pathOut = "", predictFile = "", lib = "", pred = "", maxE = 10, Tp = 1, tau = -1, exclusionRadius = 0, columns = "", target = "", embedded = FALSE, verbose = FALSE, validLib = vector(), numThreads = 4, showPlot = TRUE, noTime = FALSE)
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
pathOut | path for |
predictFile | output file name. |
lib | string or vector with start and stop indices of input data rows used to create the library from observations. Multiple row index pairs can be specified with each pair defining the first and last rows of time series observation segments used to create the library. |
pred | string with start and stop indices of input data rows used for predictions. A single contiguous range is supported. |
maxE | maximum value of E to evaluate. |
Tp | prediction horizon (number of time column rows). |
tau | lag of time delay embedding specified as number of time column rows. |
exclusionRadius | excludes vectors from the search space of nearest neighbors if their relative time index is within exclusionRadius. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
embedded | logical specifying if the input data are embedded. |
verbose | logical to produce additional console reporting. |
validLib | logical vector the same length as the number of data rows. Any data row represented in this vector as FALSE will not be included in the library. |
numThreads | number of parallel threads for computation. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
Value
A data.frame with columns E, rho.
Examples
data(TentMap)
E.rho = EmbedDimension( dataFrame = TentMap, lib = "1 100", pred = "201 500",
columns = "TentMap", target = "TentMap", showPlot = FALSE )
Water flow to NE Everglades
Description
Cumulative weekly water flow into northeast Everglades from water control structures S12C, S12D and S333 from 1980 through 2005.
Usage
EvergladesFlow
Format
A data frame with 1379 rows and 2 columns:
Date | Date. |
S12CD_S333_CFS | Cumulative weekly flow (CFS). |
5-D Lorenz'96
Description
5-D Lorenz'96 timeseries with F = 8.
Usage
Lorenz5D
Format
Data frame with 1000 rows and 6 columns
Time | Time. |
V1 | variable 1. |
V2 | variable 2. |
V3 | variable 3. |
V4 | variable 4. |
V5 | variable 5. |
References
Lorenz, Edward (1996). Predictability - A problem partly solved, Seminar on Predictability, Vol. I, ECMWF.
Make embedded data block
Description
MakeBlock performs Takens time-delay embedding on columns. It is an internal function called by Embed that does not perform input error checking or validation.
Usage
MakeBlock(dataFrame, E = 0, tau = -1, columns = "", deletePartial = FALSE)
Arguments
dataFrame | input data.frame. The first column must be a time index or time values. The columns must be named. |
E | embedding dimension. |
tau | integer time delay embedding lag specified as number of time column rows. |
columns | string of whitespace separated column name(s) in the input data to be embedded. |
deletePartial | boolean to delete rows with partial data. |
Details
Each columns item will have E-1 time-lagged vectors created. The column name is appended with (t-n). For example, data columns X, Y, with E = 2 will have columns named X(t-0) X(t-1) Y(t-0) Y(t-1).
The returned data.frame does not have a time column.
If deletePartial is TRUE, the returned data.frame is truncated by tau * (E-1) rows to remove state vectors with partial data (NaN elements).
Value
A data.frame with lagged columns. E columns for each variable specified in columns.
Examples
data(TentMap)
embed <- MakeBlock(TentMap, 3, 1, "TentMap")
Forecasting using multiview embedding
Description
Multiview applies the method of Ye & Sugihara to find optimal combinations of variables that best represent the dynamics.
Usage
Multiview(pathIn = "./", dataFile = "", dataFrame = NULL, lib = "", pred = "", D = 0, E = 1, Tp = 1, knn = 0, tau = -1, columns = "", target = "", multiview = 0, exclusionRadius = 0, trainLib = TRUE, excludeTarget = FALSE, parameterList = FALSE, verbose = FALSE, numThreads = 4, showPlot = FALSE, noTime = FALSE)
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
lib | a 2-column matrix, data.frame, 2-element vector or string of row index pairs, where each pair specifies the first and last *rows* of the time series to create the library. |
pred | (same format as lib), but specifying the sections of the time series to forecast. |
D | multivariate dimension. |
E | embedding dimension. |
Tp | prediction horizon (number of time column rows). |
knn | number of nearest neighbors. If knn=0, knn is set to E+1. |
tau | lag of time delay embedding specified as number of time column rows. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
multiview | number of multiview ensembles to average for the final prediction estimate. |
exclusionRadius | number of adjacent observation vector rows to exclude as nearest neighbors in prediction. |
trainLib | logical to use in-sample (lib=pred) projections for the ranking of column combinations. |
excludeTarget | logical to exclude embedded target column from combinations. |
parameterList | logical to add list of invoked parameters. |
verbose | logical to produce additional console reporting. |
numThreads | number of CPU threads to use in multiview processing. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
Details
Multiview embedding is a method to identify variables in a multivariate dynamical system that are most likely to contribute to the observed dynamics. It is a multistep algorithm with these general steps:
Compute D-dimensional variable combination forecasts.
Rank forecasts.
Compute predictions of top combinations.
Compute multiview averaged prediction.
If E>1, all variables are embedded to dimension E. If trainLib is TRUE, initial forecasts and ranking are done in-sample (lib=pred) and predictions using the top ranked combinations use the specified lib and pred. If trainLib is FALSE, initial forecasts and ranking use the specified lib and pred, and the step of computing predictions of the top combinations is skipped.
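The first step, forming the D-dimensional variable combinations, can be sketched in base R with combn. This only enumerates candidate views. It assumes the combinations are drawn from the embedded columns named with the (t-n) convention, and it is not the package's internal code. The values of vars, E and D are illustrative.

```r
# Enumerate candidate D-dimensional views from E-dimensional embeddings of
# each variable (illustrative sketch; no forecasting or ranking is done here).
vars <- c("x_t", "y_t", "z_t")
E <- 2
D <- 3

# Embedded column names, assuming the (t-n) naming convention.
embedded <- paste0(rep(vars, each = E), "(t-",
                   rep(0:(E - 1), times = length(vars)), ")")

# Every unordered choice of D embedded columns is one candidate view.
combos <- combn(embedded, D, simplify = FALSE)
length(combos)
combos[[1]]
```

Each candidate view would then be forecast and ranked, with the top multiview combinations averaged for the final prediction.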
Value
Named list with data.frames [[View, Predictions]].
data.frame View columns:
| Col_1 | column index |
| ... | column index |
| Col_D | column index |
| rho | Pearson correlation |
| MAE | mean absolute error |
| RMSE | root mean square error |
| name_1 | column name |
| ... | column name |
| name_D | column name |
If parameterList = TRUE, a named list "parameters" is added.
References
Ye H., and G. Sugihara, 2016. Information leverage in interconnected ecosystems: Overcoming the curse of dimensionality. Science 353:922-925.
Examples
data(block_3sp)
L = Multiview( dataFrame = block_3sp, lib = "1 100", pred = "101 190",
E = 2, columns = "x_t y_t z_t", target = "x_t" )
Forecast interval accuracy
Description
PredictInterval uses Simplex to evaluate prediction accuracy as a function of forecast interval Tp.
Usage
PredictInterval(pathIn = "./", dataFile = "", dataFrame = NULL, pathOut = "./", predictFile = "", lib = "", pred = "", maxTp = 10, E = 1, tau = -1, exclusionRadius = 0, columns = "", target = "", embedded = FALSE, verbose = FALSE, validLib = vector(), numThreads = 4, showPlot = TRUE, noTime = FALSE)
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
pathOut | path for |
predictFile | output file name. |
lib | string or vector with start and stop indices of input data rows used to create the library from observations. Multiple row index pairs can be specified with each pair defining the first and last rows of time series observation segments used to create the library. |
pred | string with start and stop indices of input data rows used for predictions. A single contiguous range is supported. |
maxTp | maximum value of Tp to evaluate. |
E | embedding dimension. |
tau | lag of time delay embedding specified as number of time column rows. |
exclusionRadius | excludes vectors from the search space of nearest neighbors if their relative time index is within exclusionRadius. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
embedded | logical specifying if the input data are embedded. |
verbose | logical to produce additional console reporting. |
validLib | logical vector the same length as the number of data rows. Any data row represented in this vector as FALSE will not be included in the library. |
numThreads | number of parallel threads for computation. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
Value
A data.frame with columns Tp, rho.
Examples
data(TentMap)
Tp.rho = PredictInterval( dataFrame = TentMap, lib = "1 100",
pred = "201 500", E = 2, columns = "TentMap", target = "TentMap",
showPlot = FALSE )
Test for nonlinear dynamics
Description
PredictNonlinear uses SMap to evaluate prediction accuracy as a function of the localisation parameter theta.
Usage
PredictNonlinear(pathIn = "./", dataFile = "", dataFrame = NULL, pathOut = "./", predictFile = "", lib = "", pred = "", theta = "", E = 1, Tp = 1, knn = 0, tau = -1, exclusionRadius = 0, columns = "", target = "", embedded = FALSE, verbose = FALSE, validLib = vector(), ignoreNan = TRUE, numThreads = 4, showPlot = TRUE, noTime = FALSE )
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
pathOut | path for |
predictFile | output file name. |
lib | string or vector with start and stop indices of input data rows used to create the library from observations. Multiple row index pairs can be specified with each pair defining the first and last rows of time series observation segments used to create the library. |
pred | string with start and stop indices of input data rows used for predictions. A single contiguous range is supported. |
theta | A whitespace delimited string with values of the S-map localisation parameter. An empty string will use default values of |
E | embedding dimension. |
Tp | prediction horizon (number of time column rows). |
knn | number of nearest neighbors. If knn=0, knn is set to the library size. |
tau | lag of time delay embedding specified as number of time column rows. |
exclusionRadius | excludes vectors from the search space of nearest neighbors if their relative time index is within exclusionRadius. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
embedded | logical specifying if the input data are embedded. |
verbose | logical to produce additional console reporting. |
validLib | logical vector the same length as the number of data rows. Any data row represented in this vector as FALSE will not be included in the library. |
ignoreNan | logical to internally redefine library to avoid nan. |
numThreads | number of parallel threads for computation. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
Details
The localisation parameter theta weights nearest neighbors according to exp(-theta D / D_avg), where D is the distance between the observation vector and neighbor, and D_avg is the mean distance. If theta = 0, weights are uniformly unity, corresponding to a global autoregressive model. As theta increases, neighbors in closer proximity to the observation are weighted more heavily.
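The weighting can be checked numerically with a few lines of base R. The function name smap_weights and the distances are illustrative assumptions, and this sketch covers only the weights, not the S-map regression itself.

```r
# Sketch of the neighbor weighting w = exp(-theta * D / D_avg).
# smap_weights is a hypothetical name; D holds illustrative distances.
smap_weights <- function(D, theta) {
  exp(-theta * D / mean(D))
}

D  <- c(0.5, 1.0, 1.5, 2.0)
w0 <- smap_weights(D, theta = 0)   # all weights equal to 1
w4 <- smap_weights(D, theta = 4)   # weights decay with distance
print(w0)
print(round(w4, 4))
```

With theta = 0 every neighbor contributes equally, while theta = 4 concentrates the weight on the nearest neighbors.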
Value
A data.frame with columns Theta, rho.
Examples
data(TentMapNoise)
theta.rho = PredictNonlinear( dataFrame = TentMapNoise, E = 2,
lib = "1 100", pred = "201 500", columns = "TentMap",
target = "TentMap", showPlot = FALSE )
SMap forecasting
Description
SMap performs time series forecasting based on localised (or global) nearest neighbor projection in the time series phase space as described in Sugihara 1994.
Usage
SMap(pathIn = "./", dataFile = "", dataFrame = NULL, lib = "", pred = "", E = 0, Tp = 1, knn = 0, tau = -1, theta = 0, exclusionRadius = 0, columns = "", target = "", embedded = FALSE, verbose = FALSE, validLib = vector(), ignoreNan = TRUE, generateSteps = 0, parameterList = FALSE, showPlot = FALSE, noTime = FALSE)
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
lib | string or vector with start and stop indices of input data rows used to create the library from observations. Multiple row index pairs can be specified with each pair defining the first and last rows of time series observation segments used to create the library. |
pred | string with start and stop indices of input data rows used for predictions. A single contiguous range is supported. |
E | embedding dimension. |
Tp | prediction horizon (number of time column rows). |
knn | number of nearest neighbors. If knn=0, knn is set to the library size. |
tau | lag of time delay embedding specified as number of time column rows. |
theta | neighbor localisation exponent. |
exclusionRadius | excludes vectors from the search space of nearest neighbors if their relative time index is within exclusionRadius. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
embedded | logical specifying if the input data are embedded. |
verbose | logical to produce additional console reporting. |
validLib | logical vector the same length as the number of data rows. Any data row represented in this vector as FALSE will not be included in the library. |
ignoreNan | logical to internally redefine library to avoid nan. |
generateSteps | number of predictive feedback generative steps. |
parameterList | logical to add list of invoked parameters. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
Details
If embedded is FALSE, the data column(s) are embedded to dimension E with time lag tau. This embedding forms an n-columns * E-dimensional phase space for the SMap projection. If embedded is TRUE, the data are assumed to contain an E-dimensional embedding with E equal to the number of columns. See the Note below for proper use of multivariate data (number of columns > 1).
If ignoreNan is TRUE, the library (lib) is internally redefined to exclude nan embedding vectors. If ignoreNan is FALSE, no library adjustment is made. The library (lib) can be explicitly specified to exclude nan library vectors.
Predictions are made using leave-one-out cross-validation, i.e. observation rows are excluded from the prediction regression.
In contrast to Simplex, SMap uses all available neighbors and weights them with an exponential decay in phase space distance with exponent theta. theta=0 uses all neighbors corresponding to a global autoregressive model. As theta increases, neighbors closer in vicinity to the observation are considered.
Value
A named list with three data.frames [[predictions, coefficients, singularValues]]. predictions has columns Observations, Predictions. The first column contains time or index values.
coefficients data.frame has time or index values in the first column. Columns 2 through E+2 (E+1 columns) are the SMap coefficients.
singularValues data.frame has time or index values in the first column. Columns 2 through E+2 (E+1 columns) are the SVD singularValues. The first value corresponds to the SVD bias (intercept) term.
If parameterList = TRUE, a named list "parameters" is added.
Note
SMap should be called with columns explicitly corresponding to dimensions E. In the univariate case (number of columns = 1) with default embedded = FALSE, the time series will be time-delay embedded to dimension E, and the SMap coefficients correspond to each dimension.
If a multivariate data set is used (number of columns > 1) it must use embedded = TRUE with E equal to the number of columns. This prevents the function from internally time-delay embedding the multiple columns to dimension E. If the internal time-delay embedding is performed, then state-space columns will not correspond to the intended dimensions in the matrix inversion, coefficient assignment, and prediction. In the multivariate case, the user should first prepare the embedding (using Embed for time-delay embedding), then pass this embedding to SMap with appropriately specified columns, E, and embedded = TRUE.
References
Sugihara G. 1994. Nonlinear forecasting for the classification of natural time series. Philosophical Transactions: Physical Sciences and Engineering, 348 (1688):477-495.
Examples
data(circle)
L = SMap( dataFrame = circle, lib="1 100", pred="110 190", theta = 4,
E = 2, embedded = TRUE, columns = "x y", target = "x" )
Simplex forecasting
Description
Simplex performs time series forecasting based on weighted nearest neighbors projection in the time series phase space as described in Sugihara and May.
Usage
Simplex(pathIn = "./", dataFile = "", dataFrame = NULL, pathOut = "./", predictFile = "", lib = "", pred = "", E = 0, Tp = 1, knn = 0, tau = -1, exclusionRadius = 0, columns = "", target = "", embedded = FALSE, verbose = FALSE, validLib = vector(), generateSteps = 0, parameterList = FALSE, showPlot = FALSE, noTime = FALSE)
Arguments
pathIn | path to |
dataFile | .csv format data file name. The first column must be a time index or time values unless noTime is TRUE. The first row must be column names. |
dataFrame | input data.frame. The first column must be a time index or time values unless noTime is TRUE. The columns must be named. |
pathOut | path for |
predictFile | output file name. |
lib | string or vector with start and stop indices of input data rows used to create the library from observations. Multiple row index pairs can be specified with each pair defining the first and last rows of time series observation segments used to create the library. |
pred | string with start and stop indices of input data rows used for predictions. A single contiguous range is supported. |
E | embedding dimension. |
Tp | prediction horizon (number of time column rows). |
knn | number of nearest neighbors. If knn=0, knn is set to E+1. |
tau | lag of time delay embedding specified as number of time column rows. |
exclusionRadius | excludes vectors from the search space of nearest neighbors if their relative time index is within exclusionRadius. |
columns | string of whitespace separated column name(s), or vector of column names used to create the library. If individual column names contain whitespace, place the names in a vector, or append ',' to the name. |
target | column name used for prediction. |
embedded | logical specifying if the input data are embedded. |
verbose | logical to produce additional console reporting. |
validLib | logical vector the same length as the number of data rows. Any data row represented in this vector as FALSE will not be included in the library. |
generateSteps | number of predictive feedback generative steps. |
parameterList | logical to add list of invoked parameters. |
showPlot | logical to plot results. |
noTime | logical to allow input data with no time column. |
Details
If embedded is FALSE, the data column(s) are embedded to dimension E with time lag tau. This embedding forms an E-dimensional phase space for the Simplex projection. If embedded is TRUE, the data are assumed to contain an E-dimensional embedding with E equal to the number of columns. Predictions are made using leave-one-out cross-validation, i.e. observation vectors are excluded from the prediction simplex.
To assess an optimal embedding dimension, EmbedDimension can be applied. Accuracy statistics can be estimated by ComputeError.
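For intuition, a single simplex-style forecast can be sketched in base R. The function simplex_sketch is hypothetical and is not how rEDM is implemented. It assumes a univariate series, a lag of one row, knn = E + 1 neighbors, and neighbor weights exp(-d / d_min), a commonly used choice that this manual does not itself specify.

```r
# Hypothetical helper (not part of rEDM): forecast x[t + Tp] from the
# E + 1 nearest neighbors of the state vector at time t.
simplex_sketch <- function(x, E, t, Tp = 1) {
  n <- length(x)
  stopifnot(t >= E, t + Tp <= n)
  # State vector at time i: (x[i], x[i-1], ..., x[i-E+1])
  state <- function(i) x[i - 0:(E - 1)]
  # Library: times with a full state vector and a known future value,
  # excluding the query time itself (leave-one-out).
  lib <- setdiff(E:(n - Tp), t)
  stopifnot(length(lib) >= E + 1)
  d  <- sapply(lib, function(i) sqrt(sum((state(i) - state(t))^2)))
  nn <- order(d)[seq_len(E + 1)]        # indices of the E + 1 nearest neighbors
  d_min <- max(d[nn][1], 1e-12)         # guard against a zero distance
  w <- exp(-d[nn] / d_min)              # assumed weighting scheme
  sum(w * x[lib[nn] + Tp]) / sum(w)     # weighted mean of neighbor futures
}

x    <- sin(seq(0, 20, by = 0.2))       # illustrative deterministic series
pred <- simplex_sketch(x, E = 2, t = 50)
c(predicted = pred, observed = x[51])
```

For a smooth periodic series like this one, the forecast should land close to the observed value, because neighbors from other cycles follow nearly the same trajectory.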
Value
A data.frame with columns Observations, Predictions. The first column contains the time values.
If parameterList = TRUE, a named list with "predictions" holding the data.frame, "parameters" with a named list of invoked parameters.
References
Sugihara G. and May R. 1990. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344:734-741.
Examples
data( block_3sp )
smplx = Simplex( dataFrame = block_3sp, lib = "1 100", pred = "101 190",
E = 3, columns = "x_t", target = "x_t" )
ComputeError( smplx $ Observations, smplx $ Predictions )
Generate surrogate data for permutation/randomization tests
Description
SurrogateData generates surrogate data under several different null models.
Usage
SurrogateData( ts, method = c("random_shuffle", "ebisuzaki", "seasonal"), num_surr = 100, T_period = 1, alpha = 0 )
Arguments
ts | the original time series |
method | which algorithm to use to generate surrogate data |
num_surr | the number of null surrogates to generate |
T_period | the period of seasonality for seasonal surrogates (ignored for other methods) |
alpha | additive noise factor: N(0,alpha) |
Details
Method "random_shuffle" creates surrogates by randomly permuting the values of the original time series.
Method "ebisuzaki" creates surrogates by randomizing the phases of a Fourier transform, preserving the power spectra of the null surrogates.
Method "seasonal" creates surrogates by computing a mean seasonal trend of the specified period and shuffling the residuals. It is presumed that the seasonal trend can be extracted with a smoothing spline. Additive Gaussian noise is included according to N(0,alpha).
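The simplest of these null models, the random shuffle, can be sketched in base R. The helper shuffle_surrogates is hypothetical and not the package's implementation, and the input series is illustrative.

```r
# Hypothetical helper (not part of rEDM): random-shuffle surrogates.
# Each column of the result is an independent random permutation of ts.
# Assumes length(ts) > 1, since sample() treats a single number differently.
shuffle_surrogates <- function(ts, num_surr = 100) {
  stopifnot(length(ts) > 1)
  sapply(seq_len(num_surr), function(i) sample(ts))
}

set.seed(1)                              # for reproducible permutations
ts   <- c(0.3, 1.2, -0.7, 2.1, 0.0, -1.4)
surr <- shuffle_surrogates(ts, num_surr = 5)
dim(surr)
```

Shuffling preserves the distribution of values while destroying temporal ordering, so a statistic computed on the original series can be compared against its distribution over the surrogate columns.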
Value
A matrix where each column is a separate surrogate with the same length as ts.
Examples
data("block_3sp")
ts <- block_3sp$x_t
SurrogateData(ts, method = "ebisuzaki")
Time series for a tent map with mu = 2.
Description
First-differenced time series generated from the tent map recurrence relation with mu = 2.
Usage
TentMap
Format
Data frame with 999 rows and 2 columns
Time | time index. |
TentMap | tent map values. |
Time series of tent map plus noise.
Description
First-differenced time series generated from the tent map recurrence relation with mu = 2 and random noise.
Usage
TentMapNoise
Format
Data frame with 999 rows and 2 columns
Time | time index. |
TentMap | tent map values. |
Apple-blossom Thrips time series
Description
Seasonal outbreaks of Thrips imaginis.
References
Davidson and Andrewartha, Annual trends in a natural population of Thrips imaginis (Thysanoptera), Journal of Animal Ecology, 17, 193-199, 1948.
Time series for a three-species coupled model.
Description
Time series generated from a discrete-time coupled Lotka-Volterra model exhibiting chaotic dynamics.
Usage
block_3sp
Format
A data frame with 198 rows and 10 columns:
time | time index (# of generations) |
x_t | abundance of simulated species x at time t |
x_t-1 | abundance of simulated species x at time t-1 |
x_t-2 | abundance of simulated species x at time t-2 |
y_t | abundance of simulated species y at time t |
y_t-1 | abundance of simulated species y at time t-1 |
y_t-2 | abundance of simulated species y at time t-2 |
z_t | abundance of simulated species z at time t |
z_t-1 | abundance of simulated species z at time t-1 |
z_t-2 | abundance of simulated species z at time t-2 |
2-D timeseries of a circle.
Description
Time series of a circle in 2-D (sin and cos).
Usage
circle
Format
A data frame with 200 rows and 3 columns:
Time | time index. |
x | sin component. |
y | cos component. |
Time series for the Paramecium-Didinium laboratory experiment
Description
Time series of Paramecium and Didinium abundances (#/mL) from an experiment by Veilleux (1979)
Usage
paramecium_didinium
Time series for the California Current Anchovy-Sardine-SST system
Description
Time series of Pacific sardine landings (CA), Northern anchovy landings (CA), and sea-surface temperature (3-year average) at the SIO pier and Newport pier
Usage
sardine_anchovy_sst
Format
year | year of measurement |
anchovy | anchovy landings, scaled to mean = 0, sd = 1 |
sardine | sardine landings, scaled to mean = 0, sd = 1 |
sio_sst | 3-year running average of sea surface temperature at SIO pier, scaled to mean = 0, sd = 1 |
np_sst | 3-year running average of sea surface temperature at Newport pier, scaled to mean = 0, sd = 1 |