CRAN Task View: Analysis of Ecological and Environmental Data

Maintainer: Gavin L. Simpson
Contact: ucfagls at gmail.com
Version: 2023-12-18
URL: https://CRAN.R-project.org/view=Environmetrics
Source: https://github.com/cran-task-views/Environmetrics/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Gavin L. Simpson (2023). CRAN Task View: Analysis of Ecological and Environmental Data. Version 2023-12-18. URL https://CRAN.R-project.org/view=Environmetrics.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("Environmetrics", coreOnly = TRUE) installs all the core packages or ctv::update.views("Environmetrics") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details.

Introduction

This Task View contains information about using R to analyse ecological and environmental data.

The base version of R ships with a wide range of functions for use within the field of environmetrics. This functionality is complemented by a plethora of packages available via CRAN, which provide specialist methods such as ordination & cluster analysis techniques. A brief overview of the available packages is provided in this Task View, grouped by topic or type of analysis. As a testament to the popularity of R for the analysis of environmental and ecological data, a special volume of the Journal of Statistical Software was produced in 2007.

Those interested in environmetrics should consult the Spatial view. Complementary information is also available in the Cluster and SpatioTemporal task views.

If you have any comments or suggestions for additions or improvements, then please contact the maintainer or submit an issue or pull request in the GitHub repository linked above.

A list of available packages and functions is presented below, grouped by analysis type.

General packages

These packages are general, having wide applicability to the environmetrics field.

Modelling species responses and other data

Analysing species response curves or modelling other data often involves the fitting of standard statistical models to ecological data and includes simple (multiple) regression, Generalized Linear Models (GLM), extended regression (e.g. Generalized Least Squares [GLS]), Generalized Additive Models (GAM), and mixed effects models, amongst others.
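As a rough illustration of a species response curve fitted with a GLM, the sketch below simulates counts of a hypothetical species along a hypothetical pH gradient and fits a Poisson GLM with a quadratic term using only base R. The data, variable names and parameter values are assumptions made up for the example; the optimum and tolerance follow from rewriting the quadratic on the log scale as a Gaussian curve.

```r
# Simulated data: counts of a hypothetical species along a pH gradient.
set.seed(42)
ph <- runif(200, min = 4, max = 9)

# Assumed "true" Gaussian response: optimum 6.5, tolerance 0.8, maximum 20.
mu <- 20 * exp(-(ph - 6.5)^2 / (2 * 0.8^2))
abundance <- rpois(200, lambda = mu)

# Poisson GLM with log link and a quadratic term in the gradient.
fit <- glm(abundance ~ ph + I(ph^2), family = poisson(link = "log"))

# For log(mu) = b0 + b1 * x + b2 * x^2 with b2 < 0:
#   optimum   u = -b1 / (2 * b2)
#   tolerance t = 1 / sqrt(-2 * b2)
b <- unname(coef(fit))
optimum <- -b[2] / (2 * b[3])
tolerance <- 1 / sqrt(-2 * b[3])

print(c(optimum = optimum, tolerance = tolerance))
```

The same formula interface carries over to the other model classes named above, so a GAM or mixed model could be swapped in where a quadratic is too restrictive.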

Tree-based models

Tree-based models are being increasingly used in ecology, particularly for their ability to fit flexible models to complex data sets and the simple, intuitive output of the tree structure. Ensemble methods such as bagging, boosting and random forests are advocated for improving predictions from tree-based models and to provide information on uncertainty in regression models or classifiers.
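A minimal sketch of a single regression tree is given here, assuming the rpart package (listed among this view's packages and normally installed with R) is available. The environmental variables and the step-shaped response are invented for illustration only.

```r
library(rpart)  # recommended package, normally installed with R

set.seed(1)
n <- 300
env <- data.frame(
  depth = runif(n, 0, 10),  # hypothetical water depth (m)
  temp  = runif(n, 5, 25)   # hypothetical temperature (deg C)
)

# Step-shaped response: the species is abundant only in shallow, warm water.
env$abund <- ifelse(env$depth < 4 & env$temp > 15, 10, 2) + rnorm(n, sd = 0.5)

# Regression tree (method = "anova") with default control settings.
tree <- rpart(abund ~ depth + temp, data = env, method = "anova")

# Predictions for one shallow and one deep site at the same temperature.
new_sites <- data.frame(depth = c(2, 8), temp = c(20, 20))
pred <- predict(tree, newdata = new_sites)
print(tree)
```

Printing the fitted object lists the splits, which is the "simple, intuitive output" referred to above; ensemble methods trade that readability for better predictions.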

Univariate trees

Tree-structured models for regression, classification and survival analysis, following the ideas in the CART book, are implemented in

Multivariate trees

Multivariate trees are available in

Ensembles of trees

Ensemble techniques for trees:

Graphical tools for the visualization of trees are available in package maptree.

Packages mda and earth implement Multivariate Adaptive Regression Splines (MARS), a technique which provides a more flexible, tree-based approach to regression than the piecewise constant functions used in regression trees.

Ordination

R and add-on packages provide a wide range of ordination methods, many of which are specialized techniques particularly suited to the analysis of species data. The two main packages are ade4 and vegan. ade4 derives from the traditions of the French school of “Analyse des Données” and is based on the use of the duality diagram. vegan follows the approach of Mark Hill, Cajo ter Braak and others, though the implementation owes more to that presented in Legendre & Legendre (1998) Numerical Ecology, 2nd English Edition, Elsevier. Where the two packages provide duplicate functionality, users should choose whichever framework best suits their background.
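For readers without ade4 or vegan installed, a basic unconstrained ordination can be sketched in base R. The example below is not the ade4 or vegan interface; it runs a principal components analysis with prcomp() on Hellinger-transformed counts. The community matrix is simulated, and the single underlying gradient is an assumption of the example.

```r
set.seed(7)

# Hypothetical community matrix: 30 sites x 6 species, Poisson counts whose
# means follow a single underlying gradient.
gradient <- seq(-1, 1, length.out = 30)
lambda <- exp(outer(gradient, c(-2, -1, 0, 0.5, 1, 2)) + 1)
comm <- matrix(rpois(length(lambda), lambda), nrow = 30)

# Hellinger transformation: square root of relative abundances per site.
hel <- sqrt(comm / rowSums(comm))

# PCA of the transformed data (centred, not scaled).
pca <- prcomp(hel, center = TRUE, scale. = FALSE)
site_scores <- pca$x[, 1:2]
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

print(round(var_explained[1:2], 3))
```

With a single strong gradient, the first axis should recover the ordering of sites along it; real community data rarely behave this cleanly.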

Model-based multivariate analysis

Multivariate model-based methods follow typical statistical modelling principles, but for multivariate responses. Model-based ordination methods reduce the dimensionality of a model component (usually predictor effects or a random-effect covariance matrix), so that they share features with both ordination methods (the ordination itself) and regression (e.g., information criteria and residual diagnostics). They thus require specifying a response distribution and link function instead of a dissimilarity measure. Unlike “classical” ordination methods, the number of ordination axes usually has to be specified a priori, before fitting the model. The following packages have different features and functionalities, but most support creating ordinations.

Dissimilarity coefficients

Much ecological analysis proceeds from a matrix of dissimilarities between samples. A large amount of effort has been expended formulating a wide range of dissimilarity coefficients suitable for ecological data. A selection of the more useful coefficients are available in R and various contributed packages.

Standard functions that produce square, symmetric matrices of pair-wise dissimilarities include:

Function distance() in package analogue can be used to calculate dissimilarity between samples of one matrix and those of a second matrix. The same function can be used to produce pair-wise dissimilarity matrices, though the other functions listed above are faster. distance() can also be used to generate matrices based on Gower’s coefficient for mixed data (mixtures of binary, ordinal/nominal and continuous variables). Function daisy() in package cluster provides a faster implementation of Gower’s coefficient for mixed-mode data than distance() if a standard dissimilarity matrix is required. Function gowdis() in package FD also computes Gower’s coefficient and implements extensions to ordinal variables.
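The difference between a standard distance matrix and Gower's coefficient for mixed data can be sketched as follows, assuming the cluster package (normally installed with R) is available. The site table is hypothetical; dist() is base R and daisy() is the cluster function mentioned above.

```r
library(cluster)  # recommended package, normally installed with R

# Hypothetical site data mixing continuous, binary and nominal variables.
sites <- data.frame(
  ph       = c(5.1, 6.8, 7.2, 5.5),
  forested = factor(c("yes", "no", "no", "yes")),
  soil     = factor(c("peat", "clay", "sand", "peat"))
)

# Euclidean distances on the continuous variable only (base R).
d_euc <- dist(sites["ph"], method = "euclidean")

# Gower's coefficient averages per-variable dissimilarities: range-scaled
# absolute differences for numeric columns, simple mismatch for factors.
d_gow <- daisy(sites, metric = "gower")
m <- as.matrix(d_gow)

print(round(m, 3))
```

Sites 1 and 4 share both factor levels, so their Gower dissimilarity comes from the pH difference alone, averaged over the three variables.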

Cluster analysis

Cluster analysis aims to identify groups of samples within multivariate data sets. A large range of approaches to this problem have been suggested, but the main techniques are hierarchical cluster analysis, partitioning methods, such as k-means, and finite mixture models or model-based clustering. In the machine learning literature, cluster analysis is an unsupervised learning problem.

The Cluster task view provides a more detailed discussion of available cluster analysis methods and appropriate R functions and packages.
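Two of the techniques named above, hierarchical clustering and k-means, are available in base R. The sketch below applies both to simulated data with two well-separated groups; the data and the choice of average linkage are assumptions made for the example.

```r
set.seed(123)

# Two hypothetical groups of 20 samples each, measured on two variables.
x <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2)
)
truth <- rep(1:2, each = 20)

# Hierarchical cluster analysis: average linkage on Euclidean distances,
# then cut the dendrogram into two groups.
hc <- hclust(dist(x), method = "average")
groups_hc <- cutree(hc, k = 2)

# Partitioning: k-means with several random starts.
km <- kmeans(x, centers = 2, nstart = 10)
groups_km <- km$cluster

# Cross-tabulate recovered groups against the simulated ones.
print(table(truth, groups_hc))
print(table(truth, groups_km))
```

Cluster labels are arbitrary, so agreement shows up as one non-zero cell per row of each table, not necessarily on the diagonal.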

Hierarchical cluster analysis:

Partitioning methods:

Mixture models and model-based cluster analysis:

Ecological theory

There is a growing number of packages and books that focus on the use of R for theoretical ecological models.

Population dynamics

Estimating animal abundance and related parameters

This section concerns estimation of population parameters (population size, density, survival probability, site occupancy etc.) by methods that allow for incomplete detection. Many of these methods use data on marked animals, variously called ‘capture-recapture’, ‘mark-recapture’ or ‘capture-mark-recapture’ data.

Package secr can also be used to simulate data from the respective models.

See also the SpatioTemporal task view for analysis of animal tracking data under Moving objects, trajectories.
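The idea of correcting for incomplete detection can be illustrated with the simplest two-sample mark-recapture case. The sketch below does not use any of the packages in this section; it computes the classical Lincoln-Petersen estimate and Chapman's bias-corrected version in base R, with invented sample sizes and the usual closed-population assumptions.

```r
# Hypothetical two-sample study of a closed population.
n1 <- 120  # animals caught, marked and released in the first sample
n2 <- 150  # animals caught in the second sample
m2 <- 30   # marked animals among those caught in the second sample

# Lincoln-Petersen estimator: the marked fraction in sample 2 estimates n1 / N.
N_lp <- n1 * n2 / m2

# Chapman's bias-corrected estimator and its approximate variance.
N_chapman <- (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
var_chapman <- (n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2) /
  ((m2 + 1)^2 * (m2 + 2))

# Approximate 95% confidence interval based on a normal approximation.
ci <- N_chapman + c(-1, 1) * qnorm(0.975) * sqrt(var_chapman)

print(c(lincoln_petersen = N_lp, chapman = N_chapman))
print(ci)
```

Real studies usually involve more capture occasions, heterogeneous detection or spatial structure, which is where dedicated packages become necessary.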

Modelling population growth rates:
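As a small worked example of a population growth rate, the sketch below computes the asymptotic growth rate and stable age distribution of an age-structured (Leslie) matrix model using base R's eigen(). It does not use any package from this view, and the fecundity and survival values are made up.

```r
# Hypothetical Leslie matrix for three age classes:
# first row = fecundities, sub-diagonal = survival probabilities.
A <- matrix(c(0,   1.5, 2.0,
              0.5, 0,   0,
              0,   0.8, 0),
            nrow = 3, byrow = TRUE)

ev <- eigen(A)

# The dominant eigenvalue is real and positive for this kind of matrix;
# it gives the asymptotic growth rate (lambda > 1 means growth).
i <- which.max(Re(ev$values))
lambda <- Re(ev$values[i])

# The matching right eigenvector, rescaled to sum to one, is the
# stable age distribution.
stable_age <- Re(ev$vectors[, i])
stable_age <- stable_age / sum(stable_age)

print(lambda)
print(round(stable_age, 3))
```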

Environmental time series

Additionally, a fuller description of available packages for time series analysis can be found in the TimeSeries task view.

Spatial data analysis

See the Spatial CRAN Task View for an overview of spatial analysis in R.

Extreme values

ismev provides functions for models for extreme value statistics and is support software for Coles (2001) An Introduction to Statistical Modelling of Extreme Values, Springer, New York. Other packages for extreme value theory include

See also the ExtremeValue task view for further information.

Phylogenetics and evolution

Packages specifically tailored for the analysis of phylogenetic and evolutionary data include:

UseRs may also be interested in Paradis (2006) Analysis of Phylogenetics and Evolution with R, Springer, New York, a book in the “Use R!” book series from Springer.

Soil science

Several packages are now available that implement R functions for widely-used methods and approaches in pedology.

Hydrology and Oceanography

A growing number of packages are available that implement methods specifically related to the fields of hydrology and oceanography. Also see the Extreme Value and the Climatology sections for related packages.

Climatology

Several packages relate to the field of climatology.

Palaeoecology and stratigraphic data

Several packages now provide specialist functionality for the import, analysis, and plotting of palaeoecological data.

Other packages

Several other relevant contributed packages for R are available that do not fit under nice headings.

CRAN packages

Core: ade4, cluster, labdsv, MASS, mgcv, vegan.
Regular: amap, analogue, aod, ape, aqp, BiodiversityR, biogrowth, boral, boussinesq, bReeze, CircStats, circular, cocorresp, Distance, dsm, dyn, dynlm, e1071, earth, ecoCopula, ecodist, EnvStats, equivalence, evd, evdbayes, evir, extRemes, FD, flexmix, forecast, fso, gam, gamair, gjam, gllvm, glmmTMB, Hmsc, ipred, ismev, lme4, maptree, marked, mclust, mda, mefa, metacom, mrds, mvabund, mvgam, nlme, nsRFA, oce, openair, ouch, party, pastecs, pgirmess, PMCMRplus, popbio, prabclus, pscl, pvclust, qualV, quantreg, quantregGrowth, R2jags, randomForest, Rbeast, Rcapture, rioja, RMark, RMAWGEN, rpart, rtop, seacarb, seas, secr, segmented, sensitivity, simecol, singleRcapture, siplab, sjSDM, soiltexture, spOccupancy, StreamMetabolism, strucchange, surveillance, TMB, tseries, unmarked, untb, VGAM, zoo.
Archived: dse, topmodel.

Related links

Other resources

