Type:

Package

Title:

Build Graphs for Landscape Genetics Analysis

Version:

1.8.0

Maintainer:

Paul Savary <psavary@protonmail.com>

Description:

Build graphs for landscape genetics analysis. This set of functions can be used to import and convert spatial and genetic data initially in different formats, import landscape graphs created with 'GRAPHAB' software (Foltete et al., 2012) <doi:10.1016/j.envsoft.2012.07.002>, make diagnostic plots of isolation by distance relationships in order to choose how to build genetic graphs, create graphs with a large range of pruning methods, weight their links with several genetic distances, plot and analyse graphs, and compare them with other graphs. It uses functions from other packages such as 'adegenet' (Jombart, 2008) <doi:10.1093/bioinformatics/btn129> and 'igraph' (Csardi et Nepusz, 2006) <https://igraph.org/>. It also implements methods commonly used in landscape genetics to create graphs, described by Dyer et Nason (2004) <doi:10.1111/j.1365-294X.2004.02177.x> and Greenbaum et Fefferman (2017) <doi:10.1111/mec.14059>, and to analyse distance data (van Strien et al., 2015) <doi:10.1038/hdy.2014.62>.

Depends:

R (≥ 3.1.0)

License:

GPL-2

Encoding:

UTF-8

LazyData:

true

Imports:

adegenet, ggplot2, stringr, igraph, stats, spatstat.geom, spatstat.linnet, Matrix, vegan, utils, methods, pegas, MASS, tidyr, sp, sf, hierfstat, rappdirs, gdistance, raster, foreign, ecodist, Rdpack

Suggests:

knitr, rmarkdown

RdMacros:

Rdpack

RoxygenNote:

7.2.1

VignetteBuilder:

knitr, rmarkdown

NeedsCompilation:

Packaged:

2023-01-30 00:23:37 UTC; paul

Author:

Paul Savary [aut, cre], Gilles Vuidel [ctb], Tyler Rudolph [ctb], Alexandrine Daniel [ctb]

Repository:

CRAN

Date/Publication:

2023-01-30 14:00:05 UTC

Add attributes to the nodes of a graph

Description

The function adds attributes to the nodes of a graph from either an object of class data.frame or from a shapefile layer. The node IDs in the input objects must be the same as in the graph object.

Usage

add_nodes_attr(
  graph,
  input = "df",
  data,
  dir_path = NULL,
  layer = NULL,
  index = "Id",
  include = "all"
)

Arguments

graph

A graph object of class igraph.

input

A character string indicating the nature of the input data from which come the attributes to add to the nodes.

If 'input = "shp"', then attributes come from the attribute table of a shapefile layer of type point.
If 'input = "df"', then attributes come from an object of class data.frame.

In both cases, the input attribute table or data.frame must have a column with the exact same values as the node IDs.

data

(only if 'input = "df"') The name of the object of class data.frame with the attributes to add to the nodes.

dir_path

(only if 'input = "shp"') The path (character string) to the directory containing the shapefile layer of type point whose attribute table contains the attributes to add to the nodes.

layer

(only if 'input = "shp"') The name (character string) of the shapefile layer of type point (without extension, ex.: "nodes" refers to "nodes.shp" layer) whose attribute table contains the attributes to add to the nodes.

index

The name (character string) of the column with the node names in the input data (column of the attribute table or of the data.frame).

include

A character string (vector) indicating which columns of the input data will be added as node attributes. By default, 'include = "all"', i.e. every column of the input data is added. Alternatively, 'include' can be a vector with the names of the columns to add (ex.: "c('x', 'y', 'pop_name')").

Details

The graph can be created with the function graphab_to_igraph by importing output from Graphab projects. Values of the metrics computed at the node level with Graphab can then be added to such a graph with this function.

Value

A graph object of class igraph

Author(s)

P. Savary

Examples

data("data_tuto")
graph <- data_tuto[[3]]
df_nodes <- data.frame(Id = igraph::V(graph)$name,
                       Area = runif(50, min = 10, max = 60))
graph <- add_nodes_attr(graph,
                        data = df_nodes,
                        input = "df",
                        index = "Id",
                        include = "Area")

Check whether the option 'nomerge' was used when building the landscape graph with Graphab

Description

The function checks whether the option 'nomerge' was used when building the landscape graph with Graphab

Usage

check_merge(proj_end_path)

Arguments

proj_end_path

The path to the project .xml file.

Value

The function returns a logical value that depends on whether 'nomerge' was used: if nomerge=TRUE, it returns FALSE; if nomerge=FALSE, it returns TRUE.

Author(s)

P. Savary

Examples

## Not run:
proj_name <- "grphb_ex"
check_merge(proj_name = proj_name)
## End(Not run)

Compare two correlation coefficients obtained from different sample sizes

Description

The function compares two correlation coefficients obtained from different sample sizes using Z-Fisher transformation.

Usage

compar_r_fisher(data)

Arguments

data

An object of class data.frame with at least 4 columns of data used to perform the test. 4 columns must be called "n1", "n2", "r1" and "r2".

n1 and n2 are the sizes of the samples from which r1 and r2 were computed respectively.
r1 and r2 are Pearson's correlation coefficients.

Details

The Z-Fisher method consists in computing z scores from the correlation coefficients and comparing these z scores. z scores are computed as follows: let n1 and r1 be the sample size and the correlation coefficient, then
z1 = (1/2) * log((1 + r1) / (1 - r1))
and z2 is computed in the same way from r2. Then, a test statistic is computed from z1 and z2:
Z = (z1 - z2) / sqrt((1 / (n1 - 3)) + (1 / (n2 - 3)))
If Z is above the limit given by the alpha value, then the difference between r1 and r2 is significant.
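The two formulas above can be sketched in a few lines of base R. This is an illustrative sketch, not the package's implementation: the helper names fisher_z and compare_r are hypothetical, and the two-sided p-value is an assumption, since the text does not state how p is derived from Z.

```r
# Fisher z-transformation of a correlation coefficient
fisher_z <- function(r) 0.5 * log((1 + r) / (1 - r))

# Test statistic Z for the difference between r1 and r2.
# Assumption: the p-value is two-sided and taken from the standard normal.
compare_r <- function(n1, n2, r1, r2) {
  z1 <- fisher_z(r1)
  z2 <- fisher_z(r2)
  Z <- (z1 - z2) / sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  p <- 2 * stats::pnorm(-abs(Z))
  data.frame(n1, n2, r1, r2, z1, z2, Z, p)
}

res <- compare_r(n1 = 85, n2 = 60, r1 = 0.80, r2 = 0.60)
```

The term 1 / (n - 3) is the approximate variance of a z score, which is why coefficients from samples of different sizes can be compared on the same scale.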

Value

An object of class data.frame with the same columns as 'data' and 4 more columns: z1, z2 (respective z-scores), Z (test statistic) and p (p-value) of the test.

Author(s)

P. Savary

Examples

df <- data.frame(n1 = rpois(n = 40, lambda = 85),
                 n2 = rpois(n = 40, lambda = 60),
                 r1 = runif(n = 40, min = 0.6, max = 0.85),
                 r2 = runif(n = 40, min = 0.55, max = 0.75))
data <- compar_r_fisher(df)

Compute modules from a graph by maximising modularity

Description

The function computes modules from a graph by maximising modularity.

Usage

compute_graph_modul(
  graph,
  algo = "fast_greedy",
  node_inter = NULL,
  nb_modul = NULL
)

Arguments

graph

An object of class igraph. Its nodes must have names.

algo

A character string indicating the algorithm used to create the modules with igraph.

If algo = 'fast_greedy' (default), function cluster_fast_greedy from igraph is used (Clauset et al., 2004).
If algo = 'walktrap', function cluster_walktrap from igraph is used (Pons et Latapy, 2006) with 4 steps (default options).
If algo = 'louvain', function cluster_louvain from igraph is used (Blondel et al., 2008). In that case, the number of modules created in each graph is imposed.
If algo = 'optimal', function cluster_optimal from igraph is used (Brandes et al., 2008) (can be very long). In that case, the number of modules created in each graph is imposed.

node_inter

(optional, default = NULL) A character string indicating whether the links of the graph are weighted by distances or by similarity indices. It is only used to compute the modularity index. It can be:

'distance': Link weights correspond to distances. Nodes that are close to each other will more likely be in the same module.
'similarity': Link weights correspond to similarity indices. Nodes that are similar to each other will more likely be in the same module. Inverse link weights are then used to compute the modularity index.

nb_modul

(optional, default = NULL) A numeric or integer value indicating the number of modules in the graph. When this number is not specified, the optimal value is retained.

Value

A data.frame with the node names and the corresponding module ID.

Author(s)

P. Savary

Examples

data("data_tuto")
mat_gen <- data_tuto[[1]]
graph <- gen_graph_thr(mat_w = mat_gen, mat_thr = mat_gen,
                       thr = 0.8)
res_mod <- compute_graph_modul(graph = graph,
                               algo = "fast_greedy",
                               node_inter = "distance")

Compute graph-theoretic metrics from a graph at the node level

Description

The function computes graph-theoretic metric values at the node level.

Usage

compute_node_metric(
  graph,
  metrics = c("deg", "close", "btw", "str", "siw", "miw"),
  weight = TRUE
)

Arguments

graph

An object of class igraph. Its nodes must have names.

metrics

Character vector specifying the graph-theoretic metrics computed at the node level in the graphs. Graph-theoretic metrics can be:

Degree (metrics = c("deg", ...))
Closeness centrality index (metrics = c("close",...))
Betweenness centrality index (metrics = c("btw",...))
Strength (sum of the weights of the links connected to a node) (metrics = c("str",...))
Sum of the inverse weights of the links connected to a node (metrics = c("siw", ...), default)
Mean of the inverse weights of the links connected to a node (metrics = c("miw", ...))

By default, the vector metrics includes all these metrics.

weight

Logical which indicates whether the links are weighted during the calculation of the centrality indices betweenness and closeness (default: weight = TRUE). Link weights are interpreted as distances when computing the shortest paths. They should then be inversely proportional to the strength of the relationship between nodes (e.g. to fluxes).

Value

A data.frame with the node names and the metrics computed.
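To make the weight-based metrics concrete, here is a minimal base R sketch that computes the degree, the strength, and the sum and mean of inverse weights from a small weighted adjacency matrix. It only illustrates the definitions given in the Arguments section and does not call the package or igraph; the toy matrix and the convention that a weight of 0 means "no link" are assumptions of the sketch.

```r
# Toy weighted, undirected graph stored as a symmetric matrix.
# Assumption of this sketch: a weight of 0 means that there is no link.
ids <- c("A", "B", "C")
w <- matrix(c(0, 2, 4,
              2, 0, 0,
              4, 0, 0),
            nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

linked <- w > 0
inv_w <- ifelse(linked, 1 / w, 0)    # inverse weights, 0 where there is no link

deg <- rowSums(linked)               # degree: number of links per node
res <- data.frame(
  ID  = rownames(w),
  deg = deg,
  str = rowSums(w),                  # strength: sum of link weights
  siw = rowSums(inv_w),              # sum of inverse weights
  miw = rowSums(inv_w) / deg         # mean of inverse weights
)
```

Closeness and betweenness are left out because they require shortest paths, which the package delegates to igraph.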

Author(s)

P. Savary

Examples

data(data_ex_genind)
mat_gen <- mat_gen_dist(x = data_ex_genind, dist = "DPS")
graph <- gen_graph_thr(mat_w = mat_gen, mat_thr = mat_gen,
                       thr = 0.8)
res_met <- compute_node_metric(graph)

Fit a model to convert Euclidean distances into cost-distances

Description

The function fits a model to convert Euclidean distances into cost-distances, as implemented in Graphab software.

Usage

convert_cd(
  mat_euc,
  mat_ld,
  to_convert,
  method = "log-log",
  fig = TRUE,
  line_col = "black",
  pts_col = "#999999"
)

Arguments

mat_euc

A symmetric matrix or dist object with pairwise geographical Euclidean distances between populations or sample sites. It will be the explanatory variable, and only values from the off-diagonal lower triangle will be used.

mat_ld

A symmetric matrix or dist object with pairwise landscape distances between populations or sample sites. These distances can be cost-distances or resistance distances, among others. It will be the explained variable, and only values from the off-diagonal lower triangle will be used.

to_convert

A numeric value or numeric vector with Euclidean distances to convert into cost-distances.

method

A character string indicating the method used to fit the model.

If 'method = "log-log"' (default), then the model takes the following form: log(ld) ~ A + B * log(euc)
If 'method = "lm"', then the model takes the following form: ld ~ A + B * euc

fig

Logical (default = TRUE) indicating whether a figure is plotted

line_col

(if 'fig = TRUE') Character string indicating the color used to plot the line (default: "black"). It must be a hexadecimal color code or a color used by default in R.

pts_col

(if 'fig = TRUE') Character string indicating the color used to plot the points (default: "#999999"). It must be a hexadecimal color code or a color used by default in R.

Details

IDs in 'mat_euc' and 'mat_ld' must be the same and refer to the same sampling sites or populations, and both matrices must be ordered in the same way.
The matrix of Euclidean distances 'mat_euc' can be computed using the function mat_geo_dist.
The matrix of landscape distances 'mat_ld' can be computed using the function mat_cost_dist.
Before the log calculation, 0 distance values are converted into 1, so that they are 0 after this calculation.
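The log-log model can be reproduced with lm() from base R. The sketch below is an illustration under stated assumptions rather than the function's actual code: the distance values are made up, and the vectors euc and ld stand in for the lower-triangle values of 'mat_euc' and 'mat_ld'.

```r
# Hypothetical pairwise distances (lower-triangle values as plain vectors)
euc <- c(1000, 2500, 4000, 8000, 12000)
ld  <- c(1500, 4200, 7000, 15000, 24000)

# Zero distances are set to 1 so that their logarithm is 0
euc[euc == 0] <- 1
ld[ld == 0] <- 1

# Fit log(ld) ~ A + B * log(euc)
fit <- lm(log(ld) ~ log(euc))
A <- unname(coef(fit)[1])
B <- unname(coef(fit)[2])
r2 <- summary(fit)$r.squared

# Predict landscape distances for new Euclidean distances
to_convert <- c(3000, 6000)
converted <- exp(A + B * log(to_convert))
```

Back-transforming with exp() returns the predictions to the original distance scale. With 'method = "lm"', the same steps would apply without the log and exp calls.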

Value

A list of output (converted values, estimated parameters, R2) and optionally a ggplot2 object to plot

Author(s)

P. Savary

References

Foltête J, Clauzel C, Vuidel G (2012). “A software tool dedicated to the modelling of landscape networks.” Environmental Modelling & Software, 38, 316–327.

Examples

data("data_tuto")
mat_ld <- data_tuto[[2]][1:10, 1:10] * 1000
mat_euc <- data_tuto[[1]][1:10, 1:10] * 50000
to_convert <- c(30000, 40000)
res <- convert_cd(mat_euc = mat_euc,
                  mat_ld = mat_ld,
                  to_convert = to_convert, fig = FALSE)

data_ex_genind genetic dataset

Description

Genetic dataset from genetic simulation on CDPOP
200 individuals, 10 populations
20 microsatellite loci (3 digits coding)
100 generations simulated

Usage

data_ex_genind

Format

An object of type 'genind'

Details

The simulation was made with CDPOP during 100 generations. Dispersal was possible between the 10 populations. Its probability depended on the cost distance between populations, calculated on a simulated resistance surface (raster). Mutations were not possible. There were initially 600 alleles in total (many disappeared because of drift). Population size stayed constant, with a sex-ratio of 1. Generations did not overlap. This simulation includes a part of stochasticity and these data result from only 1 simulation run.

References

Landguth EL, Cushman SA (2010). “CDPOP: a spatially explicit cost distance population genetics program.” Molecular Ecology Resources, 10(1), 156–161.

Examples

data("data_ex_genind")
length(unique(data_ex_genind@pop))

data_ex_gstud genetic dataset

Description

Genetic dataset from genetic simulation on CDPOP
200 individuals, 10 populations
20 microsatellite loci (3 digits coding)
100 generations simulated

Usage

data_ex_gstud

Format

A 'data.frame' with columns:

ID: Individual ID
POP: Population name
LOCI-1 to LOCI-20: 20 loci columns with microsatellite data with 3 digits coding, alleles separated by ":", and blank missing data (class 'locus' from gstudio)

Examples

data("data_ex_gstud")
str(data_ex_gstud)
length(unique(data_ex_gstud$POP))

data_ex_loci genetic dataset

Description

Genetic dataset from genetic simulation on CDPOP
200 individuals, 10 populations
20 microsatellite loci (3 digits coding)
100 generations simulated

Usage

data_ex_loci

Format

An object of class 'loci' and 'data.frame' with the columns :

population: Population name
Other columns: 20 loci columns with microsatellite data with 3 digits coding, alleles separated by "/", and missing data noted "NA/NA"

Row names correspond to individuals' ID

Examples

data("data_ex_loci")
length(unique(data_ex_loci$population))

data_simul_genind genetic dataset

Description

Genetic dataset from genetic simulation on CDPOP
1500 individuals, 50 populations
20 microsatellite loci (3 digits coding)
50 generations simulated

Usage

data_simul_genind

Format

An object of type 'genind'

Details

The simulation was made with CDPOP during 50 generations. Dispersal was possible between the 50 populations. Its probability depended on the cost distance between populations, calculated on a simulated resistance surface (raster). Mutations were not possible. There were initially 600 alleles in total (many disappeared because of drift). Population size stayed constant, with a sex-ratio of 1. Generations did not overlap. This simulation includes a part of stochasticity and these data result from only 1 simulation run.

References

Landguth EL, Cushman SA (2010). “CDPOP: a spatially explicit cost distance population genetics program.” Molecular Ecology Resources, 10(1), 156–161.

Examples

data("data_simul_genind")
length(unique(data_simul_genind@pop))

data_tuto : data used to generate the vignette

Description

Data used to generate the vignette

Usage

data_tuto

Format

Several outputs or inputs to show how the package works in a list

mat_dps: Genetic distance matrix example
mat_pg: Second genetic distance matrix example
graph_ci: Genetic independence graph example
dmc: Output of the function 'dist_max_corr'
land_graph: Landscape graph example
mat_ld: Landscape distance matrix example

Examples

data("data_tuto")
mat_dps <- data_tuto[[1]]
str(mat_dps)

Convert degrees to radians

Description

The function converts degrees to radians

Usage

deg2rad(deg)

Arguments

deg

A coordinate in degrees

Value

The coordinate in radians

Author(s)

P. Savary

Examples

deg2rad(40.75170)

Convert an edge-list data.frame into a pairwise matrix

Description

The function converts an edge-list data.frame into a symmetric pairwise matrix

Usage

df_to_pw_mat(data, from, to, value)

Arguments

data

An object of class data.frame

from

A character string indicating the name of the column with the ID of the origins

to

A character string indicating the name of the column with the ID of the arrivals

value

A character string indicating the name of the column with the values corresponding to each pair

Details

The matrix is a symmetric matrix. The data.frame must therefore not contain different values for the same pair listed in both directions (for example, one value for the pair 1-2 and another for the pair 2-1). Ideally, for a complete matrix, data should have n(n-1)/2 rows if values are computed between n objects.
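The conversion described above can be sketched in base R by filling both triangles of an empty matrix from the edge list. This is an illustrative sketch with made-up data, not the package's implementation; the column names from, to and value are arbitrary.

```r
# Hypothetical edge list with n(n-1)/2 = 3 rows for n = 3 objects
edges <- data.frame(from = c("a", "a", "b"),
                    to   = c("b", "c", "c"),
                    value = c(1.5, 2.0, 0.7),
                    stringsAsFactors = FALSE)

ids <- sort(unique(c(edges$from, edges$to)))
mat <- matrix(0, nrow = length(ids), ncol = length(ids),
              dimnames = list(ids, ids))

# A two-column character matrix indexes cells by row and column names.
# Both triangles are filled so that the matrix is symmetric.
mat[cbind(edges$from, edges$to)] <- edges$value
mat[cbind(edges$to, edges$from)] <- edges$value
```

If the same pair appeared twice with different values, the second assignment would silently overwrite the first, which is why such input should be avoided.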

Value

A pairwise matrix

Author(s)

P. Savary

Examples

data(pts_pop_simul)
suppressWarnings(mat_geo <- mat_geo_dist(pts_pop_simul,
                 ID = "ID",
                 x = "x",
                 y = "y"))
g <- gen_graph_topo(mat_w = mat_geo,
                    mat_topo = mat_geo,
                    topo = "comp")
df <- data.frame(igraph::as_edgelist(g))
df$w <- igraph::E(g)$weight
df_to_pw_mat(df, from = "X1", to = "X2", value = "w")

Calculate the Great-Circle distance between two points using the Haversine formula (hvs)

Description

The function calculates the Great-Circle distance between two points specified by radian latitude/longitude using the Haversine formula (hvs)

Usage

dist_gc_hvs(long1, lat1, long2, lat2)

Arguments

long1

Point 1 longitude in radians

lat1

Point 1 latitude in radians

long2

Point 2 longitude in radians

lat2

Point 2 latitude in radians

Value

The distance between points 1 and 2 in meters
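For reference, the Haversine formula itself can be written in a few lines of base R. The sketch below is not the package's code: the function names haversine and to_rad are hypothetical, and the mean Earth radius of 6,371,000 m is an assumption, since the value used by the package is not given here.

```r
# Haversine great-circle distance; inputs in radians, output in meters.
# Assumption: mean Earth radius of 6371 km.
haversine <- function(long1, lat1, long2, lat2, radius = 6371000) {
  dlat <- lat2 - lat1
  dlong <- long2 - long1
  a <- sin(dlat / 2)^2 + cos(lat1) * cos(lat2) * sin(dlong / 2)^2
  # pmin() guards against rounding errors pushing sqrt(a) above 1
  2 * radius * asin(pmin(1, sqrt(a)))
}

to_rad <- function(deg) deg * pi / 180

# Coordinates given in decimal degrees are converted to radians first
d <- haversine(to_rad(-73.99420), to_rad(40.75170),
               to_rad(-87.63940), to_rad(41.87440))
```

The formula treats the Earth as a sphere, so it is less accurate than an ellipsoidal method over long distances.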

Author(s)

P. Savary

Examples

dist_gc_hvs(long1 = -73.99420, lat1 = 40.75170,
            long2 = -87.63940, lat2 = 41.87440)

Calculate the Great-Circle distance between two points using the Spherical Law of Cosines (slc)

Description

The function calculates the Great-Circle distance between two points specified by radian latitude/longitude using the Spherical Law of Cosines (slc)

Usage

dist_gc_slc(long1, lat1, long2, lat2)

Arguments

long1

Point 1 longitude in radians

lat1

Point 1 latitude in radians

long2

Point 2 longitude in radians

lat2

Point 2 latitude in radians

Value

The distance between points 1 and 2 in meters

Author(s)

P. Savary

Examples

dist_gc_slc(long1 = -73.99420, lat1 = 40.75170,
            long2 = -87.63940, lat2 = 41.87440)

Calculate the Great-Circle distance between two points using the Vincenty inverse formula for ellipsoids (vicenty)

Description

The function calculates the Great-Circle distance between two points specified by radian latitude/longitude using the Vincenty inverse formula for ellipsoids (vicenty)

Usage

dist_gc_vicenty(long1, lat1, long2, lat2)

Arguments

long1

Point 1 longitude in radians

lat1

Point 1 latitude in radians

long2

Point 2 longitude in radians

lat2

Point 2 latitude in radians

Value

The distance between points 1 and 2 in meters

Author(s)

P. Savary

Examples

dist_gc_vicenty(long1 = -73.99420, lat1 = 40.75170,
                long2 = -87.63940, lat2 = 41.87440)

Compute the Great Circle distance between two points

Description

The function computes the Great Circle distance between two points defined by their longitudes and latitudes.

Usage

dist_great_circle(long1, long2, lat1, lat2, method = "vicenty")

Arguments

long1

Longitude of point 1

long2

Longitude of point 2

lat1

Latitude of point 1

lat2

Latitude of point 2

method

A character string indicating the formula used to compute the distance (default: "vicenty")

Author(s)

P. Savary

Examples

dist_great_circle(long1 = -73.99420,
                  lat1 = 40.75170,
                  long2 = -87.63940,
                  lat2 = 41.87440,
                  method = "vicenty")

Compute the distance at which the correlation between genetic distance and landscape distance is maximal

Description

The function computes the distance at which the correlation between genetic distance and landscape distance is maximal, using a method similar to that employed by van Strien et al. (2015). Distance threshold values are tested iteratively. For each value, all the population pairs separated by a landscape distance larger than the threshold are removed before the Mantel correlation coefficient between genetic distance and landscape distance is computed. The distance threshold at which the correlation is the strongest is then identified. A figure showing how the correlation coefficient changes as the landscape distance threshold increases is plotted.

Usage

dist_max_corr(
  mat_gd,
  mat_ld,
  interv,
  from = NULL,
  to = NULL,
  fig = TRUE,
  thr_gd = NULL,
  line_col = "black",
  pts_col = "#999999"
)

Arguments

mat_gd

A symmetric matrix or dist object with pairwise genetic distances between populations or sample sites.

mat_ld

A symmetric matrix or dist object with pairwise landscape distances between populations or sample sites. These distances can be Euclidean distances, cost-distances or resistance distances, among others.

interv

A numeric or integer value indicating the interval between the different distance thresholds for which the correlation coefficients are computed.

from

(optional) The minimum distance threshold value at which the correlation coefficient is computed.

to

(optional) The maximum distance threshold value at which the correlation coefficient is computed.

fig

Logical (default = TRUE) indicating whether a figure is plotted.

thr_gd

(optional) A numeric or integer value used to remove genetic distance values from the data before the calculation. All genetic distance values above 'thr_gd' are removed from the data. This parameter can be used especially when there are outliers.

line_col

(optional, if fig = TRUE) A character string indicating the color used to plot the line (default: "black"). It must be a hexadecimal color code or a color used by default in R.

pts_col

(optional, if fig = TRUE) A character string indicating the color used to plot the points (default: "#999999"). It must be a hexadecimal color code or a color used by default in R.

Details

IDs in 'mat_gd' and 'mat_ld' must be the same and refer to the same sampling sites or populations, and both matrices must be ordered in the same way.
The correlation coefficient computed between genetic distance and landscape distance is a Mantel correlation coefficient. If there are fewer than 50 pairwise values, the correlation is not computed, as in van Strien et al. (2015). Such a method can be subject to criticism from a strict statistical point of view, given that correlation coefficients computed from samples of different sizes are compared.
The matrix of genetic distance 'mat_gd' can be computed using mat_gen_dist.
The matrix of landscape distance 'mat_ld' can be computed using mat_geo_dist when the landscape distance needed is a Euclidean geographical distance.
Mantel correlation coefficients are computed using the function mantel.
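The threshold search can be sketched in base R as follows. This is a simplified illustration on simulated data, not the function's code: it uses cor() on the pairwise values, which gives the same coefficient as the Mantel r statistic but skips the permutation test, and all object names are hypothetical.

```r
set.seed(1)
n <- 15
# Hypothetical site coordinates and Euclidean landscape distances (105 pairs)
xy <- matrix(runif(2 * n, 0, 100), ncol = 2)
ld <- as.vector(dist(xy))
# Hypothetical genetic distances that increase with landscape distance
gd <- 0.002 * ld + rnorm(length(ld), sd = 0.02)

# Thresholds spaced by an interval of 10; the last one includes every pair
thresholds <- seq(from = 40, to = max(ld) + 10, by = 10)
min_pairs <- 50

# For each threshold, drop pairs farther apart than the threshold,
# then correlate the remaining values (NA if fewer than 50 pairs remain)
r_thr <- sapply(thresholds, function(thr) {
  keep <- ld <= thr
  if (sum(keep) < min_pairs) NA_real_ else cor(gd[keep], ld[keep])
})

best_thr <- thresholds[which.max(r_thr)]
```

which.max() ignores the NA entries, so thresholds with too few pairs cannot be selected.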

Value

A list of objects:

The distance at which the correlation is the highest.
The vector of correlation coefficients at the different distance thresholds
The vector of the different distance thresholds
A ggplot2 object to plot

Author(s)

P. Savary

References

Van Strien MJ, Holderegger R, Van Heck HJ (2015). “Isolation-by-distance in landscapes: considerations for landscape genetics.” Heredity, 114(1), 27.

Examples

data("data_tuto")
mat_gen <- data_tuto[[1]]
mat_dist <- data_tuto[[2]] * 1000
res_dmc <- dist_max_corr(mat_gd = mat_gen,
                         mat_ld = mat_dist,
                         from = 32000, to = 42000,
                         interv = 5000,
                         fig = FALSE)

Prune a graph using the 'percolation threshold' method

Description

The function prunes a graph by removing the links with the largest weights until the graph breaks into two components. The returned graph is the last graph with only one component.

Usage

g_percol(x, val_step = 20)

Arguments

x

A symmetric matrix or a dist object with pairwise distances between nodes

val_step

The number of classes to create to search for the threshold value without testing all the possibilities. By default, 'val_step = 20'.

Value

A graph object of type igraph
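The pruning principle can be illustrated with a short base R sketch. It is a simplification under stated assumptions: it tests every distinct weight in turn instead of searching by classes as 'val_step' does, it stores the graph as a logical adjacency matrix rather than an igraph object, and the helper is_connected is hypothetical.

```r
# TRUE if the graph defined by a logical adjacency matrix is connected
# (breadth-first search from the first node)
is_connected <- function(adj) {
  visited <- rep(FALSE, nrow(adj))
  visited[1] <- TRUE
  queue <- 1
  while (length(queue) > 0) {
    node <- queue[1]
    queue <- queue[-1]
    nb <- which(adj[node, ] & !visited)
    visited[nb] <- TRUE
    queue <- c(queue, nb)
  }
  all(visited)
}

# Hypothetical pairwise distances between 5 nodes
set.seed(42)
xy <- matrix(runif(10), ncol = 2)
d <- as.matrix(dist(xy))

# Remove links from the largest weight downwards and stop just before
# the graph would break into several components
w <- sort(unique(d[lower.tri(d)]), decreasing = TRUE)
adj <- d > 0
for (thr in w) {
  trial <- adj & d < thr
  if (!is_connected(trial)) break
  adj <- trial
}
```

After the loop, adj is the sparsest graph of this kind that is still connected: removing its largest remaining link would split it.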

Author(s)

P. Savary

Examples

data(data_ex_genind)
suppressWarnings(mat_w <- graph4lg::mat_geo_dist(data = pts_pop_ex,
                            ID = "ID",
                            x = "x",
                            y = "y"))
g_percol(x = mat_w)

Create an independence graph of genetic differentiation from genetic data of class genind

Description

The function creates genetic graphs from genetic data by applying the conditional independence principle. Populations whose allelic frequencies covary significantly once the covariance with the other populations has been taken into account are linked on the graphs.

Usage

gen_graph_indep(
  x,
  dist = "basic",
  cov = "sq",
  pcor = "magwene",
  alpha = 0.05,
  test = "EED",
  adj = "none",
  output = "igraph"
)

Arguments

x

An object of class genind that contains the multilocus genotype (format 'locus') of the individuals as well as their population and their geographical coordinates.

dist

A character string indicating the method used to compute the multilocus genetic distance between populations

If 'dist = "basic"' (default), then the multilocus genetic distance is computed using a Euclidean genetic distance formula (Excoffier et al., 1992)
If 'dist = "weight"', then the multilocus genetic distance is computed as in Fortuna et al. (2009). It is a Euclidean genetic distance giving more weight to rare alleles
If 'dist = "PG"', then the multilocus genetic distance is computed as in popgraph::popgraph function, following several steps of PCA and SVD (Dyer et Nason, 2004).
If 'dist = "PCA"', then the genetic distance is computed following a PCA of the matrix of allelic frequencies by population. It is a Euclidean genetic distance between populations in the multidimensional space defined by all the independent principal components.

cov

A character string indicating the formula used to compute the covariance matrix from the distance matrix

If 'cov = "sq"' (default), then the covariance matrix is calculated from the matrix of squared distances as in Everitt et Hothorn (2011)
If 'cov = "dist"', then the covariance matrix is calculated from the matrix of distances as in Dyer et Nason (2004) and popgraph function

pcor

A character string indicating the way the partial correlation matrix is computed from the covariance matrix.

If 'pcor = "magwene"', the steps followed are the same as in Magwene (2001) and in popgraph::popgraph function. It is the recommended option as it meets mathematical requirements.
If 'pcor = "other"', the steps followed are the same as used by Fortuna et al. (2009). They are not consistent with the approach of Magwene (2001).

alpha

A numeric value corresponding to the statistical tolerance threshold used to test the difference from 0 of the partial correlation coefficients. By default, 'alpha=0.05'.

test

A character string indicating the method used to test the significance of the partial correlation coefficients.

If 'test = "EED"' (default), then the Edge Exclusion Deviance criterion is used (Whittaker, 2009). Although other methods exist, this is the most common and thus the only one implemented here.

adj

A character string indicating how p-values are adjusted before their significance is assessed

If 'adj = "none"' (default), there is no p-value adjustment
If 'adj = "holm"', p-values are adjusted using the sequential Bonferroni correction (Holm, 1979)
If 'adj = "bonferroni"', p-values are adjusted using the classic Bonferroni correction
If 'adj = "BH"', p-values are adjusted using Benjamini et Hochberg (1995) correction controlling false discovery rate

output

A character string indicating the matrices included in the output list.

If 'output = "all"', then D (distance matrix), C (covariance matrix), Rho (partial correlation matrix), M (graph incidence matrix) and S (strength matrix) are included
If 'output = "dist_graph"', then the distance matrix D is returned only with the values corresponding to the graph edges
If 'output = "str_graph"', then the strength values matrix S is returned only with the values corresponding to the graph edges
If 'output = "inc"', then the binary adjacency matrix M is returned
If 'output = "igraph"' (default), then a graph of class igraph is returned

Details

The function makes it possible to vary many parameters such as the genetic distance used, the formula used to compute the covariance, the statistical tolerance threshold, the p-values adjustment, among others.
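The last steps of the procedure (partial correlations, significance test, p-value adjustment) can be sketched in base R. This is only an illustration of the general technique on simulated data, not the function's code: it starts from an ordinary covariance matrix rather than one derived from genetic distances, and it assumes the usual form of the edge exclusion deviance, -n * log(1 - rho^2), compared with a chi-square distribution with 1 degree of freedom.

```r
set.seed(3)
# Hypothetical data: 200 observations of 4 variables
n_obs <- 200
x <- matrix(rnorm(n_obs * 4), ncol = 4)
x[, 2] <- x[, 1] + rnorm(n_obs, sd = 0.5)   # variables 1 and 2 covary

# Partial correlations from the inverse of the covariance matrix
prec <- solve(cov(x))
rho <- -prec / sqrt(outer(diag(prec), diag(prec)))
diag(rho) <- 1

# Edge exclusion deviance and p-values (assumed chi-square with 1 df)
eed <- -n_obs * log(1 - rho^2)
p <- pchisq(eed[lower.tri(eed)], df = 1, lower.tail = FALSE)

# Optional adjustment for multiple testing, then keep significant edges
p_adj <- p.adjust(p, method = "holm")
edges <- which(lower.tri(rho), arr.ind = TRUE)[p_adj < 0.05, , drop = FALSE]
```

p.adjust() also accepts "bonferroni" and "BH", which correspond to the other adjustment options listed for 'adj'.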

Value

A list of objects of class matrix, an object of class matrix or a graph object of class igraph

Author(s)

P. Savary

References

Dyer RJ, Nason JD (2004). “Population graphs: the graph theoretic shape of genetic structure.” Molecular ecology, 13(7), 1713–1727.
Benjamini Y, Hochberg Y (1995). “Controlling the false discovery rate: a practical and powerful approach to multiple testing.” Journal of the royal statistical society. Series B (Methodological), 289–300.
Bowcock AM, Ruiz-Linares A, Tomfohrde J, Minch E, Kidd JR, Cavalli-Sforza LL (1994). “High resolution of human evolutionary trees with polymorphic microsatellites.” nature, 368(6470), 455–457.
Everitt B, Hothorn T (2011). An introduction to applied multivariate analysis with R. Springer.
Excoffier L, Smouse PE, Quattro JM (1992). “Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.” Genetics, 131(2), 479–491.
Fortuna MA, Albaladejo RG, Fernández L, Aparicio A, Bascompte J (2009). “Networks of spatial genetic variation across species.” Proceedings of the National Academy of Sciences, 106(45), 19044–19049.
Holm S (1979). “A simple sequentially rejective multiple test procedure.” Scandinavian journal of statistics, 65–70.
Magwene PM (2001). “New tools for studying integration and modularity.” Evolution, 55(9), 1734–1745.
Wermuth N, Scheidt E (1977). “Algorithm AS 105: fitting a covariance selection model to a matrix.” Journal of the Royal Statistical Society. Series C (Applied Statistics), 26(1), 88–92.
Whittaker J (2009). Graphical models in applied multivariate statistics. Wiley Publishing.

Examples

data(data_ex_genind)
dist_graph_test <- gen_graph_indep(x = data_ex_genind, dist = "basic",
                             cov = "sq", pcor = "magwene",
                             alpha = 0.05, test = "EED",
                             adj = "none", output = "igraph")

Create a graph of genetic differentiation using a link weight threshold

Description

The function constructs a genetic graph whose link weights are larger or lower than a specific threshold

Usage

gen_graph_thr(mat_w, mat_thr = NULL, thr, mode = "larger")

Arguments

mat_w

A symmetric (pairwise) matrix or a dist object whose elements will be the links' weights

mat_thr

(optional) A symmetric (pairwise) distance matrix or a dist object whose values will be used for the pruning based on the threshold value.

thr

The threshold value (logically between min(mat_thr) and max(mat_thr)) (integer or numeric)

mode

If mode = "larger" (default), all the links whose weight is larger than 'thr' are removed.
If mode = "lower", all the links whose weight is lower than 'thr' are removed.

Details

If 'mat_thr' is not defined, 'mat_w' is used for the pruning. Matrices 'mat_w' and 'mat_thr' must have the same dimensions and the same row and column names. Values in the 'mat_thr' matrix must be positive. Negative values from 'mat_w' are transformed into zeros. The function works only for undirected graphs. If dist objects are specified, it is assumed that colnames and row.names of mat_w and mat_thr refer to the same populations/locations.
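The pruning rule described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper `prune_by_threshold()`, the node names and the matrix values are all made up for illustration, and the result is a weighted adjacency matrix rather than an igraph object. Note that in this simplified representation a link with a weight of zero cannot be told apart from a missing link.

```r
# Toy symmetric matrices; node names and values are invented for illustration.
nodes <- c("A", "B", "C", "D")
mat_w <- matrix(c(0.0, 0.2, 0.5, 0.9,
                  0.2, 0.0, 0.3, 0.7,
                  0.5, 0.3, 0.0, 0.4,
                  0.9, 0.7, 0.4, 0.0),
                nrow = 4, dimnames = list(nodes, nodes))
mat_thr <- mat_w * 10000  # stand-in for a geographical distance matrix

# Hypothetical helper (not part of graph4lg): returns the weighted
# adjacency matrix left after the threshold-based pruning.
prune_by_threshold <- function(mat_w, mat_thr = NULL, thr, mode = "larger") {
  if (is.null(mat_thr)) mat_thr <- mat_w       # prune on the weights themselves
  stopifnot(identical(dimnames(mat_w), dimnames(mat_thr)))
  mat_w[mat_w < 0] <- 0                        # negative weights become zeros
  drop <- if (mode == "larger") mat_thr > thr else mat_thr < thr
  adj <- mat_w
  adj[drop] <- 0                               # 0 means "no link" in this sketch
  diag(adj) <- 0
  adj
}

adj <- prune_by_threshold(mat_w, mat_thr, thr = 6000, mode = "larger")
n_links <- sum(adj[upper.tri(adj)] > 0)        # number of links kept
```

With these toy values, the two pairs separated by more than 6000 (A-D and B-D) are pruned and the four remaining links keep their weights from 'mat_w'.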

Value

A graph object of class igraph

Author(s)

P. Savary

Examples

mat_w <- mat_gen_dist(x = data_ex_genind, dist = 'DPS')
suppressWarnings(mat_thr <- mat_geo_dist(pts_pop_ex,
                 ID = "ID",
                 x = "x",
                 y = "y"))
mat_thr <- mat_thr[row.names(mat_w), colnames(mat_w)]
graph <- gen_graph_thr(mat_w, mat_thr, thr = 6000, mode = "larger")

Create a graph of genetic differentiation with a specific topology

Description

The function constructs a genetic graph with a specific topology from genetic and/or geographical distance matrices

Usage

gen_graph_topo(mat_w, mat_topo = NULL, topo = "gabriel", k = NULL)

Arguments

mat_w

A symmetric (pairwise) matrix or a dist object whose elements will be the links' weights

mat_topo

(optional) A symmetric (pairwise) distance matrix or a dist object whose values will be used for the pruning method.

topo

Which topology does the created graph have?

If topo = "gabriel" (default), the resulting graph will be a Gabriel graph (Gabriel and Sokal, 1969). It means that there is a link between nodes x and y if and only if d_{xy}^{2} \leq \min(d_{xz}^{2}+d_{yz}^{2}), with z any other node of the graph.
If topo = "mst", the resulting graph will have the topology of a minimum spanning tree. It means that the graph will not include any cycle (tree) and it will be the subgraph with a tree topology with the minimum total links' weight (based on 'mat_topo' values).
If topo = "percol", if the link of the resulting graph with the minimum weight is removed, then the graph breaks into two components.
If topo = "comp", a complete graph whose links are weighted with values from 'mat_w' is created.
If topo = "knn", a k-nearest neighbor graph whose links are weighted with values from 'mat_w' is created. If the distance between node i and node j is among the k smallest distances between node i and the other nodes according to distances in matrix 'mat_topo', then there is a link between i and j in the resulting graph. Therefore, a node can be connected to more than k nodes because the nearest node to node j is not necessarily among the k nearest neighbors to node i. Let d1 be the smallest distance from node i to other nodes. If there are k nodes or more at this distance from node i, they are all connected to i. If there are fewer than k nodes connected to i at a distance d1, then we consider nodes at a distance d2 from i. In the latter case, all the nodes at a distance d2 from i are connected to i.

k

(if topo = "knn") An integer which indicates the number of nearest neighbors considered to create the k-nearest neighbor graph. k must be lower than the total number of nodes minus 1.

Details

If 'mat_topo' is not defined, 'mat_w' is used for the pruning. Matrices 'mat_w' and 'mat_topo' must have the same dimensions and the same row and column names. Values in the 'mat_topo' matrix must be positive. Negative values from 'mat_w' are transformed into zeros. The function works only for undirected graphs. Note that the topology 'knn' works best when 'mat_topo' contains distance values from a continuous value range, thereby avoiding equal distances between a node and the others. With equal distances, there can be more than k nodes located at distances among the k smallest distances. If dist objects are specified, it is assumed that colnames and row.names of mat_w and mat_topo refer to the same populations/locations.
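As an illustration of the Gabriel graph topology, here is a minimal base R sketch of the standard Gabriel criterion in its squared form: nodes x and y are linked when d_xy^2 <= d_xz^2 + d_yz^2 for every other node z. It is only a sketch under stated assumptions, not the package's code: the helper `gabriel_adjacency()` and the coordinates are hypothetical, and the output is a logical adjacency matrix rather than an igraph object.

```r
# Made-up planar coordinates for five nodes (illustration only).
crds <- data.frame(ID = c("A", "B", "C", "D", "E"),
                   x = c(0, 4, 2, 2, 7),
                   y = c(0, 0, 1, 5, 4),
                   stringsAsFactors = FALSE)

# Squared Euclidean distances between all pairs of nodes.
d2 <- as.matrix(dist(crds[, c("x", "y")]))^2
dimnames(d2) <- list(crds$ID, crds$ID)

# Hypothetical helper (not part of graph4lg): logical adjacency matrix
# of the Gabriel graph computed from squared distances.
gabriel_adjacency <- function(d2) {
  n <- nrow(d2)
  adj <- matrix(FALSE, n, n, dimnames = dimnames(d2))
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      others <- setdiff(seq_len(n), c(i, j))
      # Link i-j only if no third node z lies inside the circle
      # whose diameter is the segment between i and j.
      keep <- all(d2[i, j] <= d2[i, others] + d2[j, others])
      adj[i, j] <- adj[j, i] <- keep
    }
  }
  adj
}

adj <- gabriel_adjacency(d2)
n_links <- sum(adj[upper.tri(adj)])
```

In this toy layout node C sits between A and B, so the link A-B is pruned while A-C and B-C are kept. The resulting adjacency matrix could then be combined with weights taken from a separate matrix, in the spirit of 'mat_w' above.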

Value

A graph object of class igraph

Author(s)

P. Savary

References

Gabriel KR, Sokal RR (1969). “A new statistical approach to geographic variation analysis.” Systematic zoology, 18(3), 259–278.

Examples

mat_w <- mat_gen_dist(x = data_ex_genind, dist = 'DPS')
suppressWarnings(mat_topo <- mat_geo_dist(pts_pop_ex,
                 ID = "ID",
                 x = "x",
                 y = "y"))
mat_topo <- mat_topo[row.names(mat_w), colnames(mat_w)]
graph <- gen_graph_topo(mat_w, mat_topo, topo = "mst")

Convert a GENEPOP file into a genind object

Description

The function converts a text file in the format used by GENEPOP software into a genind object

Usage

genepop_to_genind(path, n.loci, pop_names = NULL, allele.digit.coding = 3)

Arguments

path

A character string with the path leading to the GENEPOP file in format .txt, or alternatively the name of this file in the working directory.

n.loci

The number of loci in the GENEPOP file (integer or numeric).

pop_names

(optional) Populations' names in the same order as in the GENEPOP file. Vector object (class character) of the same length as the number of populations. Without this parameter, populations are numbered from 1 to the number of populations.

allele.digit.coding

Number indicating whether alleles are coded with 3 (default) or 2 digits.

Details

This function uses functions from the pegas package. The GENEPOP file can include microsatellite loci or SNPs with allele names of length 2 or 3 (noted as 01, 02, 03 or 04 for SNPs). The loci line(s) must not start with a space.

Value

An object of type genind.

Author(s)

P. Savary

References

Raymond M (1995). “GENEPOP: Population genetics software for exact tests and ecumenism. Vers. 1.2.” Journal of Heredity, 86, 248–249.

Examples

path_in <- system.file('extdata', 'gpop_simul_10_g100_04_20.txt',
                       package = 'graph4lg')
file_n <- file.path(tempdir(), "gpop_simul_10_g100_04_20.txt")
file.copy(path_in, file_n, overwrite = TRUE)
genepop_to_genind(path = file_n, n.loci = 20,
                  pop_names = as.character(order(as.character(1:10))))
file.remove(file_n)

Convert a genind object into a GENEPOP file

Description

The function converts an object of class genind into a GENEPOP file. It then makes it possible to use the functionalities of the GENEPOP software and its derived package GENEPOP on R, as well as some functions from other packages (differentiation test, F-stats calculations, HWE test, ...). It is designed to be used with diploid microsatellite data with alleles coded with 2 or 3 digits or SNPs genind objects.

Usage

genind_to_genepop(x, output = "data.frame")

Arguments

x

An object of class genind from package adegenet.

output

A character string indicating the option used to select what the function will return:

If output = "data.frame" (default), then the function will return an object 'x' of class data.frame ready to be saved as a text file with the following command: write.table(x, file = "file_name.txt", quote = FALSE, row.names = FALSE, col.names = FALSE)
If output = "path_to_file/file_name.txt", then the function will write a text file named 'file_name.txt' in the directory corresponding to 'path_to_file'. Without 'path_to_file', the text file is written in the current working directory. The text file has the format required by GENEPOP software.

Value

An object of type data.frame if output = "data.frame". If output is the path and/or the file name of a text file, then nothing is returned in the R environment but a text file is created with the specified file name, either in the current working directory or in the specified folder.

Warning

Confusion

Do not confuse this function with genind2genpop from adegenet. The latter converts an object of class genind into an object of class genpop, whereas genind_to_genepop converts an object of class genind into a text file compatible with GENEPOP software (Rousset, 2008).

Allele coding

This function can handle genetic data with different allele coding: 2 or 3 digit coding for microsatellite data or 2 digit coding for SNPs (A, C, T, G become respectively 01, 02, 03, 04).

Individuals order

When individuals in input data are not ordered by populations, individuals from the same population can be separated by individuals from other populations. It can be problematic when subsequently calculating pairwise distance matrices. Therefore, in such a case, individuals are ordered by populations and populations are ordered in alphabetic order.

Author(s)

P. Savary

References

Raymond M (1995). “GENEPOP: Population genetics software for exact tests and ecumenism. Vers. 1.2.” Journal of Heredity, 86, 248–249.

Examples

data(data_ex_genind)
x <- data_ex_genind
df_genepop <- suppressWarnings(genind_to_genepop(x,
                                                 output = "data.frame"))

Convert a genind object into a STRUCTURE file

Description

The function converts an object of class genind into a STRUCTURE file. It is designed to be used with diploid microsatellite data with alleles coded with 2 or 3 digits or SNPs genind objects.

Usage

genind_to_structure(x, output = "")

Arguments

x

An object of class genind from package adegenet.

output

A character string of the form output = "path_to_file/file_name.txt". Then, the function will write a text file named 'file_name.txt' in the directory corresponding to 'path_to_file'. Without 'path_to_file', the text file is written in the current working directory. The text file has the format required by STRUCTURE software.

Value

If output is the path and/or the file name of a text file, then nothing is returned in the R environment but a text file is created with the specified file name, either in the current working directory or in the specified folder.

Warning

Allele coding

This function can handle genetic data with different allele coding: 2 or 3 digit coding for microsatellite data or 2 digit coding for SNPs (A, C, T, G become respectively 01, 02, 03, 04).

Individuals order

Author(s)

P. Savary

Examples

data(data_ex_genind)
x <- data_ex_genind
genind_to_structure(x,
                    output = tempfile(fileext = ".txt"))

Download Graphab if not present on the user's machine

Description

The function checks for the presence of Graphab (.jar) on the user's machine and downloads it if absent. It also checks that users have installed Java on their machine.

Usage

get_graphab(res = TRUE, return = FALSE)

Arguments

res

Logical indicating whether a message says if Graphab has been downloaded or not.

return

Logical indicating whether the function returns a 1 or a 0 to indicate if Graphab has been downloaded or not.

Details

If the download does not work, you can create a directory named 'graph4lg_jar' in the directory rappdirs::user_data_dir() and copy into it the Graphab software downloaded from https://thema.univ-fcomte.fr/productions/download.php?name=graphab&version=2.8&username=Graph4lg&institution=R

Value

If res = TRUE, the function displays a message indicating to users what has been done. If return = TRUE, it returns a 0 if Graphab is already on the machine and a 1 if it has been downloaded.

Author(s)

P. Savary

Examples

## Not run: 
get_graphab()
## End(Not run)

Get linkset computed in the Graphab project

Description

The function gets a linkset computed in the Graphab project

Usage

get_graphab_linkset(proj_name, linkset, proj_path = NULL)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is.

linkset

A character string indicating the name of the link set whose properties are imported. The link set has been created with Graphab or using the graphab_link function.

proj_path

(optional) A character string indicating the path to the directory that contains the project directory. It should be used when the project directory is not in the current working directory. Default is NULL. When 'proj_path = NULL', the project directory is equal to getwd().

Details

See more information in Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf. This function works if the get_graphab function works correctly.

Value

A data.frame with the link properties (from, to, cost-distance, Euclidean distance)

Author(s)

P. Savary

Examples

## Not run: 
get_graphab_linkset(proj_name = "grphb_ex",
               linkset = "lkst1")
## End(Not run)

Get cost values associated with a linkset in a Graphab project

Description

The function extracts the cost values associated with a linkset in a Graphab project

Usage

get_graphab_linkset_cost(proj_name, linkset, proj_path = NULL)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml will be created.

linkset

(optional, default=NULL) A character string indicating the name of the link set used to create the graph. Link sets can be created with graphab_link.

proj_path

Value

The function returns a data.frame with the cost values corresponding to every raster code value.

Author(s)

P. Savary

Examples

## Not run: 
proj_name <- "grphb_ex"
get_graphab_linkset_cost(proj_name = proj_name,
               linkset = "lkst1")
## End(Not run)

Get metrics computed at the node level in the Graphab project

Description

The function gets the metrics computed at the node level in the Graphab project

Usage

get_graphab_metric(proj_name, proj_path = NULL)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is.

proj_path

Details

The imported metrics describe the patches and have been computed from the different graphs created in the Graphab project. See more information in Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf

Value

A data.frame with metrics computed at the patch level.

Author(s)

P. Savary

Examples

## Not run: 
get_graphab_metric(proj_name = "grphb_ex")
## End(Not run)

Get unique raster codes from a Graphab project

Description

The function extracts unique raster codes from a Graphab project

Usage

get_graphab_raster_codes(proj_name, mode = "all", proj_path = NULL)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml will be created.

mode

A character string equal to either 'all' (default) or 'habitat' indicating whether the returned codes are all the codes of the source raster used for creating the project or only the code corresponding to the habitat patches.

proj_path

Value

The function returns a vector of integer values corresponding to the source raster codes (all the codes or only the one corresponding to habitat patches).

Author(s)

P. Savary

Examples

## Not run: 
proj_name <- "grphb_ex"
get_graphab_raster_codes(proj_name = proj_name,
               mode = "all")
## End(Not run)

Compute the Gini coefficient from a numeric vector

Description

The function computes the Gini coefficient from a numeric vector

Usage

gini_coeff(x, unbiased = TRUE)

Arguments

x

A numeric vector with positive values

unbiased

A logical value indicating whether the computed coefficient is biased or not. Unbiased values are equal to n/(n-1) times the biased ones.

Value

A numeric value corresponding to the Gini coefficient of the numeric vector
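For illustration, the relationship between the biased and unbiased values can be sketched in base R. The helper `gini_sketch()` is hypothetical and assumes the usual mean-absolute-difference definition of the Gini coefficient; the package may compute it differently, although the n/(n-1) correction follows the description of the 'unbiased' argument.

```r
# Hypothetical helper (not the package's gini_coeff): Gini coefficient
# based on the mean absolute difference between all pairs of values.
gini_sketch <- function(x, unbiased = TRUE) {
  stopifnot(is.numeric(x), length(x) > 1, all(x >= 0), sum(x) > 0)
  n <- length(x)
  # Sum of |x_i - x_j| over all ordered pairs, scaled by 2 * n^2 * mean(x).
  g <- sum(abs(outer(x, x, "-"))) / (2 * n^2 * mean(x))
  # Small-sample correction: unbiased value = n / (n - 1) * biased value.
  if (unbiased) g <- g * n / (n - 1)
  g
}

x <- c(10, 2, 5, 15)
g_biased <- gini_sketch(x, unbiased = FALSE)
g_unbiased <- gini_sketch(x, unbiased = TRUE)
```

For this vector the biased value is 88/256 = 0.34375, and the unbiased value is 4/3 times that.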

Author(s)

P. Savary

Examples

x <- c(10, 2, 5, 15)
gini <- gini_coeff(x)

Compare the partition into modules of two graphs

Description

The function computes the Adjusted Rand Index (ARI) to compare two graphs' partitions into modules or clusters more generally. Both graphs must have the same number of nodes, but not necessarily the same number of links. They must also have the same node names and in the same order.

Usage

graph_modul_compar(
  x,
  y,
  mode = "graph",
  nb_modul = NULL,
  algo = "fast_greedy",
  node_inter = "distance",
  data = NULL
)

Arguments

x

The first graph object

If mode = 'graph' (default), x is a graph object of class igraph. Then, its nodes must have the same names as in graph y.
If mode = 'data.frame', x refers to a column of the data.frame 'data'. Then x must be a character string indicating the name of the column of 'data' with the modules' labels of the nodes in the first graph. In that case, the column can be of class numeric, character or factor but will be converted into a numeric vector in any case.
If mode = 'vector', x is a vector of class character, factor or numeric. In that case, it must have the same length as vector y and will be converted into a numeric vector.

y

The second graph object. Same classes possible as for x. Must be of the same format as x.

mode

A character string indicating whether x and y are igraph objects, vectors or columns from a data.frame. mode can be 'graph', 'data.frame' or 'vector'.

nb_modul

(if x and y are igraph objects) A numeric or integer value or a numeric vector with 2 elements indicating the number of modules to create in both graphs.

If nb_modul is a numeric value, then the same number of modules are created in both graphs.
If nb_modul is a numeric vector of length 2, then the numbers of modules created in graphs x and y are the first and second elements of nb_modul, respectively.

algo

(if x and y are igraph objects) A character string indicating the algorithm used to create the modules with igraph.

If algo = 'fast_greedy' (default), function cluster_fast_greedy from igraph is used (Clauset et al., 2004).
If algo = 'walktrap', function cluster_walktrap from igraph is used (Pons et Latapy, 2006) with 4 steps (default options).
If algo = 'louvain', function cluster_louvain from igraph is used (Blondel et al., 2008). In that case, the number of modules created in each graph is imposed.
If algo = 'optimal', function cluster_optimal from igraph is used (Brandes et al., 2008) (can be very long). In that case, the number of modules created in each graph is imposed.

node_inter

(optional, if x and y are igraph objects, default is 'distance') A character string indicating whether the links of the graph are weighted by distances or by similarity indices. It is only used to compute the modularity index. It can be:

'distance': Link weights correspond to distances. Nodes that are close to each other will more likely be in the same module.
'similarity': Link weights correspond to similarity indices. Nodes that are similar to each other will more likely be in the same module. Inverse link weights are then used to compute the modularity index.
'none': Links are not weighted for the computation, which is only based on graph topology.

Two different weightings can be used to create the modules of the two graphs.

If node_inter is a character string, then the same link weighting is used for both graphs.
If node_inter is a character vector of length 2, then the link weighting used by the algorithm to create the modules of graphs x and y is determined by the first and second elements of node_inter, respectively.

data

(if x and y are columns from a data.frame) An object of class data.frame with at least two columns and as many rows as there are nodes in the graphs compared. The columns indicate the modules of each node in 2 different classifications.

Details

This index takes values between -1 and 1. It measures how often pairs of nodes pertaining to the same module in one graph also pertain to the same module in the other graph. Therefore, large values indicate that both partitions are similar.
The Rand Index can be defined as the frequency of agreement between two classifications into discrete classes. It is the number of times a pair of elements are classified into the same class or in two different classes in both compared classifications, divided by the total number of possible pairs of elements. The Rand Index is between 0 and 1 but its maximum value depends on the number of elements. Thus, another 'adjusted' index was created, the Adjusted Rand Index. According to Hubert and Arabie's formula, the ARI is computed as follows:
ARI = \frac{\text{Index} - \text{Expected index}}{\text{Maximum index} - \text{Expected index}}
where the values of Index, Expected index and Maximum index are computed from a contingency table.
This function uses adjustedRandIndex from package mclust, which applies Hubert and Arabie's formula for the ARI. This function works for undirected graphs only.
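The three terms of the ARI formula can be sketched in base R from the contingency table of two partitions. This is a minimal illustration of Hubert and Arabie's formula, not the mclust implementation: the helper `ari_sketch()` and the module labels are hypothetical, and the sketch returns NaN in the degenerate case where Maximum index equals Expected index.

```r
# Hypothetical helper: Adjusted Rand Index from two vectors of module labels.
ari_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  tab <- table(a, b)                               # contingency table
  n_pairs <- function(v) sum(choose(as.vector(v), 2))
  index <- n_pairs(tab)                            # pairs grouped together in both
  row_pairs <- n_pairs(rowSums(tab))               # pairs grouped together in a
  col_pairs <- n_pairs(colSums(tab))               # pairs grouped together in b
  expected <- row_pairs * col_pairs / choose(length(a), 2)
  maximum <- (row_pairs + col_pairs) / 2
  (index - expected) / (maximum - expected)
}

# Made-up module labels for six nodes listed in the same order.
modules_x <- c(1, 1, 1, 2, 2, 2)
modules_y <- c("a", "a", "b", "b", "c", "c")
ari_xy <- ari_sketch(modules_x, modules_y)

# Identical partitions give 1, whatever the labels used.
ari_same <- ari_sketch(c(1, 1, 2, 2, 3, 3), c("a", "a", "b", "b", "c", "c"))
```

In the first comparison, Index = 2, Expected index = 1.2 and Maximum index = 4.5, so the ARI is 0.8/3.3.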

Value

The value of the ARI

Author(s)

P. Savary

References

Dyer RJ, Nason JD (2004). “Population graphs: the graph theoretic shape of genetic structure.” Molecular ecology, 13(7), 1713–1727.
Hubert L, Arabie P (1985). “Comparing partitions.” Journal of classification, 2(1), 193–218.
Clauset A, Newman ME, Moore C (2004). “Finding community structure in very large networks.” Physical review E, 70(6).
Blondel VD, Guillaume J, Lambiotte R, Lefebvre E (2008). “Fast unfolding of communities in large networks.” Journal of Statistical Mechanics - Theory and Experiment, 10.
Brandes U, Delling D, Gaertler M, Gorke R, Hoefer M, Nikoloski Z, Wagner D (2008). “On modularity clustering.” IEEE transactions on knowledge and data engineering, 20(2), 172–188.
Pons P, Latapy M (2006). “Computing communities in large networks using random walks.” J. Graph Algorithms Appl., 10(2), 191–218.

Examples

data(data_ex_genind)
data(pts_pop_ex)
mat_dist <- suppressWarnings(graph4lg::mat_geo_dist(data=pts_pop_ex,
      ID = "ID",
      x = "x",
      y = "y"))
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
                      order(as.character(colnames(mat_dist)))]
graph_obs <- gen_graph_thr(mat_w = mat_dist, mat_thr = mat_dist,
                            thr = 24000, mode = "larger")
mat_gen <- mat_gen_dist(x = data_ex_genind, dist = "DPS")
graph_pred <- gen_graph_topo(mat_w = mat_gen, mat_topo = mat_dist,
                            topo = "gabriel")
ARI <- graph_modul_compar(x = graph_obs, y = graph_pred)

Compare the local properties of the nodes from two graphs

Description

The function computes a correlation coefficient between the graph-theoretic metric values computed at the node level in two graphs sharing the same nodes. It makes it possible to assess whether the connectivity properties of the nodes in one graph are similar to those of the same nodes in the other graph. Alternatively, the correlation is computed between the values of a graph-theoretic metric and the values of an attribute associated with the nodes of a graph.

Usage

graph_node_compar(
  x,
  y,
  metrics = c("siw", "siw"),
  method = "spearman",
  weight = TRUE,
  test = TRUE
)

Arguments

x

An object of class igraph. Its nodes must have the same names as in graph y.

y

An object of class igraph. Its nodes must have the same names as in graph x.

metrics

Two-element character vector specifying the graph-theoretic metrics computed at the node level in the graphs or the node attribute values to be correlated to these metrics. Graph-theoretic metrics can be:

Degree (metrics = c("deg", ...))
Closeness centrality index (metrics = c("close",...))
Betweenness centrality index (metrics = c("btw",...))
Strength (sum of the weights of the links connected to a node) (metrics = c("str",...))
Sum of the inverse weights of the links connected to a node (metrics = c("siw", ...), default)
Mean of the inverse weights of the links connected to a node (metrics = c("miw", ...))

Node attributes must have the same names as in the igraph object, and must refer to an attribute with numerical values. The vector metrics is composed of two character values. When a node attribute has the same name as a metric computable from the graph, node attributes are given priority.

method

A character string indicating which correlation coefficient is to be computed ("pearson", "kendall" or "spearman" (default)).

weight

test

Logical. Should significance testing be performed? (default = TRUE)

Details

The correlation coefficients between the metrics can be computed in different ways, as initial assumptions (e.g. linear relationship) are rarely verified. Pearson's r, Spearman's rho and Kendall's tau can be computed (from function cor). When x is similar to y, then the correlation is computed between two metrics characterizing the nodes of the same graph.
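To make the idea concrete, here is a minimal base R sketch that computes the "siw" metric (sum of the inverse weights of the links connected to a node) on two toy graphs sharing the same nodes, then correlates the two sets of node values with Spearman's rho. The helpers `weighted_adjacency()` and `siw()`, the link lists and the adjacency-matrix representation are assumptions made for illustration; the package itself works on igraph objects.

```r
nodes <- c("A", "B", "C", "D", "E")

# Hypothetical helper: symmetric weighted adjacency matrix (0 = no link).
weighted_adjacency <- function(nodes, links) {
  w <- matrix(0, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  for (k in seq_len(nrow(links))) {
    w[links$from[k], links$to[k]] <- links$weight[k]
    w[links$to[k], links$from[k]] <- links$weight[k]
  }
  w
}

# Two made-up undirected graphs on the same node set.
links_x <- data.frame(from = c("A", "B", "C", "D", "A"),
                      to = c("B", "C", "D", "E", "C"),
                      weight = c(2, 4, 1, 5, 10),
                      stringsAsFactors = FALSE)
links_y <- data.frame(from = c("A", "B", "C", "D", "B"),
                      to = c("B", "C", "D", "E", "D"),
                      weight = c(1, 2, 2, 4, 8),
                      stringsAsFactors = FALSE)
w_x <- weighted_adjacency(nodes, links_x)
w_y <- weighted_adjacency(nodes, links_y)

# Sum of the inverse weights of the links connected to each node.
siw <- function(w) rowSums(ifelse(w > 0, 1 / w, 0))
siw_x <- siw(w_x)
siw_y <- siw(w_y)

# Rank-based correlation; exact = FALSE avoids a warning when values are tied.
res <- cor.test(siw_x, siw_y, method = "spearman", exact = FALSE)
rho <- unname(res$estimate)
```

A rank-based coefficient such as Spearman's rho is a reasonable default here because node metrics are rarely linearly related across graphs.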

Value

A list summarizing the correlation analysis.

Author(s)

P. Savary

Examples

data(data_ex_genind)
data(pts_pop_ex)
mat_dist <- suppressWarnings(graph4lg::mat_geo_dist(data = pts_pop_ex,
      ID = "ID",
      x = "x",
      y = "y"))
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
                      order(as.character(colnames(mat_dist)))]
graph_obs <- gen_graph_thr(mat_w = mat_dist, mat_thr = mat_dist,
                           thr = 9500, mode = "larger")
mat_gen <- mat_gen_dist(x = data_ex_genind, dist = "DPS")
graph_pred <- gen_graph_topo(mat_w = mat_gen, mat_topo = mat_dist,
                            topo = "gabriel")
res_cor <- graph_node_compar(x = graph_obs, y = graph_pred,
                             metrics = c("siw", "siw"), method = "spearman",
                             test = TRUE, weight = TRUE)

Create a graph with a minimum planar graph topology

Description

The function constructs a graph with a minimum planar graph topology

Usage

graph_plan(crds, ID = NULL, x = NULL, y = NULL, weight = TRUE)

Arguments

crds

A data.frame with the spatial coordinates of the point set (the graph nodes). It must have three columns:

ID: A character string indicating the name of the points (graph nodes).
x: A numeric or integer indicating the longitude of the graph nodes.
y: A numeric or integer indicating the latitude of the graph nodes.

ID

A character string indicating the name of the column of crds with the point IDs

x

A character string indicating the name of the column of crds with the point longitude

y

A character string indicating the name of the column of crds with the point latitude

weight

A logical value indicating whether the links of the graph are weighted by Euclidean distances (TRUE) (default) or not (FALSE). When the graph links do not have weights in Euclidean distances, each link is given a weight of 1.

Details

A Delaunay triangulation is performed in order to get the planar graph.

Value

A planar graph of class igraph

Author(s)

P. Savary

Examples

data(pts_pop_ex)
g_plan <- graph_plan(crds = pts_pop_ex,
             ID = "ID",
             x = "x",
             y = "y")

Visualize the topological differences between two spatial graphs on a map

Description

The function makes it possible to compare two spatial graphs by plotting them and highlighting the topological similarities and differences between them. Both graphs should share the same nodes and cannot be directed graphs.

Usage

graph_plot_compar(x, y, crds)

Arguments

x

A graph object of class igraph. Its nodes must have the same names as in graph y.

y

A graph object of class igraph. Its nodes must have the same names as in graph x.

crds

A data.frame with the spatial coordinates of the graph nodes (both x and y). It must have three columns:

ID: Name of the graph nodes (character string). The names must be the same as the node names of the graphs of class igraph (igraph::V(graph)$name)
x: Longitude of the graph nodes (numeric or integer).
y: Latitude of the graph nodes (numeric or integer).

Details

The graphs x and y of class igraph must have node names (not necessarily in the same order as IDs in crds, given that a merge is performed).

Value

A ggplot2 object to plot

Author(s)

P. Savary

Examples

data(pts_pop_ex)
data(data_ex_genind)
mat_w <- mat_gen_dist(data_ex_genind, dist = "DPS")
mat_dist <- mat_geo_dist(data = pts_pop_ex,
                         ID = "ID",
                         x = "x",
                         y = "y")
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
                   order(as.character(colnames(mat_dist)))]
g1 <- gen_graph_topo(mat_w = mat_w, topo = "mst")
g2 <- gen_graph_topo(mat_w = mat_w, mat_topo = mat_dist, topo = "gabriel")
g <- graph_plot_compar(x = g1, y = g2,
                       crds = pts_pop_ex)

Convert a graph into an edge list data.frame

Description

The function converts a graph into an edge list data.frame

Usage

graph_to_df(graph, weight = TRUE)

Arguments

graph

A graph object of class igraph

weight

Logical. If TRUE (default), then the column 'link' of the output data.frame contains the weights of the links. If FALSE, it contains only 0 and 1.

Details

The 'graph' nodes must have names. Links must have weights if 'weight = TRUE'.

Value

An object of class data.frame with a link ID, the origin nodes ('from') and arrival nodes ('to') and the link value ('link') (weighted or binary)

Author(s)

P. Savary

Examples

data(pts_pop_ex)
suppressWarnings(mat_geo <- mat_geo_dist(pts_pop_ex,
                 ID = "ID",
                 x = "x",
                 y = "y"))
g1 <- gen_graph_thr(mat_w = mat_geo,
                    mat_thr = mat_geo,
                    thr = 20000)
g1_df <- graph_to_df(g1,
                     weight = TRUE)

Export a spatial graph to shapefile layers

Description

The function exports a spatial graph to shapefile layers.

Usage

graph_to_shp(
  graph,
  crds,
  mode = "both",
  crds_crs,
  layer,
  dir_path,
  metrics = FALSE
)

Arguments

graph

A graph object of class igraph

crds

(if mode = 'spatial') A data.frame with the spatial coordinates of the graph nodes. It must have three columns:

ID: Name of the graph nodes (will be converted into character string). The names must be the same as the node names of the graph object of class igraph (igraph::V(graph)$name)
x: Longitude (numeric or integer) of the graph nodes in the coordinates reference system indicated with the argument crds_crs.
y: Latitude (numeric or integer) of the graph nodes in the coordinates reference system indicated with the argument crds_crs.

mode

Indicates which shapefile layers will be created

If mode = 'both' (default), then two shapefile layers are created, one for the nodes and another for the links.
If mode = 'node', a shapefile layer is created for the nodes only.
If mode = 'link', a shapefile layer is created for the links only.

crds_crs

An integer indicating the EPSG code of the coordinates reference system to use. The projection and datum are given in the PROJ.4 format.

layer

A character string indicating the suffix of the name of the layers to be created.

dir_path

A character string corresponding to the path to the directory in which the shapefile layers will be exported. If dir_path = "wd", then the layers are created in the current working directory.

metrics

(not considered if mode = 'link') Logical. Should graph node attributes be integrated in the attribute table of the node shapefile layer? (default: FALSE)

Value

Create shapefile layers in the directory specified with the parameter 'dir_path'.

Author(s)

P. Savary

Examples

## Not run: 
data(data_tuto)
mat_w <- data_tuto[[1]]
gp <- gen_graph_topo(mat_w = mat_w, topo = "gabriel")
crds_crs <- 2154
crds <- pts_pop_simul
layer <- "graph_dps_gab"
graph_to_shp(graph = gp, crds = pts_pop_simul, mode = "both",
             crds_crs = crds_crs,
             layer = "test_fonct",
             dir_path = tempdir(),
             metrics = FALSE)
## End(Not run)

Compute an index comparing graph topologies

Description

The function computes several indices in order to compare two graph topologies. One of the graphs has the "true" topology that the other is supposed to reproduce. The indices are then a way to assess the reliability of the latter graph. Both graphs must have the same number of nodes, but not necessarily the same number of links. They must also have the same node names and in the same order.

Usage

graph_topo_compar(obs_graph, pred_graph, mode = "mcc", directed = FALSE)

Arguments

obs_graph

A graph object of class igraph with n nodes. It is the observed graph that pred_graph is supposed to approach.

pred_graph

A graph object of class igraph with n nodes. It is the predicted graph that is supposed to be akin to obs_graph.

mode

A character string specifying which index to compute in order to compare the topologies of the graphs.

If mode = 'mcc' (default), the Matthews Correlation Coefficient (MCC) is computed.
If mode = 'kappa', the Kappa index is computed.
If mode = 'fdr', the False Discovery Rate (FDR) is computed.
If mode = 'acc', the Accuracy is computed.
If mode = 'sens', the Sensitivity is computed.
If mode = 'spec', the Specificity is computed.
If mode = 'prec', the Precision is computed.

directed

Logical (TRUE or FALSE) specifying whether both graphsare directed or not.

Details

The indices are calculated from a confusion matrix counting the number of links that are in the "observed" graph ("true") and also in the "predicted" graph (true positives: TP), that are in the "observed" graph but not in the "predicted" graph (false negatives: FN), that are not in the "observed" graph but are in the "predicted" graph (false positives: FP), and that are neither in the "observed" graph nor in the "predicted" graph (true negatives: TN). K is the total number of possible links in a graph. K is equal to $n \times (n-1)$ if the graphs are directed and to $\frac{n \times (n-1)}{2}$ if they are not directed, with n the number of nodes. OP = TP + FN, ON = TN + FP, PP = TP + FP and PN = FN + TN.

The Matthews Correlation Coefficient (MCC) is computed as follows:

$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$$

The Kappa index is computed as follows:

$$Kappa = \frac{K \times (TP + TN) - (ON \times PN) - (OP \times PP)}{K^{2} - (ON \times PN) - (OP \times PP)}$$

The False Discovery Rate (FDR) is calculated as follows:

$$FDR = \frac{FP}{TP+FP}$$

The Accuracy is calculated as follows:

$$Acc = \frac{TP + TN}{K}$$

The Sensitivity is calculated as follows:

$$Sens = \frac{TP}{TP+FN}$$

The Specificity is calculated as follows:

$$Spec = \frac{TN}{TN+FP}$$

The Precision is calculated as follows:

$$Prec = \frac{TP}{TP+FP}$$

Self loops are not taken into account.
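The indices above can be reproduced by hand on a toy case. The following base R sketch is not the package's implementation: the helper topo_indices and the two 4-node adjacency matrices are hypothetical, and both graphs are assumed to be undirected with nodes in the same order.

```r
# Hypothetical helper (not part of graph4lg): compares two undirected graphs
# given as symmetric 0/1 adjacency matrices with identical node order.
topo_indices <- function(obs, pred) {
  ut <- upper.tri(obs)               # each node pair once, self-loops excluded
  o <- obs[ut] == 1
  p <- pred[ut] == 1
  TP <- sum(o & p);  FN <- sum(o & !p)
  FP <- sum(!o & p); TN <- sum(!o & !p)
  K <- TP + FN + FP + TN             # equals n * (n - 1) / 2
  OP <- TP + FN; ON <- TN + FP; PP <- TP + FP; PN <- FN + TN
  c(mcc   = (TP * TN - FP * FN) / sqrt(PP * OP * ON * PN),
    kappa = (K * (TP + TN) - ON * PN - OP * PP) / (K^2 - ON * PN - OP * PP),
    fdr   = FP / PP,
    acc   = (TP + TN) / K,
    sens  = TP / OP,
    spec  = TN / ON,
    prec  = TP / PP)
}

# Toy graphs with 4 nodes (made-up data)
obs <- matrix(0, 4, 4)
pred <- matrix(0, 4, 4)
obs[cbind(c(1, 2, 3), c(2, 3, 4))] <- 1    # observed links: 1-2, 2-3, 3-4
pred[cbind(c(1, 2, 1), c(2, 3, 4))] <- 1   # predicted links: 1-2, 2-3, 1-4
obs <- obs + t(obs)
pred <- pred + t(pred)

# Here TP = 2, FN = 1, FP = 1, TN = 2 and K = 6
res <- topo_indices(obs, pred)
```

With real data, the same comparison is obtained from graph_topo_compar by passing the two igraph objects directly.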

Value

The value of the index computed

Author(s)

P. Savary

References

Dyer RJ, Nason JD (2004). “Population graphs: the graph theoretic shape of genetic structure.” Molecular Ecology, 13(7), 1713–1727.

Baldi P, Brunak S, Chauvin Y, Andersen CA, Nielsen H (2000). “Assessing the accuracy of prediction algorithms for classification: an overview.” Bioinformatics, 16(5), 412–424.

Matthews BW (1975). “Comparison of the predicted and observed secondary structure of T4 phage lysozyme.” Biochimica et Biophysica Acta (BBA)-Protein Structure, 405(2), 442–451.

Examples

data(data_ex_genind)
data(pts_pop_ex)
mat_dist <- suppressWarnings(graph4lg::mat_geo_dist(data = pts_pop_ex,
      ID = "ID",
      x = "x",
      y = "y"))
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
                     order(as.character(colnames(mat_dist)))]
graph_obs <- gen_graph_thr(mat_w = mat_dist, mat_thr = mat_dist,
                           thr = 15000, mode = "larger")
mat_gen <- mat_gen_dist(x = data_ex_genind, dist = "DPS")
graph_pred <- gen_graph_topo(mat_w = mat_gen, mat_topo = mat_dist,
                             topo = "gabriel")
graph_topo_compar(obs_graph = graph_obs,
                  pred_graph = graph_pred,
                  mode = "mcc",
                  directed = FALSE)

Computes custom capacities of patches in the Graphab project

Description

The function computes custom capacities of patches in the Graphab project.

Usage

graphab_capacity(
  proj_name,
  mode = "area",
  patch_codes = NULL,
  exp = NULL,
  ext_file = NULL,
  thr = NULL,
  linkset = NULL,
  codes = NULL,
  cost_conv = FALSE,
  weight = FALSE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is. It can be created with graphab_project.

mode

A character string indicating the way capacities are computed. It must be either:

mode='area' (default): The capacity of the patches is computed as the area of each habitat patch. The argument exp makes it possible to raise the area to a power given by an exponent.
mode='ext_file': The capacity of the patches is given by an external .csv file. See argument ext_file below.
mode='neigh': The capacity is computed depending on the neighbouring raster cells from each habitat patch. The number of cells with a value given by the codes argument is summed up to the distance thr. This number can be weighted according to the weight argument.

patch_codes

(optional, default=NULL) An integer value or vector specifying the codes corresponding to the habitat pixels whose corresponding patches are included to compute the capacity as the area of the habitat when mode='area'. Patches corresponding to other initial habitat codes are weighted by 0.

exp

An integer value specifying the power to which patch areas are raised when mode='area'. When not specified, exp=1 by default.

ext_file

A character string specifying the name of the .csv file in which patch capacities are stored. It must be located either in the working directory or in the directory defined by proj_path. It must have as many rows as there are patches in the project and its column names must include 'Id' and 'Capacity'. The 'Id' column must correspond to the patch ID in the 'patches' layer (see get_graphab_metric). The 'Capacity' column must contain the corresponding patch capacities to assign to each patch.

thr

(optional, default=NULL) An integer or numeric value indicating the maximum distance in cost distance units (except when cost_conv = TRUE) at which cells are considered for computing the capacity when mode='neigh'.

linkset

(optional, default=NULL) A character string indicating the name of the link set used to take distance into account when computing the capacity. Only used when mode='neigh'. Link sets can be created with graphab_link.

codes

An integer value or a vector of integer values specifying the codes of the raster cells taken into account when computing the capacity in the neighbourhood of the patches, when mode='neigh'.

cost_conv

FALSE (default) or TRUE. Logical indicating whether numeric thr values are converted from cost-distance into Euclidean distance using a log-log linear regression. See also the convert_cd function. Only used when mode='neigh'.

weight

A logical indicating whether the cells are weighted by a weight decreasing with the distance from the patches (TRUE) or not (FALSE). The weights follow a negative exponential decline such that wi = exp(-alpha*di), where wi is the weight of cell i, di its distance from the patch and alpha a parameter determined such that wi = 0.05 when di = thr.

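proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.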

Details

See more information in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf. Be careful when capacities have been changed: the last changes are taken into account for subsequent calculations in a project.
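The distance-decay weighting described for the weight argument can be illustrated with a few lines of base R. This is only a sketch of the formula wi = exp(-alpha*di): the threshold and the cell distances are made-up values, and the actual computation is carried out by Graphab.

```r
# Sketch of the weights applied when mode = 'neigh' and weight = TRUE
thr <- 500                     # hypothetical maximum distance
alpha <- -log(0.05) / thr      # chosen so that exp(-alpha * thr) equals 0.05
d <- c(0, 100, 250, 500)       # hypothetical distances from cells to the patch
w <- exp(-alpha * d)           # weight of each cell
weighted_count <- sum(w)       # weighted number of cells within thr
```

A cell adjacent to the patch (d = 0) counts fully, whereas a cell located at the distance thr only counts for 0.05.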

Author(s)

P. Savary

Examples

## Not run: 
graphab_capacity(proj_name = "grphb_ex",
                 mode = "area")
## End(Not run)

Computes corridors from least-cost paths already computed in the Graphab project

Description

The function computes corridors around the least-cost paths which have been computed in the Graphab project.

Usage

graphab_corridor(
  proj_name,
  graph,
  maxcost,
  format = "raster",
  cost_conv = FALSE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is. It can be created with graphab_project.

graph

A character string indicating the name of the graph with the links from which the corridors are computed. This graph has been created with Graphab or using the graphab_graph function and is associated with a link set. Only the links present in the graph will be used in the computation.

maxcost

An integer or numeric value indicating the maximum cost distance from the least-cost paths considered for creating the corridors, in cost distance units (except when cost_conv = TRUE).

format

(optional, default = "raster") A character string indicating whether the output is a raster file or a shapefile layer.

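cost_conv

FALSE (default) or TRUE. Logical indicating whether numeric maxcost values are converted from cost-distance into Euclidean distance using a log-log linear regression. See also the convert_cd function.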

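proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.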

Details

Author(s)

P. Savary

Examples

## Not run: 
graphab_corridor(proj_name = "grphb_ex",
                 graph = "graph",
                 maxcost = 1000,
                 format = "raster",
                 cost_conv = FALSE)
## End(Not run)

Create a graph in the Graphab project

Description

The function creates a graph from a link set in a Graphab project

Usage

graphab_graph(
  proj_name,
  linkset = NULL,
  name = NULL,
  thr = NULL,
  cost_conv = FALSE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is. It can be created with graphab_project.

linkset

(optional, default=NULL) A character string indicating the name of the link set used to create the graph. If linkset=NULL, every link set present in the project will be used to create a graph. Link sets can be created with graphab_link.

name

(optional, default=NULL) A character string indicating the name of the graph created. If name=NULL, a name will be created. If both linkset=NULL and name=NULL, then a graph will be created for every link set present in the project and a name will be created every time. In the latter case, a unique name cannot be specified. Link sets can be created with graphab_link.

thr

(optional, default=NULL) An integer or numeric value indicating the maximum distance associated with the links of the created graph. It allows users to create a pruned graph based on a distance threshold. Note that when the link set used has a planar topology, the graph is necessarily a pruned graph (not complete) and adding this threshold parameter can remove other links. When the link set has been created with cost-distances, the parameter is expressed in cost-distance units whereas when the link set is based upon Euclidean distances, the parameter is expressed in meters.

cost_conv

FALSE (default) or TRUE. Logical indicating whether numeric thr values are converted from cost-distance into Euclidean distance using a log-log linear regression. See also the convert_cd function.

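proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.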

Details

By default, intra-patch distances are considered for metric calculation. See more information in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf

Author(s)

P. Savary

Examples

## Not run: 
graphab_graph(proj_name = "grphb_ex",
              linkset = "lcp",
              name = "graph")
## End(Not run)

Creates a raster with interpolated connectivity metric values from metrics already computed in the Graphab project

Description

The function creates a raster with interpolated connectivity metric values from a metric already computed in the Graphab project.

Usage

graphab_interpol(
  proj_name,
  name,
  reso,
  linkset,
  graph,
  var,
  dist,
  prob = 0.05,
  thr = NULL,
  summed = FALSE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is. It can be created with graphab_project.

name

A character string indicating the name of the raster to be created after the interpolation.

reso

An integer indicating the spatial resolution in meters of the raster resulting from the metric interpolation.

linkset

A character string indicating the name of the link set used for the interpolation. It should be the one used to create the graph and the metric.

graph

A character string indicating the name of the graph from which the metric was computed and whose links are considered for a potential multi-linkage with patches. This graph has been created with Graphab or using the graphab_graph function and is associated with a link set.

var

A character string indicating the name of the already computed metric to be interpolated.

dist

A numeric or integer value specifying the distance at which we assume a probability equal to prob during the interpolation. It is used to set $\alpha$ for computing probabilities associated with distances between each pixel and the neighboring patch(es) such that the probability between patch i and pixel j is $p_{ij} = e^{-\alpha d_{ij}}$.

prob

A numeric or integer value specifying the probability at distance dist. By default, prob=0.05. It is used to set $\alpha$ (see param dist above).

thr

(default NULL) If NULL, the value of each pixel is computed from the value of the metric at the nearest habitat patch, weighted by a probability depending on distance. If an integer, the value of each pixel depends on the values of the metric taken at several of the nearest habitat patches, up to a distance (cost or Euclidean distance, depending on the type of linkset) equal to thr.

summed

Logical (default = FALSE) only used if thr is not NULL, and specifying whether multiple values are summed up (TRUE) or averaged after being weighted by probabilities.

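proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.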

Details
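When thr = NULL, the arguments dist and prob determine how fast the interpolated value decreases away from the nearest patch. The sketch below illustrates that case in base R under a simple reading of the argument descriptions (metric value at the nearest patch multiplied by the probability). The metric value and the pixel distances are hypothetical, and Graphab performs the actual interpolation.

```r
# Sketch of the interpolation with a single nearest patch (thr = NULL)
dist_ref <- 600                      # 'dist' value used in the Examples
prob <- 0.5                          # 'prob' value used in the Examples
alpha <- -log(prob) / dist_ref       # so that exp(-alpha * dist_ref) equals prob

metric_nearest <- 10                 # hypothetical metric value at the nearest patch
d_pixel <- c(0, 300, 600, 1200)      # hypothetical pixel-to-patch distances
p <- exp(-alpha * d_pixel)           # probability associated with each pixel
pixel_value <- metric_nearest * p    # interpolated value of each pixel
```

With these settings, the interpolated value is halved every 600 distance units.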

Author(s)

P. Savary

Examples

## Not run: 
graphab_interpol(proj_name = "grphb_ex",
                 name = "F_interp",
                 reso = 20,
                 linkset = "lcp",
                 graph = "graph",
                 var = "F_d600_p0.5_beta1_graph",
                 dist = 600,
                 prob = 0.5)
## End(Not run)

Create a link set in the Graphab project

Description

The function creates a link set between habitat patches in the Graphab project.

Usage

graphab_link(
  proj_name,
  distance = "cost",
  name,
  cost = NULL,
  topo = "planar",
  remcrosspath = FALSE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is. It can be created with graphab_project.

distance

A character string indicating whether links between patches are computed based on:

Shortest cost distances: distance='cost' (default)
Straight Euclidean distances: distance='euclid'

In the resulting link set, each link will be associated with its corresponding cost-distance and the length of the least-cost path in meters (if distance='cost') or with its length in Euclidean distance (if distance='euclid').

name

A character string indicating the name of the created link set.

cost

This argument can be:

A data.frame indicating the cost values associated with each raster cell value. These values refer to the raster used to create the project with graphab_project. The data.frame must have two columns:
- 'code': raster cell values
- 'cost': corresponding cost values
The path to an external raster file in .tif format with cost values.

topo

A character string indicating the topology of the created link set. It can be:

Planar (topo='planar' (default)): a planar set of links is created. It speeds up the computation but prevents the creation of complete graphs with graphab_graph.
Complete (topo='complete'): a complete set of links is created. A link is computed between every pair of patches.

remcrosspath

(optional, default = FALSE) A logical indicating whether links crossing patches are removed (TRUE).

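proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.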

Details

By default, links crossing patches are neither ignored nor broken into two links. For example, a link from patch A to patch C crossing patch B is created. It takes into account the distance inside patch B. This can be a problem when computing the BC index. See more information in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf

Author(s)

P. Savary, T. Rudolph

Examples

## Not run: 
df_cost <- data.frame(code = 1:5,
                      cost = c(1, 10, 100, 1000, 1))
graphab_link(proj_name = "grphb_ex",
             distance = "cost",
             name = "lcp",
             cost = df_cost,
             topo = "complete")
## End(Not run)

Compute connectivity metrics from a graph in the Graphab project

Description

The function computes connectivity metrics on a graph from a link set in a Graphab project.

Usage

graphab_metric(
  proj_name,
  graph,
  metric,
  multihab = FALSE,
  dist = NULL,
  prob = 0.05,
  beta = 1,
  cost_conv = FALSE,
  return_val = TRUE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is.

graph

A character string indicating the name of the graph on which the metric is computed. This graph has been created with Graphab or using the graphab_graph function and is associated with a link set. Only the links present in the graph and their corresponding weights will be used in the computation, together with patch areas.

metric

A character string indicating the metric which will be computed on the graph. This metric can be:

A global metric:
- Probability of Connectivity (metric = 'PC'): Sum of products of area of all pairs of patches weighted by their interaction probability, divided by the square of the area of the study zone. This ratio is equivalent to the probability that two points randomly placed in the study area are connected.
- Equivalent Connectivity (metric = 'EC'): Square root of the sum of products of capacity of all pairs of patches weighted by their interaction probability. This is the size of a single patch (maximally connected) that would provide the same probability of connectivity as the actual habitat pattern in the landscape (Saura et al., 2011).
- Integral Index of Connectivity (metric = 'IIC'): For the entire graph: product of patch areas divided by the number of links between them; the sum is divided by the square of the area of the study zone. IIC is built like the PC index but using the inverse of a topological distance rather than a negative exponential function of the distance based on the link weight.
A local metric:
- Flux (metric = 'F'): For the focal patch i: sum of area of patches other than i, weighted according to their minimum distance to the focal patch through the graph. This sum is an indicator of the potential dispersion from the patch i or, conversely, to the patch i.
- Betweenness Centrality index (metric = 'BC'): Sum of the shortest paths through the focal patch i; each path is weighted by the product of the areas of the patches connected and of their interaction probability. All possible paths between every pair of patches are considered in this computation.
- Interaction Flux (metric = 'IF'): Sum of products of the focal patch area with all the other patches, weighted by their interaction probability.
- Degree (metric = 'Dg'): Number of edges connected to the node i, i.e. number of patches connected directly to the patch i.
- Closeness Centrality index (metric = 'CCe'): Mean distance from the patch i to all other patches of its component k.
- Current Flux (metric = 'CF'): Sum of currents passing through the patch i. $c_{i}^{j}$ represents the current through the patch i when currents are sent from all patches (except j) to the patch j. The patch j is connected to the ground.
A delta metric:
- delta Probability of Connectivity (metric = 'dPC'): Rate of variation between the value of the PC index and the value of PC' corresponding to the removal of the patch i. The value of dPC is decomposed into three parts:
  - $dPC_{area}$ is the variation induced by the area lost after removal;
  - $dPC_{flux}$ is the variation induced by the loss of interaction between the patch i and other patches;
  - $dPC_{connector}$ is the variation induced by the modification of paths connecting other patches and initially routed through i.

For most metrics, the interaction probability is computed for each pair of patches from the path that minimizes the distance d (or the cost) between them. It then maximizes $e^{-\alpha d_{ij}}$ for patches i and j. To use patch capacity values different from the patch area, please use Graphab software directly.

multihab

A logical (default = FALSE) indicating whether the 'multihabitat' mode is used when computing the metric. It only applies to the following metrics: 'EC', 'F', 'IF' and 'BC'. If TRUE, then the project must have been created with the option nomerge=TRUE. It then returns several columns with metric values including the decomposition of the computation according to the type of habitat of every patch. Be careful, this option is in development and we cannot guarantee the results are correct.

dist

A numeric or integer value specifying the distance at which dispersal probability is equal to prob. This argument is mandatory for weighted metrics (PC, F, IF, BC, dPC, CCe, CF) but not used for others. It is used to set $\alpha$ for computing dispersal probabilities associated with all inter-patch distances such that dispersal probability between patches i and j is $p_{ij} = e^{-\alpha d_{ij}}$.

prob

A numeric or integer value specifying the dispersal probability at distance dist. By default, prob=0.05. It is used to set $\alpha$ (see param dist above).

beta

A numeric or integer value between 0 and 1 specifying the exponent associated with patch areas in the computation of metrics weighted by patch area. By default, beta=1. When beta=0, patch areas do not have any influence in the computation.

cost_conv

FALSE (default) or TRUE. Logical indicating whether numeric dist values are converted from cost-distance into Euclidean distance using a log-log linear regression. See also the convert_cd function.

return_val

Logical (default = TRUE) indicating whether metric values are returned in R (TRUE) or only stored in the patch attribute layer (FALSE).

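proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.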

Details

The metrics are described in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf. Graphab software makes it possible to compute other metrics. Be careful: when the same metric is computed several times, the option return_val=TRUE does not return the right columns. In these cases, use get_graphab_metric.
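The role of dist, prob and beta can be made concrete with a small base R sketch. It is not the Graphab implementation: the patch areas and the matrix of shortest-path distances are hypothetical, and the Flux formula used here (sum over the other patches of their area raised to beta, multiplied by the dispersal probability) is one reading of the description given for metric = 'F'.

```r
# Dispersal probabilities derived from 'dist' and 'prob'
dist_ref <- 1000                      # 'dist' argument
prob <- 0.05                          # 'prob' argument
beta <- 1                             # 'beta' argument
alpha <- -log(prob) / dist_ref        # so that exp(-alpha * dist_ref) equals prob

# Hypothetical shortest-path distances between 3 patches, and their areas
d <- matrix(c(   0,  500, 1500,
               500,    0, 1000,
              1500, 1000,    0), nrow = 3, byrow = TRUE)
area <- c(2, 5, 1)

p <- exp(-alpha * d)                  # p_ij = exp(-alpha * d_ij)

# Assumed Flux formula: F_i = sum over j != i of area_j^beta * p_ij
flux <- sapply(seq_along(area),
               function(i) sum(area[-i]^beta * p[i, -i]))
```

Two patches separated by exactly dist_ref get a dispersal probability equal to prob, which is the property the two arguments are meant to encode.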

Value

If return_val=TRUE, the function returns a data.frame with the computed metric values and the corresponding patch ID when the metric is a local or delta metric, or the numeric value of the global metric.

Author(s)

P. Savary

Examples

## Not run: 
graphab_metric(proj_name = "grphb_ex",
               graph = "graph",
               metric = "PC",
               dist = 1000,
               prob = 0.05,
               beta = 1)
## End(Not run)

Create modules from a graph in the Graphab project

Description

The function creates modules from a graph by maximising modularity.

Usage

graphab_modul(
  proj_name,
  graph,
  dist,
  prob = 0.05,
  beta = 1,
  nb = NULL,
  return = TRUE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is.

graph

A character string indicating the name of the graph on which the modularity index is computed. This graph has been created with Graphab or using the graphab_graph function and is associated with a link set. Only the links present in the graph and their corresponding weights will be used in the computation, together with patch areas.

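dist

A numeric or integer value specifying the distance at which dispersal probability is equal to prob. It is used to set $\alpha$ for computing dispersal probabilities associated with all inter-patch distances such that dispersal probability between patches i and j is $p_{ij} = e^{-\alpha d_{ij}}$.

prob

A numeric or integer value specifying the dispersal probability at distance dist. By default, prob=0.05. It is used to set $\alpha$ (see param dist above).

beta

A numeric or integer value between 0 and 1 specifying the exponent associated with patch areas in the computation. By default, beta=1. When beta=0, patch areas do not have any influence in the computation.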

nb

(optional, default=NULL) An integer or numeric value indicating the number of modules to be created. By default, it is the number that maximises the modularity index.

return

Logical (default=TRUE) indicating whether results are returned to the user.

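proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.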

Details

This function maximises a modularity index by searching for the node partition that involves a large number of links within modules and a small number of inter-module links. Each link is given a weight in the computation, such that the weight $w_{ij}$ of the link between patches i and j is:

$$w_{ij} = (a_{i} a_{j})^\beta e^{-\alpha d_{ij}}$$

This function does not allow users to automatically convert Euclidean distances into cost-distances. See more information in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf
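The link weight formula can be evaluated for all pairs of patches at once. The following base R sketch uses hypothetical patch areas and link distances. It only illustrates the weights and does not perform the modularity maximisation, which is done by Graphab.

```r
# Link weights w_ij = (a_i * a_j)^beta * exp(-alpha * d_ij)
dist_ref <- 1000                      # 'dist' argument
prob <- 0.05                          # 'prob' argument
beta <- 1                             # 'beta' argument
alpha <- -log(prob) / dist_ref

area <- c(2, 5, 1)                    # hypothetical patch areas a_i
d <- matrix(c(   0,  500, 1500,
               500,    0, 1000,
              1500, 1000,    0), nrow = 3, byrow = TRUE)  # hypothetical d_ij

w <- outer(area, area)^beta * exp(-alpha * d)
diag(w) <- 0                          # a patch has no link with itself
```

Setting beta = 0 in this sketch makes the weights depend on distances only, consistent with the description of the beta argument.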

Value

If return=TRUE, the function returns a message indicating whether the partition has been done. New options are being developed.

Author(s)

P. Savary

Examples

## Not run: 
graphab_modul(proj_name = "grphb_ex",
              graph = "graph",
              dist = 1000,
              prob = 0.05,
              beta = 1)
## End(Not run)

Add a point set to the Graphab project

Description

The function adds a spatial point set to the Graphab project, allowing users to identify the closest habitat patch to each point and get the corresponding connectivity metrics.

Usage

graphab_pointset(
  proj_name,
  linkset,
  pointset,
  id = "ID",
  return_val = TRUE,
  proj_path = NULL,
  alloc_ram = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is.

linkset

A character string indicating the name of the link set used. The link set is here used to get the defined cost values and compute the distance from the point to the patches. Link sets can be created with graphab_link.

pointset

Can be either:

A character string indicating the path (absolute or relative) to a shapefile point layer
A character string indicating the path to a .csv file with three columns: ID, x and y, respectively indicating the point ID, longitude and latitude.
A data.frame with three columns: ID, x and y, respectively indicating the point ID, longitude and latitude.
A SpatialPointsDataFrame

The point ID column must be 'ID' by default but can also be specified by the id argument in all cases.

id

A character string indicating the name of the column in either the .csv table, data.frame or attribute table, corresponding to the ID of the points. By default, it should be 'ID'. This column is used for naming the points when returning the output.

return_val

Logical (default=TRUE) indicating whether the metrics associated with the closest habitat patches to the points are returned to users.

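proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.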

Details

Point coordinates must be in the same coordinate reference system as the habitat patches (and initial raster layer). See more information in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf
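A point set in the expected tabular form can be prepared as follows. This is a minimal sketch: the point IDs, the coordinates and the file name are made up, and the coordinates are assumed to be expressed in the same projected coordinate reference system as the project raster.

```r
# Hypothetical point set with the three expected columns: ID, x and y
pts <- data.frame(ID = c("pt1", "pt2", "pt3"),
                  x = c(650000, 651200, 652500),
                  y = c(6860000, 6861500, 6859800))

# The same table written to a .csv file, the other accepted tabular input
csv_path <- file.path(tempdir(), "pts.csv")
write.csv(pts, csv_path, row.names = FALSE)
pts_check <- read.csv(csv_path)

# Either object could then be passed as 'pointset' (requires a Graphab project):
# graphab_pointset(proj_name = "grphb_ex", linkset = "lcp", pointset = pts)
```

If the ID column has another name, it can be declared with the id argument instead of renaming the column.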

Value

If return_val=TRUE, the function returns a data.frame with the properties of the nearest patch to every point in the point set, as well as the distance from each point to the nearest patch.

Author(s)

P. Savary

Examples

## Not run: 
graphab_pointset(proj_name = "grphb_ex",
                 linkset = "lcp",
                 pointset = "pts.shp")
## End(Not run)

Create a Graphab project

Description

The function creates a Graphab project from a raster file on which habitat patches can be delimited.

Usage

graphab_project(
  proj_name,
  raster,
  habitat,
  nomerge = FALSE,
  minarea = 0,
  nodata = NULL,
  maxsize = NULL,
  con8 = FALSE,
  alloc_ram = NULL,
  proj_path = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml will be created.

raster

A character string indicating the name of the .tif raster file or of its path. If the path is not specified, the raster must be present in the current working directory. Raster cell values must be in INT2S encoding.

habitat

An integer or numeric value or vector indicating the code(s) (cell value(s)) of the habitat cells in the raster file.

nomerge

(optional, default=FALSE) A logical indicating whether contiguous patches corresponding to different pixel codes are merged (FALSE, default) or not merged (TRUE). Be careful, the nomerge = TRUE option is in development and we cannot guarantee the results are correct.

minarea

(optional, default=0) An integer or numeric value specifying the minimum area in hectares for a habitat patch to become a graph node.

nodata

(optional, default=NULL) An integer or numeric value specifying the code in the raster file associated with nodata value (often corresponding to peripheral cells).

maxsize

(optional, default=NULL) An integer or numeric value specifying the maximum side length of the rectangular full extent of each habitat patch in metric units. If this side length exceeds maxsize m, then several patches are created.

con8

(optional, default=FALSE) A logical indicating whether a neighborhood of 8 pixels (TRUE) is used for patch definition. By default, con8=FALSE, corresponding to a 4-pixel neighborhood.

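alloc_ram

(optional, default = NULL) Integer or numeric value indicating RAM gigabytes allocated to the java process. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.

proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name').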

Details

A habitat patch consists of the central pixel with its eight neighbors if they are of the same value (8-connexity) and the path geometry is not simplified. See more information in the Graphab 2.8 manual: https://sourcesup.renater.fr/www/graphab/download/manual-2.8-en.pdf

Author(s)

P. Savary, T. Rudolph

Examples

## Not run: 
proj_name <- "grphb_ex"
raster <- "rast_ex.tif"
habitat <- 5
graphab_project(proj_name = proj_name,
                raster = raster,
                habitat = habitat)
## End(Not run)

Describe the objects of a Graphab project

Description

The function describes the objects of a Graphab project.

Usage

graphab_project_desc(
  proj_name,
  mode = "patches",
  linkset = NULL,
  proj_path = NULL,
  fig = FALSE,
  return_val = TRUE
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is.

mode

A character string indicating the objects of the project that are described. It must be either:

mode='patches' (default): The habitat patches are described with synthetic descriptors (code, number, mean capacity, median capacity, capacity harmonic mean, capacity Gini coefficient) and a histogram of capacity distribution.
mode='linkset': The links of a link set are described with synthetic descriptors (codes, costs, number, mean cost distance, median cost distance, cost distance harmonic mean, cost distance Gini coefficient) and a histogram of cost distance distribution.
mode='both': Both the patches and links of a link set are described.

linkset

A character string indicating the name of the link set whose properties are imported. The link set has been created with Graphab or using the graphab_link function.

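proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.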

fig

Logical (default = FALSE) indicating whether to plot a figure of the resulting spatial graph. The figure is plotted using the function plot_graph_lg. The plotting can be long if the graph has many nodes and links.

return_val

Logical (default = TRUE) indicating whether the project features are returned as a list (TRUE) or only displayed in the R console (FALSE).
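Two of the synthetic descriptors listed for mode='patches', the harmonic mean and the Gini coefficient, can be illustrated in base R. The capacity values below are made up, and the Gini coefficient is computed with the common mean-absolute-difference definition, which is an assumption: the estimator used by the package may differ, for instance by a small-sample correction.

```r
# Hypothetical patch capacities
capacity <- c(1, 2, 3, 4)
n <- length(capacity)

# Harmonic mean: n divided by the sum of the reciprocals
harm_mean <- n / sum(1 / capacity)

# Gini coefficient (assumed definition): mean absolute difference between
# all ordered pairs of values, divided by twice the mean
gini <- sum(abs(outer(capacity, capacity, "-"))) / (2 * n^2 * mean(capacity))
```

A Gini coefficient close to 0 indicates patches of similar capacity, while larger values indicate that a few patches hold most of the total capacity.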

Author(s)

P. Savary

Examples

## Not run: 
graphab_project_desc(proj_name = "grphb_ex",
                     mode = "patches",
                     fig = FALSE)
## End(Not run)

Create landscape graphs from Graphab link set

Description

The function creates a landscape graph from a link set created with Graphab software or different functions of this package and converts it into a graph object of class igraph. The graph has weighted links and is undirected. Node attributes present in the Graphab project are included, including connectivity metrics when computed.

Usage

graphab_to_igraph(
  proj_name,
  linkset,
  nodes = "patches",
  weight = "cost",
  proj_path = NULL,
  fig = FALSE,
  crds = FALSE
)

Arguments

proj_name

A character string indicating the project name. It is alsothe name of the directory in which proj_name.xml file is found. By default,'proj_name' is searched into the current working directory

linkset

A character string indicating the name of the linkset used tocreate the graph links. The linkset must have been created previously (seethe functiongraphab_link). It can be complete or planar. Thegraph is given the topology of the selected link set.

nodes

A character string indicating whether the nodes of the created graph are given all the attributes or metrics computed in Graphab or only those specific to a given graph previously created with graphab_graph. It can be:

nodes = "patches" (default): all the attributes and metrics of the habitat patches are included as node attributes in the igraph object.
nodes = "graph_name": only the metrics of the habitat patches computed from the graph 'graph_name' created with graphab_graph are included as node attributes in the igraph object, along with some basic patch attributes.

weight

A character string ("euclid" or "cost") indicating whether to weight the links with Euclidean distance or cost-distance (default) values.

proj_path

(optional) A character string indicating the path to the directory that contains the project directory ('proj_name'). By default, 'proj_name' is searched for in the current working directory.

fig

crds

Logical (default = FALSE) indicating whether to create an object of class data.frame with the node centroid spatial coordinates. Such a data.frame has 3 columns: 'ID', 'x', 'y'.

Value

A graph object of class igraph (if crds = FALSE) or a list of objects: a graph object of class igraph and a data.frame with the node spatial coordinates (if crds = TRUE).

Author(s)

P. Savary

References

Foltête J, Clauzel C, Vuidel G (2012). “A software tool dedicated to the modelling of landscape networks.” Environmental Modelling & Software, 38, 316–327.

Examples

## Not run:
proj_path <- system.file('extdata', package = 'graph4lg')
proj_name <- "grphb_ex"
linkset <- "lkst1"
nodes <- "graph"
graph <- graphab_to_igraph(proj_name = proj_name,
                           linkset = "lkst1",
                           nodes = "graph",
                           weight = "cost",
                           proj_path = proj_path,
                           crds = FALSE,
                           fig = FALSE)
## End(Not run)

Convert a file from gstudio or popgraph into a genind object

Description

The function converts a file formatted for use with the gstudio or popgraph package into a genind object (adegenet package).

Usage

gstud_to_genind(x, pop_col, ind_col = NULL)

Arguments

x

An object of class data.frame with loci columns in format locus (defined in package gstudio), with as many rows as individuals, as many columns in format locus as there are loci, and additional columns.

pop_col

A character string indicating the name of the column with the populations' names in x

ind_col

(optional) A character string indicating the name of the column with the individuals' IDs in x

Details

This function uses functions from the pegas package. It can handle genetic data where allele codings do not have the same length (99:101, for example). If the names of the loci include '.' characters, they will be replaced by '_'.

Value

An object of class genind.

Author(s)

P. Savary

Examples

data("data_ex_gstud")
x <- data_ex_gstud
pop_col <- "POP"
ind_col <- "ID"
data_genind <- gstud_to_genind(x, pop_col, ind_col)

Compute the harmonic mean of a numeric vector

Description

The function computes the harmonic mean of a numeric vector

Usage

harm_mean(x)

Arguments

x

A numeric vector

Value

A numeric value corresponding to the harmonic mean of the vector

Author(s)

P. Savary

Examples

x <- c(10, 2, 5, 15)
hm <- harm_mean(x)
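For reference, the harmonic mean of n values is n divided by the sum of their reciprocals. The base R lines below are a minimal sketch of that definition, not the package's implementation; harm_mean_sketch is a hypothetical name introduced for illustration.

```r
# Hypothetical helper: harmonic mean as n / sum(1 / x)
harm_mean_sketch <- function(x) {
  stopifnot(is.numeric(x), all(x != 0))
  length(x) / sum(1 / x)
}

x <- c(10, 2, 5, 15)
hm_sketch <- harm_mean_sketch(x)
```

The harmonic mean is pulled towards the smallest values of the vector, so it is never larger than the arithmetic mean for positive values.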

Compute dispersal kernel parameters

Description

The function computes the constant parameters of a dispersal kernel with a negative exponential distribution.

Usage

kernel_param(p, d_disp, mode = "A")

Arguments

p

A numeric value indicating the dispersal probability at a distance equal to 'd_disp' under a negative exponential distribution.

d_disp

A numeric value indicating the distance at which the dispersal probability is equal to 'p' under a negative exponential distribution.

mode

A character string indicating the value to return:

If mode = "A" (default), the returned value 'alpha' is such that exp(-alpha * d_disp) = p
If mode = "B", the returned value 'alpha' is such that 10^(-alpha * d_disp) = p

Details

If the resulting parameter when mode = "A" is a and the resulting parameter when mode = "B" is b, then we have: p = exp(-a * d_disp) = 10^(-b * d_disp) and a = b * ln(10)
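The relationship between the two modes can be checked numerically. The sketch below solves the two defining equations for alpha in base R; kernel_param_sketch is a hypothetical stand-in written from the definitions above and is not the package function itself.

```r
# Hypothetical helper mirroring the two modes described above
kernel_param_sketch <- function(p, d_disp, mode = "A") {
  stopifnot(p > 0, p < 1, d_disp > 0)
  if (mode == "A") {
    -log(p) / d_disp        # solves exp(-alpha * d_disp) = p
  } else if (mode == "B") {
    -log10(p) / d_disp      # solves 10^(-alpha * d_disp) = p
  } else {
    stop("mode must be 'A' or 'B'")
  }
}

a <- kernel_param_sketch(p = 0.5, d_disp = 3000, mode = "A")
b <- kernel_param_sketch(p = 0.5, d_disp = 3000, mode = "B")

# Both parameters give back p, and a = b * ln(10)
check_a <- all.equal(exp(-a * 3000), 0.5)
check_b <- all.equal(10^(-b * 3000), 0.5)
check_ab <- all.equal(a, b * log(10))
```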

Value

A numeric value

Author(s)

P. Savary

Examples

p <- 0.5
d_disp <- 3000
alpha <- kernel_param(p, d_disp, mode = "A")

Compare two link sets created in a Graphab project

Description

The function compares two link sets created in a Graphab project both quantitatively and spatially.

Usage

link_compar(
  proj_name,
  linkset1,
  linkset2,
  buffer_width = 200,
  min_length = NULL,
  proj_path = NULL
)

Arguments

proj_name

A character string indicating the Graphab project name. The project name is also the name of the project directory in which the file proj_name.xml is located. It can be created with graphab_project.

linkset1

A character string indicating the name of the first link set involved in the comparison. The link set has to be present in the project and can be created with graphab_link.

linkset2

A character string indicating the name of the second link set involved in the comparison. The link set has to be present in the project and can be created with graphab_link.

buffer_width

(default=200) An integer or numeric value indicating the width of the buffer created on each side of the links prior to spatial intersection. It is expressed in meters.

min_length

(optional, default=NULL) An integer or numeric value indicating the minimum length in meters of the links to be compared. Links whose length is shorter than min_length will be ignored in the comparison.

proj_path

(optional, default=NULL) A character string indicating the path to the directory that contains the project directory. It should be used when the project directory is not in the current working directory. When 'proj_path = NULL', the project directory is searched for in getwd().

Details

The function compares two link sets linking the same habitat patches of the Graphab project but computed using different cost scenarios. It creates a buffer on each side of every link and then overlaps every link in linkset1 with the same link in linkset2. It returns the area of both buffered links and the area of their intersection. It also computes the Mantel correlation coefficient between the cost distances associated with the same links in both link sets.

Author(s)

P. Savary

Examples

## Not run:
link_compar(proj_name = "grphb_ex",
            linkset1 = "lcp1",
            linkset2 = "lcp2",
            buffer_width = 200)
## End(Not run)

Convert a loci object into a genind object

Description

This function is exactly the same as loci2genind from the pegas package.

Usage

loci_to_genind(x, ploidy = 2, na.alleles = c("NA"))

Arguments

x

An object of class loci to convert

ploidy

An integer indicating the ploidy level (by default, 'ploidy = 2')

na.alleles

A character vector indicating the coding of the alleles to be treated as missing data (by default, 'na.alleles = c("NA")')

Value

An object of class genind

Author(s)

P. Savary

Examples

data("data_ex_loci")
genind <- loci_to_genind(data_ex_loci, ploidy = 2, na.alleles = "NA")

Compute cost distances between points on a raster

Description

The function computes cost-distances associated with least-cost paths between point pairs on a raster with specified cost values.

Usage

mat_cost_dist(
  raster,
  pts,
  cost,
  method = "gdistance",
  return = "mat",
  direction = 8,
  parallel.java = 1,
  alloc_ram = NULL
)

Arguments

raster

A parameter indicating the raster file on which cost distances are computed. It can be:

A character string indicating the path to a raster file in format .tif or .asc.
A RasterLayer object already loaded in the R environment

All the raster cell values must be present in the column 'code' of the cost argument.

pts

A parameter indicating the points between which cost distances are computed. It can be either:

A character string indicating the path to a .csv file. It must have three columns:
- ID: The ID of the points.
- x: A numeric or integer indicating the longitude of the points.
- y: A numeric or integer indicating the latitude of the points.
A data.frame with the spatial coordinates of the points. It must have three columns:
- ID: The ID of the points.
- x: A numeric or integer indicating the longitude of the points.
- y: A numeric or integer indicating the latitude of the points.
A SpatialPointsDataFrame with at least an attribute column named "ID" with the point IDs.

The point coordinates must be in the same spatial coordinate reference system as the raster file.

cost

A data.frame indicating the cost values associated with each raster value. It must have two columns:

'code': raster cell values
'cost': corresponding cost values

method

A character string indicating the method used to compute the cost distances. It must be:

'gdistance': uses the functions from the package gdistance, assuming that movement is possible in 8 directions from each cell, that a geo-correction is applied to correct for diagonal movement lengths, and that raster cell values correspond to resistance (and not conductance).
'java': uses a .jar file which is downloaded on the user's machine if necessary and if Java is installed. This option substantially reduces computation times and makes parallelisation possible.

return

A character string indicating whether the returned object is a data.frame (return="df") or a pairwise matrix (return="mat").

direction

An integer (4, 8, 16) indicating the directions in which movement can take place from a cell. Only used when method="gdistance". By default, direction=8.

parallel.java

An integer indicating how many computer cores are used to run the .jar file. By default, parallel.java=1.

alloc_ram

(optional, default = NULL) Integer or numeric value indicating the RAM (in gigabytes) allocated to the Java process when used. Increasing this value can speed up the computations. Too large values may not be compatible with your machine settings.

Value

The function returns:

If return="mat", a pairwise matrix with cost-distance values between points.
If return="df", an object of type data.frame with three columns:
- from: A character string indicating the ID of the point of origin.
- to: A character string indicating the ID of the point of destination.
- cost_dist: A numeric indicating the accumulated cost-distance along the least-cost path between the points 'from' and 'to'.

Author(s)

P. Savary

Examples

## Not run:
x <- raster::raster(ncol=10, nrow=10, xmn=0, xmx=100, ymn=0, ymx=100)
raster::values(x) <- sample(c(1,2,3,4), size = 100, replace = TRUE)
pts <- data.frame(ID = 1:4,
                  x = c(10, 90, 10, 90),
                  y = c(90, 10, 10, 90))
cost <- data.frame(code = 1:4,
                   cost = c(1, 10, 100, 1000))
mat_cost_dist(raster = x,
              pts = pts, cost = cost,
              method = "gdistance")
## End(Not run)

Compute a pairwise matrix of genetic distances between populations

Description

The function computes a pairwise matrix of genetic distances between populations and supports several formulas.

Usage

mat_gen_dist(x, dist = "basic", null_val = FALSE)

Arguments

x

An object of class genind that contains the multilocus genotypes (format 'locus') of the individuals as well as their populations.

dist

A character string indicating the method used to compute the multilocus genetic distance between populations

If dist = "basic" (default), then the multilocus genetic distance is computed using a formula of Euclidean genetic distance (Excoffier et al., 1992)
If dist = "weight", then the multilocus genetic distance is computed as in Fortuna et al. (2009). It is a Euclidean genetic distance giving more weight to rare alleles
If dist = "PG", then the multilocus genetic distance is computed as in the popgraph::popgraph function, following several steps of PCA and SVD (Dyer and Nason, 2004).
If dist = "DPS", then the genetic distance used is equal to 1 - the proportion of shared alleles (Bowcock et al., 1994)
If dist = "FST", then the genetic distance used is the pairwise FST (Weir and Cockerham, 1984)
If dist = "FST_lin", then the genetic distance used is the linearised pairwise FST (Weir and Cockerham, 1984) (FST_lin = FST/(1-FST))
If dist = "PCA", then the genetic distance is computed following a PCA of the matrix of allelic frequencies by population. It is a Euclidean genetic distance between populations in the multidimensional space defined by all the independent principal components.
If dist = "GST", then the genetic distance used is the G'ST (Hedrick, 2005). Available in graph4lg <= 1.6.0 only, because it relied on diveRsity
If dist = "D", then the genetic distance used is Jost's D (Jost, 2008). Available in graph4lg <= 1.6.0 only, because it relied on diveRsity

null_val

(optional) Logical. Should negative and null FST, FST_lin, GST or D values be replaced by half the minimum positive value? This option allows Gabriel graphs to be computed from these "distances". Default is null_val = FALSE. This option only works if dist = "FST", "FST_lin", "GST" or "D".
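The linearisation FST_lin = FST/(1-FST) and the replacement described for null_val can be illustrated on a toy matrix. This is a hedged base R sketch with made-up values; replace_null is a hypothetical helper, and restricting the replacement to off-diagonal cells is an assumption, not a statement about the package internals.

```r
# Toy pairwise FST matrix (made-up values), with one negative estimate
pops <- c("P1", "P2", "P3")
fst <- matrix(c( 0.00, 0.05, -0.01,
                 0.05, 0.00,  0.12,
                -0.01, 0.12,  0.00),
              nrow = 3, byrow = TRUE,
              dimnames = list(pops, pops))

# Linearised FST: FST / (1 - FST)
fst_lin <- fst / (1 - fst)

# Hypothetical helper: replace negative and null off-diagonal values
# by half the minimum positive value
replace_null <- function(m) {
  off <- row(m) != col(m)
  half_min <- min(m[off & m > 0]) / 2
  m[off & m <= 0] <- half_min
  m
}
fst_pos <- replace_null(fst)
```

After the replacement, every pair of distinct populations has a strictly positive value, which is what distance-based pruning methods such as the Gabriel graph require.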

Details

Negative values are converted into 0. The Euclidean genetic distance d_{ij} between populations i and j is computed as follows:

d_{ij}^{2} = \sum_{k=1}^{n} (x_{ki} - x_{kj})^{2}

where x_{ki} is the allelic frequency of allele k in population i and n is the total number of alleles. Note that when dist = "weight", the formula becomes

d_{ij}^{2} = \sum_{k=1}^{n} (1/(K*p_{k}))(x_{ki} - x_{kj})^{2}

where K is the number of alleles at the locus of allele k and p_{k} is the frequency of allele k in all populations. Note that when dist = "PCA", n is the number of conserved independent principal components and x_{ki} is the value taken by principal component k in population i.
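The two formulas above can be evaluated directly on a small table of allele frequencies. The following base R sketch uses made-up frequencies and is only an illustration of the equations; in particular, taking p_{k} as the mean frequency across populations is an assumption made here for simplicity.

```r
# Made-up allele frequencies: rows = alleles (k), columns = populations (i)
freq <- matrix(c(0.5, 0.2, 0.9,
                 0.5, 0.8, 0.1,
                 0.3, 0.3, 0.6,
                 0.7, 0.7, 0.4),
               nrow = 4, byrow = TRUE,
               dimnames = list(c("L1.a", "L1.b", "L2.a", "L2.b"),
                               c("P1", "P2", "P3")))

# d_ij = sqrt(sum_k (x_ki - x_kj)^2), for all population pairs
d_basic <- as.matrix(dist(t(freq), method = "euclidean"))

# Weighted variant: allele k is weighted by 1 / (K * p_k)
K <- c(2, 2, 2, 2)        # number of alleles at the locus of each allele
p_k <- rowMeans(freq)     # assumption: overall frequency = mean over populations
w <- 1 / (K * p_k)
# Scaling row k by sqrt(w_k) makes the squared differences weighted by w_k
d_weight <- as.matrix(dist(t(freq * sqrt(w)), method = "euclidean"))
```

Because rare alleles have a small p_{k}, their weight 1/(K*p_{k}) is large, so differences at rare alleles contribute more to the weighted distance.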

Value

An object of classmatrix

Author(s)

P. Savary

References

Bowcock AM, Ruiz-Linares A, Tomfohrde J, Minch E, Kidd JR, Cavalli-Sforza LL (1994). “High resolution of human evolutionary trees with polymorphic microsatellites.” Nature, 368(6470), 455–457.
Excoffier L, Smouse PE, Quattro JM (1992). “Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data.” Genetics, 131(2), 479–491.
Dyer RJ, Nason JD (2004). “Population graphs: the graph theoretic shape of genetic structure.” Molecular Ecology, 13(7), 1713–1727.
Fortuna MA, Albaladejo RG, Fernández L, Aparicio A, Bascompte J (2009). “Networks of spatial genetic variation across species.” Proceedings of the National Academy of Sciences, 106(45), 19044–19049.
Weir BS, Cockerham CC (1984). “Estimating F-statistics for the analysis of population structure.” Evolution, 38(6), 1358–1370.
Hedrick PW (2005). “A standardized genetic differentiation measure.” Evolution, 59(8), 1633–1638.
Jost L (2008). “GST and its relatives do not measure differentiation.” Molecular Ecology, 17(18), 4015–4026.

Examples

data(data_ex_genind)
x <- data_ex_genind
D <- mat_gen_dist(x = x, dist = "basic")

Compute Euclidean geographic distances between points

Description

The function computes Euclidean geographic distances between points given their spatial coordinates, either in a metric projected Coordinate Reference System or in a polar coordinate system.

Usage

mat_geo_dist(
  data,
  ID = NULL,
  x = NULL,
  y = NULL,
  crds_type = "proj",
  gc_formula = "vicenty"
)

Arguments

data

An object of class:

data.frame with 3 columns: 2 columns with the point spatial coordinates and another column with point IDs
SpatialPointsDataFrame

ID

(if data is of class data.frame) A character string indicating the name of the column of data with the point IDs

x

(if data is of class data.frame) A character string indicating the name of the column of data with the point longitude

y

(if data is of class data.frame) A character string indicating the name of the column of data with the point latitude

crds_type

A character string indicating the type of coordinate reference system:

'proj' (default): a projected coordinate reference system
'polar': a polar coordinate reference system, such as WGS84

gc_formula

A character string indicating the formula used to compute the great-circle distance:

'vicenty' (default): Vincenty inverse formula for ellipsoids
'slc': Spherical Law of Cosines
'hvs': Haversine formula

Details

When a projected coordinate reference system is used, the function calculates the classical Euclidean geographic distance between two points using Pythagoras' theorem. When a polar coordinate reference system is used, it calculates the great-circle distance between points using different methods. Unless crds_type = "polar", when data is a data.frame, it assumes projected coordinates by default.
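The two cases can be contrasted with a short base R sketch. It is not the package code: haversine_sketch is a hypothetical helper implementing the standard haversine formula, and it assumes a spherical Earth with a mean radius of 6371 km.

```r
# Projected coordinates (metres): plain Euclidean distance
pts <- data.frame(ID = c("A", "B", "C"),
                  x = c(0, 3000, 0),
                  y = c(0, 4000, 1000))
d_proj <- as.matrix(dist(pts[, c("x", "y")]))
dimnames(d_proj) <- list(pts$ID, pts$ID)

# Polar coordinates (degrees): haversine great-circle distance in metres
# Assumption: spherical Earth with mean radius r = 6371 km
haversine_sketch <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  h <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(h)))
}

# New York City to Los Angeles (lon, lat in decimal degrees)
d_ny_la <- haversine_sketch(-73.99420, 40.75170, -118.24100, 34.05420)
```

Applying the Euclidean formula to longitude and latitude in degrees would give values in degrees rather than meters, which is why the coordinate type has to be declared.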

Value

A pairwise matrix of geographic distances between points in meters

Author(s)

P. Savary

Examples

# Projected CRS
data(pts_pop_simul)
mat_dist <- mat_geo_dist(data = pts_pop_simul,
                         ID = "ID",
                         x = "x",
                         y = "y")
# Polar CRS
city_us <- data.frame(name = c("New York City", "Chicago",
                               "Los Angeles", "Atlanta"),
                      lat  = c(40.75170,  41.87440,
                               34.05420,  33.75280),
                      lon  = c(-73.99420, -87.63940,
                              -118.24100, -84.39360))
mat_geo_us <- mat_geo_dist(data = city_us,
                           ID = "name", x = "lon", y = "lat",
                           crds_type = "polar")

Compute a pairwise genetic distance matrix between populations using Bowcock et al. (1994) formula

Description

The function computes the pairwise DPS, a genetic distance based on the proportion of shared alleles.

Usage

mat_pw_dps(x)

Arguments

x

An object of class genind

Details

The formula used is inspired by the MSA software:

D_{PS}=1-\frac{\sum_{d}^{D}\sum_{k}^{K}\min (f_{a_{kd}i},f_{a_{kd}j})}{D}

where:
a_{kd} is the allele k at locus d
D is the total number of loci
K is the allele number at each locus
\gamma_{a_{kd^{ij}}}=0 if individuals i and j do not share allele a_{kd}
\gamma_{a_{kd^{ij}}}=1 if one of individuals i and j has a copy of a_{kd}
\gamma_{a_{kd^{ij}}}=2 if both individuals have 2 copies of a_{kd} (homozygotes)
f_{a_{kd}i} is the frequency of allele a_{kd} in individual i (0, 0.5 or 1).

More information in Bowcock et al. (1994) and in the Microsatellite Analyser software (MSA) manual. This function uses functions from the adegenet package. Note that in the paper of Bowcock et al. (1994), the denominator is 2D, whereas in the MSA software manual, the denominator is D.
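The D_{PS} formula with denominator D can be worked through on a toy example. The base R sketch below uses made-up allele frequencies for two units i and j; dps_sketch is a hypothetical helper written from the formula above, not the function used by the package.

```r
# Made-up allele frequencies for two units i and j at D = 2 loci
# (each locus is a named vector of allele frequencies summing to 1)
f_i <- list(loc1 = c(a = 0.5, b = 0.5, c = 0),
            loc2 = c(a = 1.0, b = 0))
f_j <- list(loc1 = c(a = 0.5, b = 0,   c = 0.5),
            loc2 = c(a = 0.5, b = 0.5))

# Hypothetical helper: 1 - (sum over loci and alleles of min(f_i, f_j)) / D
dps_sketch <- function(f_i, f_j) {
  D <- length(f_i)
  shared <- sum(mapply(function(fi, fj) sum(pmin(fi, fj)), f_i, f_j))
  1 - shared / D
}

dps_ij <- dps_sketch(f_i, f_j)
```

Here each locus contributes a shared proportion of 0.5, so the total is 1 over D = 2 loci and the distance is 0.5. Two units with identical frequencies get a distance of 0.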

Value

A pairwise matrix of genetic distances between populations

Author(s)

P. Savary

References

Examples

data("data_ex_genind")
dist_bowcock <- mat_pw_dps(data_ex_genind)

Compute a pairwise FST matrix between populations

Description

The function computes the pairwise FST matrix between populations from an object of class genind

Usage

mat_pw_fst(x)

Arguments

x

An object of class genind

Details

The formula used is that of Weir and Cockerham (1984). This function directly uses the function pairwise.WCfst from hierfstat.

Value

A pairwise matrix of FST with as many rows and columns as there are populations in the input data.

Warnings

Negative values are converted into 0

Author(s)

P. Savary

References

Weir BS, Cockerham CC (1984). “Estimating F-statistics for the analysis of population structure.” Evolution, 38(6), 1358–1370.

Examples

## Not run:
data("data_ex_genind")
mat_fst <- mat_pw_fst(data_ex_genind)
## End(Not run)

Vector of custom colors

Description

Vector of custom colors

Usage

mypalette

Format

An object of class character of length 27.

Examples

mypalette[1]

Extract patch areas from a categorical raster

Description

The function extracts patch areas from a categorical raster

Usage

patch_areas(raster, class, edge_size = 0, neighborhood = 8, surf_min = 0)

Arguments

raster

A RasterLayer object corresponding to a categorical raster layer

class

An integer value or vector with the value(s) corresponding to the code values of the raster layer for which patch areas are computed.

edge_size

An integer value indicating the width of the edge (in meters) of the raster layer which is ignored during the sampling (default = 0). It prevents sampling in the margins of the study area.

neighborhood

An integer value indicating which cells are considered adjacent when contiguous patches are delineated (it should be 8 (default, Queen's case) or 4 (Rook's case)). This parameter is ignored when by_patch = FALSE.

surf_min

An integer value indicating the minimum surface of a patch considered for the sampling in number of raster cells. This parameter is used whatever the by_patch argument is. Default is 0.

Value

A data.frame with the areas of the patches

Author(s)

P. Savary

Plot graphs

Description

The function plots graphs, whether spatial or not.

Usage

plot_graph_lg(
  graph,
  crds = NULL,
  mode = "aspatial",
  node_inter = NULL,
  link_width = NULL,
  node_size = NULL,
  module = NULL,
  pts_col = NULL
)

Arguments

graph

A graph object of class igraph

crds

(optional, default = NULL) If mode = "spatial", it is a data.frame with the spatial coordinates of the graph nodes. It must have three columns:

ID: A character string indicating the name of the graph nodes. The names must be the same as the node names of the graph of class igraph (igraph::V(graph)$name)
x: A numeric or integer indicating the longitude of the graph nodes.
y: A numeric or integer indicating the latitude of the graph nodes.

This argument is not used when mode = "aspatial" and is mandatory when mode = "spatial".

mode

A character string indicating whether the graph is spatial (mode = "spatial") or not (mode = "aspatial" (default))

node_inter

(optional, default = NULL) A character string indicating whether the links of the graph are weighted by distances or by similarity indices. It is only used when mode = "aspatial" to compute the node positions with the Fruchterman and Reingold algorithm. It can be equal to:

'distance': Link weights correspond to distances. Nodes that are close to each other will be close on the figure.
'similarity': Link weights correspond to similarity indices. Nodes that are similar to each other will be close on the figure.

link_width

(optional, default = NULL) A character string indicating how the width of the links is set on the figure. Their width can be:

inversely proportional to link weights ("inv_w", convenient with distances, default)
proportional to link weights ("w")

node_size

(optional, default = NULL) A character string indicating the graph node attribute used to set the node size on the figure. It must be the name of a numeric or integer node attribute from the graph.

module

(optional, default = NULL) A character string indicating the graph node modules used to set the node color on the figure. It must be the name of a node attribute from the graph with discrete values.

pts_col

(optional, default = NULL) A character string indicating the color used to plot the nodes (default: "#F2B950"). It must be a hexadecimal color code or a color used by default in R. It cannot be used if 'module' is specified.

Details

When the graph is not spatial (mode = "aspatial"), the node coordinates are calculated with the Fruchterman and Reingold algorithm. The graph object graph of class igraph must have node names (not necessarily in the same order as the IDs in crds, given that a merge is done).

Value

A ggplot2 object to plot

Author(s)

P. Savary

References

Fruchterman TM, Reingold EM (1991). “Graph drawing by force-directed placement.” Software: Practice and Experience, 21(11), 1129–1164.

Examples

data(pts_pop_ex)
data(data_ex_genind)
mat_w <- mat_gen_dist(data_ex_genind, dist = "DPS")
gp <- gen_graph_topo(mat_w = mat_w, topo = "mst")
g <- plot_graph_lg(graph = gp,
                   crds = pts_pop_ex,
                   mode = "spatial",
                   link_width = "inv_w")

Plot histograms of link weights

Description

The function plots a histogram to visualize the distribution of the link weights

Usage

plot_w_hist(graph, fill = "#396D35", class_width = NULL)

Arguments

graph

A graph object of class igraph whose links are weighted

fill

A character string indicating the color used to fill the bars (default: "#396D35"). It must be a hexadecimal color code or a color used by default in R.

class_width

(default value: NULL) A numeric or an integer specifying the width of the classes displayed on the histogram. When it is not specified, the width is equal to the difference between the minimum and maximum values divided by 80.
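As a quick illustration of the default class width rule stated above, the base R lines below compute it for a few made-up link weights. This only mirrors the stated rule and is not taken from the package code.

```r
# Made-up link weights
w <- c(0.12, 0.35, 0.40, 0.58, 0.92)

# Default class width described above: (max - min) / 80
class_width <- (max(w) - min(w)) / 80

# Corresponding class limits: 80 classes need 81 limits
breaks <- seq(min(w), max(w), length.out = 81)
```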

Value

A ggplot2 object to plot

Author(s)

P. Savary

Examples

data(data_ex_genind)
mat_w <- mat_gen_dist(data_ex_genind, dist = "DPS")
gp <- gen_graph_topo(mat_w = mat_w, topo = "gabriel")
hist <- plot_w_hist(gp)

Compute population-level genetic indices

Description

The function computes population-level genetic indices from an object of class genind.

Usage

pop_gen_index(x, pop_names = NULL, indices = c("Nb_ind", "A", "He", "Ho"))

Arguments

x

An object of class genind from package adegenet.

pop_names

(optional) A character vector indicating population names. It is of the same length as the number of populations. Without this argument, populations are given the names they have initially in the 'genind' object (which is sometimes only a number). The order of the population names must match their order in the 'genind' object. The function does not reorder them. Users must be careful.

indices

(optional) A character vector indicating the population-level indices to compute. These indices can be:

Mean allelic richness by locus by population (indices = c("A", ...))
Mean expected heterozygosity by locus by population (indices = c("He", ...))
Mean observed heterozygosity by locus by population (indices = c("Ho", ...))
Number of individuals by population (indices = c("Nb_ind", ...))
Total allelic richness by population (indices = c("A_tot", ...))

By default, indices = c("Nb_ind", "A", "He", "Ho").

Value

An object of class data.frame whose rows correspond to populations and columns to population attributes (ID, size, genetic indices). By default, the first column corresponds to the population names (ID). The order of the columns depends on the vector 'indices'.

Author(s)

P. Savary

Examples

data(data_ex_genind)
x <- data_ex_genind
pop_names <- levels(x@pop)
df_pop_indices <- pop_gen_index(x = x,
                                pop_names = pop_names,
                                indices = c("Nb_ind", "A"))

Compute population-level rarefied genetic indices with ADZE software

Description

The function computes population-level rarefied genetic indices from an object of class genind with the ADZE software.

Usage

pop_rare_gen_index(x, max_g = NULL, pop_names = NULL, OS = "linux")

Arguments

x

An object of class genind from package adegenet.

max_g

(optional, default = NULL) The maximum standardized sample size used by ADZE software (MAX_G in the ADZE manual). It is equal to twice the minimum number of individuals considered for the rarefaction analysis. By default, it is equal to twice the number of individuals in the smallest population. Otherwise, it must be either a numeric or integer value.

pop_names

OS

A character string indicating whether you use a Linux ('linux') or Windows ('win') operating system.

Value

Author(s)

P. Savary

pts_pop_ex : details on simulated populations

Description

Simulation dataset: 10 populations located on a simulated landscape

Usage

pts_pop_ex

Format

An object of class 'data.frame' with the following columns:

ID: Population ID of the 10 populations
x: Site longitude (RGF93)
y: Site latitude (RGF93)

There are as many rows as there are sampled populations.

References

Landguth EL, Cushman SA (2010). “CDPOP: a spatially explicit cost distance population genetics program.” Molecular Ecology Resources, 10(1), 156–161.

Examples

data("pts_pop_ex")
str(pts_pop_ex)

pts_pop_simul : details on simulated populations

Description

Simulation dataset: 50 populations located on a simulated landscape

Usage

pts_pop_simul

Format

An object of class 'data.frame' with the following columns:

ID: Population ID of the 50 populations
x: Site longitude (RGF93)
y: Site latitude (RGF93)

References

Examples

data("pts_pop_simul")
str(pts_pop_simul)

Convert a pairwise matrix into an edge-list data.frame

Description

The function converts a pairwise matrix into an edge-list data.frame

Usage

pw_mat_to_df(pw_mat)

Arguments

pw_mat

A pairwise matrix which can be:

An object of class matrix. It must have the same row names and column names. If values represent distances, diagonal elements should be equal to 0.
An object of class dist. In that case, its column numbers are used to create IDs in the resulting data.frame.

Value

An object of class data.frame
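To make the conversion concrete, here is a minimal base R sketch that turns a small symmetric matrix into an edge list, keeping each pair once. The column names id_1, id_2 and value are hypothetical and chosen for this illustration only; the columns returned by pw_mat_to_df may differ.

```r
# Made-up symmetric distance matrix with identical row and column names
ids <- c("A", "B", "C")
pw_mat <- matrix(c(0, 2, 5,
                   2, 0, 3,
                   5, 3, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(ids, ids))

# Keep each pair once by reading the lower triangle
idx <- which(lower.tri(pw_mat), arr.ind = TRUE)
edge_df <- data.frame(id_1 = rownames(pw_mat)[idx[, "col"]],
                      id_2 = rownames(pw_mat)[idx[, "row"]],
                      value = pw_mat[idx],
                      stringsAsFactors = FALSE)
```

A matrix with n rows yields n(n-1)/2 rows in the edge list, one per unordered pair.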

Author(s)

P. Savary

Examples

data(data_tuto)
pw_mat <- data_tuto[[1]]
df <- pw_mat_to_df(pw_mat)

Reorder the rows and columns of a symmetric matrix

Description

The function reorders the rows and columns of a symmetric matrix according to a specified order.

Usage

reorder_mat(mat, order)

Arguments

mat

An object of class matrix

order

A character vector with the row and column names of the matrix in the order in which they will be arranged by the function. All its elements must be row and column names of the matrix mat.

Details

The matrix mat must be symmetric and have row and column names. Its values are not modified.
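The effect of such a reordering can be reproduced in base R by indexing rows and columns by name. The lines below are a sketch of the idea under that assumption, not the body of reorder_mat; order_names is a name introduced here.

```r
set.seed(1)
mat <- matrix(rnorm(36), 6)
mat[lower.tri(mat)] <- t(mat)[lower.tri(mat)]   # make the matrix symmetric
rownames(mat) <- colnames(mat) <- c("A", "C", "E", "B", "D", "F")

order_names <- c("A", "B", "C", "D", "E", "F")
stopifnot(all(order_names %in% rownames(mat)), isSymmetric(mat))

# Index rows and columns by name: values move with their labels
mat_sorted <- mat[order_names, order_names]
```

The value stored for a given pair of names is the same before and after reordering; only its position in the matrix changes.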

Value

A reordered symmetric matrix

Author(s)

P. Savary

Examples

mat <- matrix(rnorm(36), 6)
mat[lower.tri(mat)] <- t(mat)[lower.tri(mat)]
row.names(mat) <- colnames(mat) <- c("A", "C", "E", "B", "D", "F")
order <- c("A", "B", "C", "D", "E", "F")
mat <- reorder_mat(mat = mat, order = order)

Sample points or patches on a categorical raster layer

Description

The function samples points or patches on a categorical raster layer.

Usage

sample_raster(
  raster,
  class,
  nb_pts,
  dist_min = 0,
  edge_size = 0,
  by_patch = TRUE,
  neighborhood = 8,
  surf_min = 0,
  prop_area = TRUE,
  step_max = 1000,
  output = "df",
  desc = TRUE
)

Arguments

raster

A RasterLayer object corresponding to a categorical raster layer

class

An integer value or vector with the value(s) corresponding to the code values of the raster layer within which points will be sampled.

nb_pts

An integer value indicating the number of points to be sampled

dist_min

An integer value indicating the minimum distance separating the sampled points (default = 0).

edge_size

An integer value indicating the width of the edge of the raster layer which is ignored during the sampling (default = 0). It prevents sampling in the margins of the study area.

by_patch

A logical value indicating whether contiguous patches with cells having the same code value are delineated prior to sampling (default = TRUE). It prevents sampling several points in the same contiguous patch.

neighborhood

surf_min

An integer value indicating the minimum surface of a patch considered for the sampling in number of raster cells. This parameter is used whatever the by_patch argument is. Default is 0.

prop_area

A logical value indicating whether sampling in large patches is more likely (default = TRUE). If by_patch = FALSE, this parameter is ignored. When prop_area = TRUE, the probability of sampling a given patch is proportional to its area.

step_max

An integer value indicating how many sampling steps are performed to identify a point set satisfying all the conditions before returning an error.

output

A character string indicating the type of returned output:

'data.frame': A data.frame with three/four columns:
- ID: The point or patch centroid ID
- x: The point or patch centroid longitude
- y: The point or patch centroid latitude
- area: The area of the sampled patch (only if by_patch = TRUE)
'pts_layer': A SpatialPointsDataFrame layer corresponding to the sampled points (points or patch centroids)
'poly_layer': A SpatialPolygonsDataFrame layer corresponding to the sampled patch polygons

desc

A logical value indicating whether the result should be described or not (default = TRUE). If desc = TRUE, then the Gini coefficient of the distances between points and of the patch areas (if by_patch = TRUE) is computed with the gini_coeff function. A histogram of the link weights is also described.

Value

A list of object(s) with one or several elements according to the output and desc arguments.

Author(s)

P. Savary

Scaling function

Description

Scales values between 0 and 1

Usage

sc01(x)

Arguments

x

Numeric or integer vector

Examples

x <- runif(min = 3, max = 15, n = 20)
x01 <- sc01(x)
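
Scaling values between 0 and 1 is usually done with a min-max transformation. A base R sketch under that assumption follows. The helper sc01_sketch is hypothetical and may differ from sc01, for instance in how constant vectors or missing values are handled.

```r
# Hypothetical helper, assuming a min-max transformation:
# (x - min(x)) / (max(x) - min(x))
sc01_sketch <- function(x) {
  stopifnot(is.numeric(x))
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    stop("All values are identical; they cannot be rescaled")
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

set.seed(42)
x <- runif(n = 20, min = 3, max = 15)
x01 <- sc01_sketch(x)
range(x01)
```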

Plot scatterplots of genetic distance vs landscape distance

Description

The function plots scatterplots to visualize the relationship between genetic distance (or differentiation) and landscape distance (Euclidean distance, cost-distance, etc.) between populations or sample sites.

Usage

scatter_dist(
  mat_gd,
  mat_ld,
  method = "loess",
  thr_gd = NULL,
  thr_ld = NULL,
  se = TRUE,
  smooth_col = "black",
  pts_col = "#999999"
)

Arguments

mat_gd

A symmetric matrix or dist object with pairwise genetic distances between populations or sample sites.

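mat_ld

A symmetric matrix or dist object with pairwise landscape distances between populations or sample sites.

method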

A character string indicating the smoothing method used to fit a line on the scatterplot. Possible values are the same as with the function 'geom_smooth()' from ggplot2: 'lm', 'glm', 'gam', 'loess' (default).

thr_gd

(optional) A numeric or integer value used to remove values from the data before plotting. All genetic distance values above thr_gd are removed from the data.

thr_ld

(optional) A numeric or integer value used to remove values from the data before plotting. All landscape distance values above thr_ld are removed from the data.

se

Logical (optional, default = TRUE) indicating whether the confidence interval around the smooth line is displayed.

smooth_col

(optional) A character string indicating the color used to plot the smoothing line (default: "black"). It must be a hexadecimal color code or a color used by default in R.

pts_col

(optional) Character string indicating the color used to plot the points (default: "#999999"). It must be a hexadecimal color code or a color used by default in R.

Details

IDs in mat_gd and mat_ld must be the same and refer to the same sampling sites or populations, and both matrices must be ordered in the same way. The matrix of genetic distance mat_gd can be computed using mat_gen_dist. The matrix of landscape distance mat_ld can be computed using mat_geo_dist when the landscape distance needed is a Euclidean geographical distance.
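
The requirement that both matrices share the same IDs in the same order, and the effect of the thr_gd and thr_ld thresholds, can be illustrated with a base R sketch. It builds a table with one row per pair of sites, as would be needed before plotting. The helper pairs_table_sketch and the toy data are hypothetical and do not reproduce the internals of scatter_dist.

```r
# Hypothetical helper: turn two aligned distance matrices into one row
# per pair of sites, applying optional upper thresholds.
pairs_table_sketch <- function(mat_gd, mat_ld,
                               thr_gd = NULL, thr_ld = NULL) {
  mat_gd <- as.matrix(mat_gd)
  mat_ld <- as.matrix(mat_ld)
  # Same IDs, in the same order, in rows and columns
  stopifnot(identical(rownames(mat_gd), rownames(mat_ld)),
            identical(colnames(mat_gd), colnames(mat_ld)))
  keep <- lower.tri(mat_gd)  # each pair once, diagonal excluded
  df <- data.frame(gd = mat_gd[keep], ld = mat_ld[keep])
  # Values above a threshold are removed
  if (!is.null(thr_gd)) df <- df[df$gd <= thr_gd, ]
  if (!is.null(thr_ld)) df <- df[df$ld <= thr_ld, ]
  df
}

# Toy data: four sites, and genetic distances made proportional to the
# Euclidean distances for the sake of the example
ids <- c("A", "B", "C", "D")
xy <- matrix(c(0, 0, 3, 0, 0, 4, 6, 8), ncol = 2, byrow = TRUE,
             dimnames = list(ids, c("x", "y")))
mat_ld <- as.matrix(dist(xy))
mat_gd <- mat_ld / 10
df_pairs <- pairs_table_sketch(mat_gd, mat_ld, thr_ld = 6)
```

If the two matrices were ordered differently, the identical() checks would fail rather than silently pairing distances from different site pairs.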

Value

A ggplot2 object to plot

Author(s)

P. Savary

Examples

data(data_tuto)
mat_dps <- data_tuto[[1]]
mat_dist <- suppressWarnings(mat_geo_dist(data = pts_pop_simul,
      ID = "ID",
      x = "x",
      y = "y"))
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
                      order(as.character(colnames(mat_dist)))]
scatterplot_ex <- scatter_dist(mat_gd = mat_dps,
                              mat_ld = mat_dist)

Plot scatterplots of distances to visualize the graph pruning intensity

Description

The function plots scatterplots of the relationship between two distances (often a genetic distance and a landscape distance between populations or sample sites), while highlighting the population pairs between which a link was conserved during the creation of a graph whose nodes are populations (or sample sites). It thereby makes it possible to visualize the graph pruning intensity.

Usage

scatter_dist_g(
  mat_y,
  mat_x,
  graph,
  thr_y = NULL,
  thr_x = NULL,
  pts_col_1 = "#999999",
  pts_col_2 = "black"
)

Arguments

mat_y

A symmetric (complete) matrix or dist object with pairwise (genetic or landscape) distances between populations or sample sites. These values will be the point coordinates on the y axis. mat_y is the matrix used to weight the links of the graph given in the graph argument, whose nodes correspond to the row and column names of mat_y.

mat_x

A symmetric (complete) matrix or dist object with pairwise (genetic or landscape) distances between populations or sample sites. These values will be the point coordinates on the x axis. mat_x and mat_y must have the same row and column names, ordered in the same way.

graph

A graph object of class igraph. Its nodes must have the same names as the rows and columns of the mat_y and mat_x matrices. graph must have weighted links. Link weights have to be values from the mat_y matrix. graph must be an undirected graph.

thr_y

(optional) A numeric or integer value used to remove values from the data before plotting. All values from mat_y above thr_y are removed from the data.

thr_x

(optional) A numeric or integer value used to remove values from the data before plotting. All values from mat_x above thr_x are removed from the data.

pts_col_1

(optional) A character string indicating the color used to plot the points associated with all pairs of populations or sample sites (default: "#999999"). It must be a hexadecimal color code or a color used by default in R.

pts_col_2

(optional) A character string indicating the color used to plot the points associated with pairs of populations or sample sites connected on the graph (default: "black"). It must be a hexadecimal color code or a color used by default in R.

Details

IDs in mat_y and mat_x must be the same and refer to the same sampling sites or populations, and both matrices must be ordered in the same way. Matrices of genetic distance can be computed using mat_gen_dist. Matrices of landscape distance can be computed using mat_geo_dist when the landscape distance needed is a Euclidean geographical distance. This function is based upon the scatter_dist function.
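
The idea of highlighting the pairs that remain connected after pruning can be sketched in base R. Here a plain data.frame of links stands in for the igraph object expected by scatter_dist_g. The helper flag_links_sketch, the edges table and the toy matrices are hypothetical and only illustrate the principle.

```r
# Hypothetical sketch: 'edges' is a plain data frame standing in for the
# link list of an undirected graph (the real function takes an igraph
# object as its graph argument).
flag_links_sketch <- function(mat_y, mat_x, edges) {
  stopifnot(identical(dimnames(mat_y), dimnames(mat_x)))
  ids <- rownames(mat_y)
  keep <- lower.tri(mat_y)  # each pair once, diagonal excluded
  df <- data.frame(id_1 = ids[row(mat_y)[keep]],
                   id_2 = ids[col(mat_y)[keep]],
                   y = mat_y[keep],
                   x = mat_x[keep],
                   stringsAsFactors = FALSE)
  # Undirected links: build a key that does not depend on node order
  key <- function(a, b) {
    a <- as.character(a)
    b <- as.character(b)
    paste(ifelse(a < b, a, b), ifelse(a < b, b, a), sep = "|")
  }
  df$link <- key(df$id_1, df$id_2) %in% key(edges$from, edges$to)
  df
}

ids <- c("A", "B", "C")
mat_x <- matrix(c(0, 1, 2,
                  1, 0, 3,
                  2, 3, 0), nrow = 3, dimnames = list(ids, ids))
mat_y <- mat_x / 10
# Pruned graph keeping the links A-B and A-C, but not B-C
edges <- data.frame(from = c("B", "C"), to = c("A", "A"))
df_g <- flag_links_sketch(mat_y, mat_x, edges)
```

Plotting y against x with one color for all rows and another for rows where link is TRUE would give the kind of display described above: the fewer highlighted points, the more intense the pruning.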

Value

A ggplot2 object to plot

Author(s)

P. Savary

Examples

data(data_tuto)
mat_gen <- data_tuto[[1]]
mat_dist <- suppressWarnings(mat_geo_dist(data = pts_pop_simul,
      ID = "ID",
      x = "x",
      y = "y"))
mat_dist <- mat_dist[order(as.character(row.names(mat_dist))),
                     order(as.character(colnames(mat_dist)))]
x <- gen_graph_topo(mat_w = mat_gen, mat_topo = mat_dist, topo = "gabriel")
scat <- scatter_dist_g(mat_y = mat_gen, mat_x = mat_dist,
                       graph = x)

Convert a file in STRUCTURE format into a genind object

Description

The function converts a text file in STRUCTURE format into a genind object to use in R.

Usage

structure_to_genind(
  path,
  pop_names = NULL,
  loci_names = NULL,
  ind_names = NULL
)

Arguments

path

A character string indicating the path to the STRUCTURE file in .txt format, or alternatively the name of the file in the working directory. The STRUCTURE file must only have:

A first column with the IDs of the individuals (can be a simple number)
A second column with the IDs of the populations (can be a simple number)
Some loci columns: as many columns as loci in the data

The row for loci names is optional but recommended. Each individual is displayed on 2 rows.

pop_names

(optional) A character vector indicating the population names in the same order as in the STRUCTURE file. It is of the same length as the number of populations. Without this argument, populations are numbered from 1 to the total number of populations.

loci_names

A character vector with the names of the loci if not specified in the first row of the file. This argument is mandatory if the STRUCTURE file does not include the names of the loci in the first row. In other cases, the names of the loci are extracted from the first row of the file.

ind_names

(optional) A character vector indicating the individual names in the same order as in the STRUCTURE file. It is of the same length as the number of individuals. Without this argument, individuals are numbered from 1 to the total number of individuals.

Details

The column order of the resulting object can be different from that of objects returned by gstud_to_genind and genepop_to_genind, depending on allele and loci coding. This function uses functions from the pegas package. For details about the STRUCTURE file format, see the STRUCTURE user manual.
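
The expected file layout (an optional first row of loci names, then two rows per individual with an individual ID, a population ID and one allele per locus) can be checked before conversion with a base R sketch. The toy file content, the whitespace separator and the object names below are assumptions made for illustration. The sketch does not call structure_to_genind.

```r
# Hypothetical toy file: 2 individuals, 2 loci, two rows per individual
# (one allele per row), preceded by a row of loci names.
lines_str <- c("loc1 loc2",
               "ind1 1 120 200",
               "ind1 1 124 200",
               "ind2 2 120 204",
               "ind2 2 120 206")
file_str <- file.path(tempdir(), "toy_structure.txt")
writeLines(lines_str, file_str)

# Read the loci names from the first row, then the genotype rows
loci <- scan(file_str, what = character(), nlines = 1, quiet = TRUE)
geno <- read.table(file_str, skip = 1,
                   col.names = c("ind", "pop", loci),
                   stringsAsFactors = FALSE)

# Each individual should appear on exactly two rows
rows_per_ind <- table(geno$ind)
file.remove(file_str)
```

A file in which some individual appears on one row or on three rows would show up immediately in rows_per_ind.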

Value

An object of type genind.

Author(s)

P. Savary

Examples

data("data_ex_genind")
loci_names <- levels(data_ex_genind@loc.fac)
pop_names <- levels(data_ex_genind@pop)
ind_names <- row.names(data_ex_genind@tab)
path_in <- system.file('extdata', 'data_ex_str.txt',
                       package = 'graph4lg')
file_n <- file.path(tempdir(), "data_ex_str.txt")
file.copy(path_in, file_n, overwrite = TRUE)
str <- structure_to_genind(path = file_n, loci_names = loci_names,
                           pop_names = pop_names, ind_names = ind_names)
file.remove(file_n)