Citation: Valente LM, Phillimore AB, Etienne RS (2015) Equilibrium and non-equilibrium dynamics simultaneously operate in the Galápagos islands. Ecology Letters, In press.
To load the package:
The raw dataset is input as a table. The Galápagos dataset table can be visualized with:
| Clade_name | Status | Missing_species | Branching_times |
|---|---|---|---|
| Coccyzus | Non_endemic_MaxAge | 0 | 7.456 |
| Dendroica | Non_endemic | 0 | 0.34 |
| Finches | Endemic | 0 | 3.0282,1.3227,0.8223,0.4286,0.3462,0.245,0.0808,0.0527,0.0327,0.0221,0.118,0.0756,0.0525,0.0322,0.0118 |
| Mimus | Endemic | 0 | 3.958,3.422,2.884,0.459 |
| Myiarchus | Endemic | 0 | 0.855 |
| Progne | Endemic | 0 | 0.086 |
| Pyrocephalus | Non_endemic_MaxAge | 0 | 10.285 |
| Zenaida | Endemic | 0 | 3.51 |
Each row in the table represents an independent colonisation event. The table has four columns:
- `Clade_name`: name of the independent colonisation event.
- `Status`: one of the following categories:
  - `Endemic`: applicable to both anagenetic species and radiations.
  - `Non_endemic`: the taxon is not endemic to the island, and the age of colonisation is based on a phylogeny where both island and non-island populations of the species have been sampled.
  - `Non_endemic_MaxAge`: the taxon is not endemic to the island, and only an upper bound on the time of colonisation of the island is known. This applies if individuals from the island population of the species have not been sampled, but an age of the species is known.
  - `Endemic&Non_Endemic`: an endemic clade is present and the mainland ancestor has re-colonised. For remote islands this is expected to be very rare.
- `Missing_species`: number of island species that were not sampled for a particular clade (only applicable to radiations).
- `Branching_times`: the stem age of the population/species in the case of `Non_endemic`, `Non_endemic_MaxAge` and `Endemic` anagenetic species. For cladogenetic species these should be the branching times of the radiation, including the stem age of the radiation. Note: if there are species within the radiation that are not found on the island (e.g. back-colonisation), the branching times of these species should be excluded, as the mainland species pool is treated as static.

The same data can also be visualized:
    DAISIE::DAISIE_plot_island(Galapagos_datatable, island_age = 4)
    #> Colonisation time of 7.456 for Coccyzus is older than island age
    #> Colonisation time of 10.285 for Pyrocephalus is older than island age

Before running analyses, the datatable needs to be converted to a DAISIE datalist format using the function `DAISIE_dataprep`.
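The two messages printed by the plotting call above flag clades whose colonisation time is older than the island age that was passed in. As a minimal base-R sketch of that comparison (not DAISIE's own code; the data frame `clades` and the helper `older_than_island` are hypothetical names, the rows are a subset copied from the table above, and the sketch assumes the colonisation time is the oldest value in `Branching_times`):

```r
# Hypothetical mini-table: clade names and branching times stored as
# comma-separated strings, copied from the Galápagos datatable.
clades <- data.frame(
  Clade_name = c("Coccyzus", "Dendroica", "Mimus", "Pyrocephalus"),
  Branching_times = c("7.456", "0.34", "3.958,3.422,2.884,0.459", "10.285"),
  stringsAsFactors = FALSE
)

# Return the names of clades whose oldest branching time exceeds island_age.
# Assumption: the oldest branching time is the colonisation (stem) age.
older_than_island <- function(tab, island_age) {
  times <- strsplit(tab$Branching_times, ",", fixed = TRUE)
  oldest <- vapply(times, function(x) max(as.numeric(x)), numeric(1))
  tab$Clade_name[oldest > island_age]
}

flagged <- older_than_island(clades, island_age = 4)
print(flagged)
```

With an island age of 4, the sketch picks out Coccyzus and Pyrocephalus, the same two clades named in the messages.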
We will prepare two different datalists based on the Galápagos datatable. In the first datalist we will treat all taxa as equivalent. We will specify an island age of four million years (`island_age = 4`) and a mainland pool size of 1000 (`M = 1000`).
    data(Galapagos_datatable)
    Galapagos_datalist <- DAISIE_dataprep(
      datatable = Galapagos_datatable,
      island_age = 4,
      M = 1000
    )
    #> Colonisation time of 7.456 for Coccyzus is older than island age
    #> Colonisation time of 10.285 for Pyrocephalus is older than island age

In the second datalist we will allow the Darwin’s finches to form a separate group for which rates can be decoupled from those governing the macroevolutionary process in all other clades (`number_clade_types = 2` and `list_type2_clades = "Finches"`). We will set the proportion of Darwin’s finch type species in the mainland pool to 0.163 (`prop_type2_pool = 0.163`). If `prop_type2_pool` is not specified, then by default it is given the value of the proportion of the Galápagos lineages that Darwin’s finches represent (1/8 = 0.125 in this case).
    data(Galapagos_datatable)
    Galapagos_datalist_2types <- DAISIE_dataprep(
      datatable = Galapagos_datatable,
      island_age = 4,
      M = 1000,
      number_clade_types = 2,
      list_type2_clades = "Finches",
      prop_type2_pool = 0.163
    )
    #> Colonisation time of 7.456 for Coccyzus is older than island age
    #> Colonisation time of 10.285 for Pyrocephalus is older than island age

The objects `Galapagos_datalist` and `Galapagos_datalist_2types` can now be run directly in DAISIE functions.
The function that conducts maximum likelihood optimization of DAISIE model parameters is called `DAISIE_ML`.
Different models can be specified using the `ddmodel` option in `DAISIE_ML`:
- `ddmodel = 0`: no diversity-dependence
- `ddmodel = 1`: linear diversity-dependence in speciation rate
- `ddmodel = 11`: linear diversity-dependence in speciation and immigration rate
- `ddmodel = 2`: exponential diversity-dependence in speciation rate
- `ddmodel = 21`: exponential diversity-dependence in speciation and immigration rate

Different types of parameters can be optimized or fixed. The parameters are given in the following order: (1) cladogenesis rate, (2) extinction rate, (3) K’ or carrying capacity (maximum number of species that a clade can attain within the island), (4) colonisation rate, and (5) anagenesis rate.
The identities of the parameters to be optimized or fixed are specified with `idparsopt` and `idparsfix` within the `DAISIE_ML` function. For example, to optimize all parameters we set `idparsopt = 1:5` and `idparsfix = NULL`. To optimize all parameters but fix the rate of extinction, we set `idparsopt = c(1,3,4,5)` and `idparsfix = 2`. To optimize all parameters except cladogenesis and anagenesis we set `idparsopt = c(2,3,4)` and `idparsfix = c(1,5)`.
The values of the parameters to be used as initial values for the optimization are specified with `initparsopt`, and the values to be fixed are specified with `parsfix`. For example, if we want to optimize all parameters with a starting value of 2 we set `initparsopt = c(2,2,2,2,2)` and `parsfix = NULL`. If we want all starting rates to be 0.1, but K’ to be fixed at 20, we use `initparsopt = c(0.1,0.1,0.1,0.1)` and `parsfix = 20` (together with `idparsopt = c(1,2,4,5)` and `idparsfix = 3`).
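Because the index vectors and the value vectors are matched by position, it is easy to supply, say, five indices but only four starting values. The following base-R sketch shows one way to check a single-type specification before passing it on. `check_pars` is a hypothetical helper, not a DAISIE function, and the labels in `par_names` are illustrative only. The example assumes that fixing K’ at 20 corresponds to `idparsopt = c(1,2,4,5)` and `idparsfix = 3`, following the parameter order given earlier.

```r
# Hypothetical helper (not part of DAISIE): validate a single-type parameter
# specification and return the five values in parameter order.
check_pars <- function(idparsopt, initparsopt, idparsfix = NULL, parsfix = NULL) {
  ids <- c(idparsopt, idparsfix)
  stopifnot(
    length(idparsopt) == length(initparsopt), # one start value per optimized id
    length(idparsfix) == length(parsfix),     # one fixed value per fixed id
    anyDuplicated(ids) == 0,                  # no parameter listed twice
    setequal(ids, 1:5)                        # every parameter accounted for
  )
  # Illustrative labels following the order described in the text.
  par_names <- c("cladogenesis", "extinction", "K", "colonisation", "anagenesis")
  pars <- numeric(5)
  names(pars) <- par_names
  pars[idparsopt] <- initparsopt
  if (length(idparsfix) > 0) pars[idparsfix] <- parsfix
  pars
}

# All starting rates 0.1, with K' fixed at 20.
spec <- check_pars(
  idparsopt = c(1, 2, 4, 5),
  initparsopt = c(0.1, 0.1, 0.1, 0.1),
  idparsfix = 3,
  parsfix = 20
)
print(spec)
```

A mismatched specification, such as `check_pars(1:5, c(2, 2, 2, 2))`, stops with an error instead of silently recycling values.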
When running your own data, we strongly recommend that you test multiple initial starting parameters for each model, particularly when optimizing models with multiple free parameters, as there is a high risk of being trapped in local likelihood sub-optima. We also suggest running two rounds of optimization, using the optimized parameter set of the first round as the initial starting values for the second round. Also note that the initial starting values in the examples of this tutorial may not be appropriate for your data.
We will now optimize all five parameters for a datalist where all clades share the same parameters. We will set the model with linear diversity-dependence in speciation rate and in immigration rate using `ddmodel = 11`. We will set an initial rate of cladogenesis of 2.5, an initial rate of extinction of 2.7, an initial K’ value of 20, an initial rate of colonisation of 0.009 and an initial rate of anagenesis of 1.01 (`initparsopt = c(2.5,2.7,20,0.009,1.01)`). We will optimize all 5 parameters (`idparsopt = 1:5`) and we will fix no parameters (`parsfix = NULL`, `idparsfix = NULL`).
    data(Galapagos_datalist)
    DAISIE_ML(
      datalist = Galapagos_datalist,
      initparsopt = c(2.5, 2.7, 20, 0.009, 1.01),
      ddmodel = 11,
      idparsopt = 1:5,
      parsfix = NULL,
      idparsfix = NULL
    )

This will take several minutes to run. The parameters optimized and fixed, as well as the loglikelihood of the initial starting parameters we have set, are shown at the top of the screen output of `DAISIE_ML`. Once the optimization is completed, the program will output the maximum likelihood parameter estimates and the maximum loglikelihood value. For a given dataset, the likelihood of different DAISIE models can be compared with information criteria such as BIC and AIC.
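As a rough illustration of such a comparison, the standard definitions AIC = 2k − 2 logL and BIC = k ln(n) − 2 logL can be computed in base R from the maximum loglikelihood and the number of free parameters k. The loglikelihood values below are made up for illustration and are not DAISIE results. The sample size `n` used for BIC is an assumption (here the number of rows in the example table); what counts as the sample size in this setting is a modelling decision that this sketch does not settle.

```r
# Information criteria from a maximised log-likelihood.
# k = number of free (optimized) parameters, n = sample size used for BIC.
aic <- function(loglik, k) 2 * k - 2 * loglik
bic <- function(loglik, k, n) k * log(n) - 2 * loglik

# Hypothetical log-likelihoods, for illustration only (not DAISIE output).
models <- data.frame(
  model = c("ddmodel = 11, 5 free parameters", "ddmodel = 0, 4 free parameters"),
  loglik = c(-75, -77),
  k = c(5, 4)
)

n <- 8 # assumption: number of colonisation events (rows) in the example table

models$AIC <- aic(models$loglik, models$k)
models$BIC <- bic(models$loglik, models$k, n)

# Lower values indicate the preferred model under each criterion.
print(models[order(models$AIC), ])
```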
To optimize the parameters of a model with no diversity-dependence, we use the default model (`ddmodel = 0`) and fix parameter number 3, which corresponds to K’, at infinity (`Inf`).
    data(Galapagos_datalist)
    DAISIE_ML(
      datalist = Galapagos_datalist,
      initparsopt = c(2.5, 2.7, 0.009, 1.01),
      idparsopt = c(1, 2, 4, 5),
      parsfix = Inf,
      idparsfix = 3
    )

To optimize the parameters of a model with no diversity-dependence and no anagenesis, we use the default model (`ddmodel = 0`) and fix parameters number 3 and 5, which correspond, respectively, to K’ (fixed at `Inf`) and the rate of anagenesis (fixed at 0).
    data(Galapagos_datalist)
    DAISIE_ML(
      datalist = Galapagos_datalist,
      initparsopt = c(2.5, 2.7, 0.009),
      idparsopt = c(1, 2, 4),
      parsfix = c(Inf, 0),
      idparsfix = c(3, 5)
    )

For this example we will use the datalist with Darwin’s finches specified to be of a separate type: `Galapagos_datalist_2types`.
If two types of species are considered, then the parameters of the second type of species are in the same order as the first set of parameters, but start at number 6: (6) cladogenesis rate of type 2 species, (7) extinction rate of type 2 species, (8) K’ of type 2 species, (9) colonisation rate of type 2 species, and (10) anagenesis rate of type 2 species. There is also an additional parameter when 2 types of species are considered: the proportion of species of type 2 in the mainland pool. This is parameter number 11.
Here we will optimize all parameters, but allow the finches to have a separate rate of cladogenesis. We will fix the proportion of type 2 species in the mainland pool at 0.163 (therefore fixing parameter 11 with `idparsfix = 11` and `parsfix = 0.163`). Note that because we are only allowing the rate of cladogenesis of Darwin’s finches to vary from the background rate, we need to specify that the other rates for Darwin’s finches remain the same as the background, using `idparsnoshift = c(7,8,9,10)`.
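    # Linear diversity-dependence; finches have their own cladogenesis rate
    # (parameter 6), all other type 2 parameters are tied to the background.
    data(Galapagos_datalist_2types)
    DAISIE_ML(
      ddmodel = 11,
      datalist = Galapagos_datalist_2types,
      initparsopt = c(0.38, 0.55, 20, 0.004, 1.1, 2.28),
      idparsopt = c(1, 2, 3, 4, 5, 6),
      parsfix = 0.163,
      idparsfix = c(11),
      idparsnoshift = c(7, 8, 9, 10)
    )

    # No diversity-dependence (K' fixed at Inf); finches have their own
    # cladogenesis and extinction rates (parameters 6 and 7).
    data(Galapagos_datalist_2types)
    DAISIE_ML(
      ddmodel = 0,
      datalist = Galapagos_datalist_2types,
      initparsopt = c(0.38, 0.55, 0.004, 1.1, 2.28, 2),
      idparsopt = c(1, 2, 4, 5, 6, 7),
      parsfix = c(Inf, 0.163),
      idparsfix = c(3, 11),
      idparsnoshift = c(8, 9, 10)
    )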
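In both two-type calls above, the three index vectors together list each of the 11 parameters exactly once: every parameter is either optimized, fixed, or tied to its type 1 counterpart. The base-R sketch below checks a specification against that pattern. `covers_all_11` is a hypothetical helper, not a DAISIE function, and the sketch assumes that this one-to-one coverage is the intended pattern rather than documenting how `DAISIE_ML` itself validates its inputs.

```r
# Hypothetical helper (not part of DAISIE): TRUE if parameters 1..11 each
# appear exactly once across the optimized, fixed and no-shift index vectors.
covers_all_11 <- function(idparsopt, idparsfix, idparsnoshift) {
  ids <- c(idparsopt, idparsfix, idparsnoshift)
  anyDuplicated(ids) == 0 && setequal(ids, 1:11)
}

# Index vectors copied from the two calls above.
first_call <- covers_all_11(
  idparsopt = 1:6, idparsfix = 11, idparsnoshift = 7:10
)
second_call <- covers_all_11(
  idparsopt = c(1, 2, 4, 5, 6, 7), idparsfix = c(3, 11), idparsnoshift = c(8, 9, 10)
)

# A deliberately incomplete specification: parameter 10 is not listed anywhere.
incomplete <- covers_all_11(
  idparsopt = 1:6, idparsfix = 11, idparsnoshift = c(7, 8, 9)
)

print(c(first_call = first_call, second_call = second_call, incomplete = incomplete))
```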