AaGillet/MorphoRegions
MorphoRegions is an R package built to computationally identify regions (morphological, functional, etc.) in serially homologous structures such as, but not limited to, the vertebrate backbone. Regions are modeled as segmented linear regressions, with each segment corresponding to a region and region boundaries (or breakpoints) corresponding to changes along the serially homologous structure. The optimal number of regions and their breakpoint positions are identified using maximum-likelihood methods without a priori assumptions.
This package was first presented in Gillet et al. (2024) and is an updated version of the regions R package from Jones et al. (2018), with improved computational methods and expanded fitting and plotting options.
You can install the released version of MorphoRegions from CRAN with:
    install.packages("MorphoRegions")
Or the development version from GitHub with:
    # install.packages("remotes")
    remotes::install_github("AaGillet/MorphoRegions")
The following example illustrates the basic steps to prepare the data, fit regionalization models, select the best model, and plot the results. See vignette("MorphoRegions") or the MorphoRegions website for a detailed guide to the package and its functionalities.
    library(MorphoRegions)
Data should be provided as a data frame where each row is an element of the serially homologous structure (e.g., a vertebra). One column should contain positional information of each element (e.g., vertebral number) and other columns should contain variables that will be used to calculate regions (e.g., morphological measurements). The dolphin dataset contains vertebral measurements of a dolphin with the positional information (vertebral number) in the first column.
    data("dolphin")
| | Vertebra | Lc | Wc | Hc | Hnp | Wnp | Inp | Ha | Wa | Lm | Wm | Hm | Hch | Wch | Ltp | Wtp | Itp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 8 | 1.33 | 3.37 | 2.02 | 2.85 | 1.17 | 2.01 | 1.72 | 1.48 | 0.00 | 0.00 | 0.0 | 0 | 0 | 1.71 | 1.67 | 1.57 |
| 9 | 9 | 1.46 | 3.67 | 2.10 | 3.20 | 1.63 | 2.01 | 1.44 | 1.65 | 0.00 | 0.00 | 0.0 | 0 | 0 | 1.51 | 1.61 | 1.57 |
| 10 | 10 | 1.57 | 3.62 | 2.26 | 3.13 | 1.71 | 2.01 | 1.42 | 2.18 | 0.00 | 0.00 | 0.0 | 0 | 0 | 1.06 | 1.90 | 1.57 |
| 11 | 11 | 1.71 | 3.75 | 2.24 | 3.07 | 1.71 | 2.01 | 1.38 | 1.25 | 0.56 | 0.38 | 1.7 | 0 | 0 | 1.03 | 1.91 | 1.66 |
| 12 | 12 | 1.74 | 3.72 | 2.28 | 2.66 | 1.96 | 1.99 | 1.30 | 1.50 | 1.45 | 1.09 | 2.0 | 0 | 0 | 0.60 | 1.71 | 1.57 |
| 13 | 13 | 1.82 | 3.92 | 2.28 | 2.61 | 1.74 | 1.88 | 1.29 | 1.74 | 1.86 | 1.12 | 2.0 | 0 | 0 | 0.37 | 1.44 | 1.57 |
Prior to analysis, data must be processed into an object usable by MorphoRegions using process_measurements(). The pos argument specifies the name or index of the column containing positional information, and the fillNA argument allows filling missing values in the dataset (up to two successive elements).
    dolphin_data <- process_measurements(dolphin, pos = 1)
    class(dolphin_data)
    #> [1] "regions_data"
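To make the "up to two successive elements" rule concrete, here is a minimal base R sketch of gap filling. It assumes linear interpolation between the neighboring elements, which may not be exactly what process_measurements() does internally; the helper name fill_short_gaps and the toy values are hypothetical.

```r
# Hypothetical helper: fill runs of at most `max_gap` missing values
# by linear interpolation along the positional variable (an assumption)
fill_short_gaps <- function(values, pos, max_gap = 2) {
  missing <- is.na(values)
  runs <- rle(missing)
  # Length of the run each element belongs to
  run_len <- rep(runs$lengths, runs$lengths)
  fillable <- missing & run_len <= max_gap
  # approx() returns NA outside the observed range, so leading and
  # trailing gaps are left untouched
  interpolated <- approx(pos[!missing], values[!missing], xout = pos)$y
  values[fillable] <- interpolated[fillable]
  values
}

# Toy measurement with one short gap and one gap of three elements
pos <- 1:7
values <- c(1, NA, 3, NA, NA, NA, 7)
filled <- fill_short_gaps(values, pos)
filled
```

In this toy series the single missing value is filled, while the run of three missing values exceeds the limit and stays missing.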
Data are then ordinated using a Principal Coordinates Analysis (PCO) to reduce dimensionality and allow the combination of a variety of data types. The number of PCOs to retain for analyses can be selected using PCOselect() (see the vignette for different methods of PCO axes selection).
    dolphin_pco <- svdPCO(dolphin_data, metric = "gower")
    # Select PCOs with variance > 0.05:
    PCOs <- PCOselect(dolphin_pco, method = "variance", cutoff = .05)
    PCOs
    #> A `regions_pco_select` object
    #> - PCO scores selected: 1, 2
    #> - Method: variance (cutoff: 0.05)
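For readers unfamiliar with the ordination step, the idea can be sketched in base R. This is not the package's implementation: it assumes purely numeric variables, uses stats::cmdscale() rather than svdPCO(), and runs on a small hypothetical data frame named toy.

```r
# Hypothetical toy measurements: 6 elements, 3 numeric variables
toy <- data.frame(
  Lc = c(1.3, 1.5, 1.6, 1.7, 2.4, 2.6),
  Wc = c(3.4, 3.7, 3.6, 3.8, 4.5, 4.4),
  Hc = c(2.0, 2.1, 2.3, 2.2, 3.0, 3.1)
)

# Gower distance for numeric variables: scale each variable by its range,
# then average the absolute differences across variables
ranges <- sapply(toy, function(v) diff(range(v)))
scaled <- sweep(as.matrix(toy), 2, ranges, "/")
gower <- dist(scaled, method = "manhattan") / ncol(toy)

# Principal coordinates analysis (classical multidimensional scaling)
pco <- cmdscale(gower, k = 2, eig = TRUE)

# Proportion of variance per axis, using positive eigenvalues only
eig <- pco$eig[pco$eig > 0]
prop_var <- eig / sum(eig)

# Keep the axes explaining more than 5% of the variance
selected <- which(prop_var > 0.05)
selected
```

The 5% threshold mirrors the cutoff used with PCOselect() above; how the package handles negative eigenvalues is not covered here.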
The calcregions() function fits all possible combinations of segmented linear regressions from 1 region (no breakpoint) to the number of regions specified in the noregions argument. In this example, up to 5 regions (4 breakpoints) will be fitted along the backbone; however, there is no upper limit on this value and you can fit as many regions as you would like. For this example, regions will be fitted with a minimum of 3 vertebrae per region (minvert = 3) and using a continuous fit (cont = TRUE) (see vignette("MorphoRegions") or the MorphoRegions website for details about fitting options).
    regionresults <- calcregions(dolphin_pco, scores = PCOs, noregions = 5,
                                 minvert = 3, cont = TRUE,
                                 exhaus = TRUE, verbose = FALSE)
    regionresults
    #> A `regions_results` object
    #> - number of PCOs used: 2
    #> - number of regions: 1, 2, 3, 4, 5
    #> - model type: continuous
    #> - min vertebrae per region: 3
    #> - total models saved: 28810
    #> Use `summary()` to examine summaries of the fitting process.
For each given number of regions, the best fit is selected by minimizingthe residual sum of squares (sumRSS):
    models <- modelselect(regionresults)
    models
    #> Regions BP 1 BP 2 BP 3 BP 4 sumRSS RSS.1 RSS.2
    #>       1    .    .    .    .  1.898 1.456 0.441
    #>       2   26    .    .    .  0.413 0.105 0.308
    #>       3   23   29    .    .  0.147 0.092 0.055
    #>       4   23   30   40    .  0.073 0.034 0.040
    #>       5   23   27   34   40  0.046 0.026 0.020
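The principle of choosing a breakpoint by minimizing the residual sum of squares can be illustrated with a small base R sketch. It is deliberately simplified compared with calcregions(): a single response variable, exactly two regions, and a discontinuous fit. The toy series and the helper two_region_rss are hypothetical.

```r
# Hypothetical toy series: 20 elements with an abrupt change after position 10
pos <- 1:20
score <- ifelse(pos <= 10, 0.5 * pos, 18 - pos) + 0.01 * (-1)^pos

minvert <- 3  # minimum number of elements per region

# Candidate breakpoints: each value is the last element of the first region
candidates <- minvert:(length(pos) - minvert)

# Summed RSS of two separate linear regressions split at breakpoint `bp`
two_region_rss <- function(bp) {
  first <- pos <= bp
  rss_first <- sum(resid(lm(score[first] ~ pos[first]))^2)
  rss_second <- sum(resid(lm(score[!first] ~ pos[!first]))^2)
  rss_first + rss_second
}

# Exhaustive search over all candidate breakpoints
rss <- vapply(candidates, two_region_rss, numeric(1))
best_bp <- candidates[which.min(rss)]
best_bp
```

With several PCO axes, the same logic would sum the RSS over axes, which is what the sumRSS column above suggests; more regions multiply the number of breakpoint combinations to search.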
The best overall model (best number of regions) is then selected by ordering models from the best fit (top row) to the worst fit (last row) using either the AICc or BIC criterion:
    supp <- modelsupport(models)
    supp
    #> - Model support (AICc)
    #> Regions BP 1 BP 2 BP 3 BP 4 sumRSS     AICc deltaAIC model_lik Ak_weight
    #>       5   23   27   34   40  0.046 -556.036    0.000         1         1
    #>       4   23   30   40    .  0.073 -528.096   27.940         0         0
    #>       3   23   29    .    .  0.147 -480.952   75.084         0         0
    #>       2   26    .    .    .  0.413 -405.787  150.250         0         0
    #>       1    .    .    .    .  1.898 -290.769  265.267         0         0
    #> Region score: 5
    #>
    #> - Model support (BIC)
    #> Regions BP 1 BP 2 BP 3 BP 4 sumRSS      BIC deltaBIC model_lik BIC_weight
    #>       5   23   27   34   40  0.046 -526.559    0.000         1          1
    #>       4   23   30   40    .  0.073 -502.645   23.914         0          0
    #>       3   23   29    .    .  0.147 -460.321   66.238         0          0
    #>       2   26    .    .    .  0.413 -390.668  135.891         0          0
    #>       1    .    .    .    .  1.898 -281.774  244.784         0          0
    #> Region score: 5
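The deltaAIC, model_lik, and Ak_weight columns are consistent with the standard Akaike weight calculation. The sketch below recomputes them from the AICc values printed above; it assumes the package uses the textbook formulas, which the source does not state explicitly.

```r
# AICc values copied from the output above, named by number of regions
aicc <- c(`5` = -556.036, `4` = -528.096, `3` = -480.952,
          `2` = -405.787, `1` = -290.769)

# Difference to the best (lowest) AICc
delta <- aicc - min(aicc)

# Relative likelihood of each model and normalized Akaike weights
model_lik <- exp(-delta / 2)
weights <- model_lik / sum(model_lik)

round(cbind(delta, model_lik, weights), 3)
```

Because the second-best model is almost 28 AICc units behind, its weight rounds to 0 and essentially all support goes to the 5-region model.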
Here, for both criteria, the best model is the 5-region model with breakpoints at vertebrae 23, 27, 34, and 40. The breakpoint value corresponds to the last vertebra included in the region, so the first region here comprises vertebrae 8 to 23 inclusive and the second region comprises vertebrae 24 to 27. The function also returns the region score, a continuous value reflecting the level of regionalization while accounting for uncertainty in the best number of regions (see vignette("MorphoRegions") or the MorphoRegions website for more details).
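The "breakpoint is the last vertebra of its region" convention can be checked with a short base R sketch. It does not use any MorphoRegions function; it simply maps a few vertebral positions to region numbers using the breakpoints reported above.

```r
# Breakpoints of the best model (last vertebra of regions 1 to 4)
bps <- c(23, 27, 34, 40)

# A few vertebral positions around each breakpoint
vertebra <- c(8, 23, 24, 27, 28, 34, 35, 40, 41)

# left.open = TRUE makes intervals of the form (lower, upper],
# so a vertebra equal to a breakpoint stays in the earlier region
region <- findInterval(vertebra, bps, left.open = TRUE) + 1

data.frame(vertebra, region)
```

Vertebrae 23 and 27 fall in regions 1 and 2 respectively, while 24 and 28 open regions 2 and 3.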
Results of the best model (or any other model) can be visualized eitheras a scatter plot or as a vertebral map.
The scatter plot shows the PCO score (here for PCO 1 and 2) of each vertebra along the backbone (gray dots) and the segmented linear regressions (cyan line) of the model to plot. Breakpoints are shown by dotted orange lines.
    plotsegreg(dolphin_pco, scores = 1:2, modelsupport = supp,
               criterion = "bic", model = 1)
In the vertebral map plot, each vertebra is represented by a rectangle color-coded according to the region to which it belongs. Vertebrae not included in the analysis (here vertebrae 1 to 7) are represented by gray rectangles and can be removed using dropNA = TRUE.
    plotvertmap(dolphin_pco, name = "Dolphin", modelsupport = supp,
                criterion = "bic", model = 1)
    plotvertmap(dolphin_pco, name = "Dolphin", modelsupport = supp,
                criterion = "bic", model = 1, dropNA = TRUE)
The variability around breakpoint positions can be calculated using calcBPvar() and then displayed on the vertebral map. The weighted average position of each breakpoint is shown by the black dot and the weighted variance is illustrated by the horizontal black bar.
    bpvar <- calcBPvar(regionresults, noregions = 5, pct = 0.1, criterion = "bic")
    plotvertmap(dolphin_pco, name = "Dolphin", dropNA = TRUE, bpvar = bpvar)
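As a rough illustration of what a weighted average and weighted variance of a breakpoint mean, consider the base R sketch below. The positions and weights are hypothetical, and the variance formula shown (weighted mean of squared deviations) is an assumption; calcBPvar() may weight or normalize differently.

```r
# Hypothetical positions of one breakpoint across three top models,
# with hypothetical model weights (e.g., derived from BIC)
bp_pos <- c(22, 23, 24)
w <- c(0.25, 0.50, 0.25)

# Weighted average position of the breakpoint
w_mean <- weighted.mean(bp_pos, w)

# Weighted variance around that average (assumed definition)
w_var <- sum(w * (bp_pos - w_mean)^2) / sum(w)

c(mean = w_mean, variance = w_var)
```

A breakpoint on which the top models agree would have a variance near 0 and therefore a very short bar on the vertebral map.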
To cite MorphoRegions, please use:
    citation("MorphoRegions")



