Calculate distances between phylogenetic trees in R


ms609/TreeDist


Project Status: The project has reached a stable, usable state but is no longer being actively developed; support/maintenance will be provided as time allows.

'TreeDist' is an R package that implements a suite of metrics that quantify the topological distance between pairs of unweighted phylogenetic trees. It also includes a simple 'Shiny' application to allow the visualization of distance-based tree spaces, and functions to calculate the information content of trees and splits.

'TreeDist' primarily employs metrics in the category of 'generalized Robinson–Foulds distances': they are based on comparing splits (bipartitions) between trees, and thus reflect the relationship data within trees, with no reference to branch lengths.

Generalized RF distances

The Robinson-Foulds distance simply tallies the number of non-trivial splits (sometimes inaccurately termed clades, nodes or edges) that occur in one tree but not the other: each split without a perfectly identical counterpart in the other tree contributes one point to the distance score, however similar or different the two splits are. By overlooking potential similarities between almost-identical splits, this conservative approach has undesirable properties.
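To make the counting concrete, here is a minimal base-R sketch that does not use 'TreeDist' itself. The two six-leaf trees, the hand-written split lists and the helper CanonicalSplit() are illustrative assumptions, not part of the package; each split is written as the set of leaves on one side of an internal edge.

```r
# Illustrative only: trees are written by hand as lists of non-trivial splits.
leaves <- c("a", "b", "c", "d", "e", "f")

# Hypothetical helper: label a split by the side that excludes the first leaf,
# so that ab|cdef and cdef|ab receive the same label.
CanonicalSplit <- function(side, allLeaves = leaves) {
  if (allLeaves[1] %in% side) {
    side <- setdiff(allLeaves, side)
  }
  paste(sort(side), collapse = "")
}

# Tree 1: ((a, b), c, (d, (e, f)));  Tree 2: ((a, c), b, (d, (e, f)))
tree1Splits <- list(c("a", "b"), c("a", "b", "c"), c("e", "f"))
tree2Splits <- list(c("a", "c"), c("a", "b", "c"), c("e", "f"))

splits1 <- vapply(tree1Splits, CanonicalSplit, character(1))
splits2 <- vapply(tree2Splits, CanonicalSplit, character(1))

# Robinson-Foulds distance: splits present in exactly one of the two trees
rfDistance <- length(setdiff(splits1, splits2)) +
  length(setdiff(splits2, splits1))
rfDistance  # 2: ab|cdef and ac|bdef are each unmatched
```

The splits ab|cdef and ac|bdef differ only in the placement of two leaves, yet each counts as a full mismatch, which is the behaviour the generalized metrics are designed to soften.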

'Generalized' RF metrics generate matchings that pair splits in one tree with similar splits in the other. Each pair of splits is assigned a similarity score; the sum of these scores in the optimal matching then quantifies the similarity between two trees.

Different ways of calculating the similarity between a pair of splits lead to different tree distance metrics, implemented in the functions below:

  • MutualClusteringInfo(), SharedPhylogeneticInfo()

    Smith (2020) scores matchings based on the amount of information that one partition contains about the other. The Shared Phylogenetic Information assigns zero similarity to split pairs that cannot both exist on a single tree; the Mutual Clustering Information metric is more forgiving, and exhibits more desirable behaviour; it is the recommended metric for tree comparison. (Its complement, ClusteringInfoDistance(), returns a tree distance.)

    Introduction to the Clustering Information Distance

  • NyeSimilarity()

    Nye et al. (2006) score matchings according to the size of the largest split that is consistent with both of them, normalized against the Jaccard index. This approach is extended by Böcker et al. (2013) with the Jaccard-Robinson-Foulds metric (function JaccardRobinsonFoulds()).

  • MatchingSplitDistance()

    Bogdanowicz and Giaro (2012) and Lin et al. (2012) independently proposed counting the number of 'mismatched' leaves in a pair of splits. MatchingSplitInfoDistance() provides an information-based equivalent (Smith 2020).
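As a rough sketch of how the functions listed above might be called on a pair of trees: the two eight-leaf trees are generated with BalancedTree() and PectinateTree() from the 'TreeTools' package, which is an assumption made purely for illustration; any pair of 'phylo' objects with the same leaf labels should work in their place.

```r
library("TreeDist")

# Two eight-leaf example trees from 'TreeTools' (chosen for illustration only)
tree1 <- TreeTools::BalancedTree(8)
tree2 <- TreeTools::PectinateTree(8)

# Similarity measures: higher values indicate more similar trees
MutualClusteringInfo(tree1, tree2)
SharedPhylogeneticInfo(tree1, tree2)
NyeSimilarity(tree1, tree2)

# Distance measures: zero for identical trees
ClusteringInfoDistance(tree1, tree2)
ClusteringInfoDistance(tree1, tree2, normalize = TRUE)  # rescaled to 0..1
JaccardRobinsonFoulds(tree1, tree2)
MatchingSplitDistance(tree1, tree2)
MatchingSplitInfoDistance(tree1, tree2)
```

The values are left unprinted here because they depend on the trees supplied; the package documentation describes the units and normalization options of each function.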

The package also implements the variation of the path distance proposed by Kendall and Colijn (2016) (function KendallColijn()), approximations of the Nearest-Neighbour Interchange (NNI) distance (function NNIDist(); following Li et al. (1996)), and calculates the size (function MASTSize()) and information content (function MASTInfo()) of the Maximum Agreement Subtree.

For an implementation of the Tree Bisection and Reconnection (TBR) distance, see the package 'TBRDist'.
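A short sketch of these additional functions, called with their default arguments. The example trees again come from 'TreeTools', which is an assumption for illustration rather than a requirement.

```r
library("TreeDist")

tree1 <- TreeTools::BalancedTree(8)
tree2 <- TreeTools::PectinateTree(8)

KendallColijn(tree1, tree2)  # Kendall-Colijn variant of the path distance
NNIDist(tree1, tree2)        # approximations (bounds) of the NNI distance
MASTSize(tree1, tree2)       # number of leaves in the maximum agreement subtree
MASTInfo(tree1, tree2)       # information content of that subtree
```

Each function accepts further arguments that are not shown here; consult the function help pages before relying on the defaults.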

Installation

Install and load the library from CRAN as follows:

install.packages('TreeDist')
library('TreeDist')

You can install the development version of the package with:

if(!require("curl")) install.packages("curl")
if(!require("remotes")) install.packages("remotes")
remotes::install_github("ms609/TreeDist")

Tree space analysis

Construct tree spaces and readily visualize projected landscapes, avoiding common analytical pitfalls (Smith, 2022), using the inbuilt graphical user interface (Shiny GUI):

TreeDist::MapTrees()


Serious analysts should consult the vignette for a command-line interface.
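For orientation, a command-line mapping might start along the following lines. This is a minimal sketch rather than the vignette's workflow: the random trees from ape::rmtree() stand in for real results, and the choice of two dimensions and of classical multidimensional scaling (cmdscale()) are assumptions made for brevity.

```r
library("TreeDist")

# Twenty random eight-leaf trees; a stand-in for a real set of trees
set.seed(1)
trees <- ape::rmtree(20, 8)

# Pairwise distances between all trees, returned as a 'dist' object
distances <- ClusteringInfoDistance(trees)

# Project into two dimensions with classical multidimensional scaling
mapping <- cmdscale(distances, k = 2)

plot(mapping, asp = 1, xlab = "Axis 1", ylab = "Axis 2", pch = 16)
```

A two-dimensional projection can misrepresent the original distances, which is one of the pitfalls the vignette addresses, so a plot like this should be checked before it is interpreted.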


See also

Other R packages implementing tree distance functions include:

  • 'ape':
    • cophenetic.phylo(): Cophenetic distance
    • dist.topo(): Path (topological) distance, Robinson-Foulds distance.
  • 'phangorn':
    • treedist(): Path, Robinson-Foulds and approximate SPR distances.
  • 'Quartet': Triplet and Quartet distances, using the tqDist algorithm.
  • 'TBRDist': TBR and SPR distances on unrooted trees, using the 'uspr' C library.
  • 'treespace': Kendall-Colijn distance and tree space visualizations.
  • 'distory' (unmaintained): Geodesic distance

References

Please note that the 'TreeDist' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

