| Nexus format | |
|---|---|
| Filename extensions | usually .nex or .nxs |
| Internet media type | application/octet-stream |
| Magic number | '#NEXUS\n' |
| Developed by | Maddison DR, Swofford DL, Maddison WP |
| Initial release | December 1997 |
| Type of format | bioinformatics |
| Open format? | Yes |
The extensible NEXUS file format is widely used in phylogenetics, evolutionary biology, and bioinformatics. It stores information about taxa, morphological character states, DNA and protein sequence alignments, distances, and phylogenetic trees.[1] The NEXUS format also allows the storage of data that can facilitate analyses, such as sets of characters or taxa. Many popular phylogenetic programs, including PAUP*,[2] MrBayes,[3] Mesquite,[4] MacClade,[5] and SplitsTree,[6] use this format. NEXUS file names typically have the extension .nxs or .nex.
A NEXUS file consists of a fixed header, #NEXUS, followed by multiple blocks. Each block starts with BEGIN block_name; and ends with END;. The keywords are case-insensitive. Comments are enclosed in square brackets [...].[7] Each of the pre-defined types of blocks may appear only once.
| Block Name | Description |
|---|---|
| TAXA | Specifies the OTUs (operational taxonomic units) in the data set |
| CHARACTERS | Specifies the character data (e.g., homologous morphological characters or a multiple sequence alignment) |
| DATA | Equivalent to a CHARACTERS block that includes the NewTaxa subcommand in the Dimensions command |
| TREES | Stores trees in Newick format |
| DISTANCES | Stores distance matrices |
| SETS | Assigns names to sets of characters (CHARSET) or OTUs (TAXSET) |
| ASSUMPTIONS | Assumptions about the data or directions regarding data treatment (e.g., the character exclusion status) |
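
The block structure described above can be illustrated with a short Python sketch. It is not a complete NEXUS parser: the function names `strip_comments` and `split_blocks` are hypothetical, the code assumes that bracketed comments may nest, and it ignores edge cases such as square brackets or the word END inside quoted tokens.

```python
import re

# Block types listed in the table above; each may appear only once.
PREDEFINED = {"TAXA", "CHARACTERS", "DATA", "TREES",
              "DISTANCES", "SETS", "ASSUMPTIONS"}

def strip_comments(text: str) -> str:
    """Drop everything inside [...] comments (assumed to nest)."""
    kept = []
    depth = 0
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)

def split_blocks(text: str) -> list[tuple[str, str]]:
    """Return (BLOCK_NAME, body) pairs in file order."""
    text = strip_comments(text)
    if not text.lstrip().upper().startswith("#NEXUS"):
        raise ValueError("missing #NEXUS header")
    # Keywords are case-insensitive; bodies may span several lines.
    pattern = re.compile(r"\bBEGIN\s+(\w+)\s*;(.*?)\bEND\s*;",
                         re.IGNORECASE | re.DOTALL)
    blocks = []
    seen = set()
    for match in pattern.finditer(text):
        name = match.group(1).upper()
        if name in PREDEFINED and name in seen:
            raise ValueError(f"block {name} appears more than once")
        seen.add(name)
        blocks.append((name, match.group(2).strip()))
    return blocks
```

Calling `split_blocks` on the text of a file returns the block names in upper case together with their raw command text, which a fuller parser would then split on semicolons.
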
The following example NEXUS file uses the TAXA, CHARACTERS, and TREES blocks:

    #NEXUS
    Begin TAXA;
      Dimensions ntax=4;
      TaxLabels Alpha Beta Gamma Delta;
    End;

    Begin CHARACTERS;
      Dimensions nchar=15;
      Format datatype=dna missing=? gap=- matchchar=.;
      Matrix
        [ When a position is a "matchchar", it means that it is the same as the first entry at the same position. ]
        Alpha   atgctagctagctcg
        Beta    ......??...-.a.
        Gamma   ...t.......-.g. [ same as atgttagctag-tgg ]
        Delta   ...t.......-.a.
      ;
    End;

    Begin TREES;
      Tree tree1 = ((Alpha,Beta),Gamma,Delta);
    END;
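
The matchchar convention used in the example matrix can be made concrete with a small Python sketch. The helper name `expand_matchchar` is hypothetical, and the code assumes that the rows have already been read into (label, sequence) pairs with comments removed. Each match character is replaced by the character at the same position in the first row, while the missing (?) and gap (-) symbols are left untouched.

```python
def expand_matchchar(rows, matchchar="."):
    """Replace matchchar symbols with the first row's character at that position.

    rows: list of (label, sequence) pairs in matrix order.
    """
    if not rows:
        return []
    reference = rows[0][1]
    expanded = []
    for label, seq in rows:
        if len(seq) != len(reference):
            raise ValueError(f"{label}: expected {len(reference)} characters")
        resolved = "".join(
            ref if ch == matchchar else ch
            for ch, ref in zip(seq, reference)
        )
        expanded.append((label, resolved))
    return expanded

# Rows copied from the CHARACTERS block of the example file.
rows = [
    ("Alpha", "atgctagctagctcg"),
    ("Beta",  "......??...-.a."),
    ("Gamma", "...t.......-.g."),
    ("Delta", "...t.......-.a."),
]

for label, seq in expand_matchchar(rows):
    print(label, seq)
```

For Gamma this yields atgttagctag-tgg, which agrees with the comment in the example matrix.
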