Handling taxonomic lists
ropensci/taxlist
taxlist is a package designed to handle and assess taxonomic lists in R, providing an object class and functions in the S4 language. The homonymous object class taxlist was originally designed as a module for taxa recorded in vegetation-plot observations (see vegtable), but became an independent object able to contain not only lists of species but also synonymy, hierarchical taxonomy, and functional traits (attributes of taxa).
The main aim of this package is to keep taxonomic lists consistent (a set of rules is checked by the function validObject()), to enable the re-arrangement of such data, and to statistically assess functional traits and other attributes, for instance taxonomy itself (the function tax2traits() sets taxonomic information as traits).
While this package only includes a function for the import of taxonomic lists from Turboveg, almost any data source can be structured as a taxlist object, as long as the information is imported into data frames in an R session and the consistency rules are respected (validity).
The use of taxlist is recommended for people cleaning raw data before importing it into relational databases, either in the context of taxonomic work or biodiversity assessments. The other way around, people with relational databases or clean and structured taxonomic lists may use taxlist as the recipient of this information in R sessions in order to carry out further statistical assessments. Finally, the function print_name() makes taxlist suitable for use in interactive documents written with rmarkdown and knitr (e.g. reports, manuscripts and check-lists).
The structure of taxlist objects is inspired by the structure of data handled by Turboveg and relational databases.

Figure: Relational model for taxlist objects (see Alvarez & Luebert 2018).
This package is available from the Comprehensive R Archive Network (CRAN) and can be directly installed within an R session:
install.packages("taxlist", dependencies = TRUE)
Alternatively, the current development version is available from GitHub and can be installed using the package devtools:
library(devtools)
install_github("ropensci/taxlist", build_vignette = TRUE)
A vignette introducing the work with taxlist is installed with this package and can be accessed with the following command in your R session:
vignette("taxlist-intro")

Objects can be built step by step as in the following example. As a reference, we will use the “Ferns of Chile” (original in Spanish: “Helechos de Chile”) by Gunkel (1984). We will create an empty taxlist object using the function new():
library(taxlist)
#>
#> Attaching package: 'taxlist'
#> The following objects are masked from 'package:base':
#>
#>     levels, levels<-, print

Fern <- new("taxlist")
Fern
#> object size: 5.1 Kb
#> validation of 'taxlist' object: TRUE
#>
#> number of taxon usage names: 0
#> number of taxon concepts: 0
#> trait entries: 0
#> number of trait variables: 0
#> taxon views: 0
Then we have to set the respective taxonomic ranks. In this case, the levels have to be provided from the lowest to the highest hierarchical level:
levels(Fern) <- c("variety", "species", "genus")
For convenience, we start inserting taxa with their respective names in a top-down direction. We will use the function add_concept() to add a new taxon. Note that the arguments TaxonName, AuthorName, and Level are used to provide the name of the taxon, the authority of the name and the taxonomic rank, respectively.
Fern <- add_concept(
  taxlist = Fern, TaxonName = "Asplenium", AuthorName = "L.",
  Level = "genus"
)
summary(Fern, "all")
#> ------------------------------
#> concept ID: 1
#> view ID: none
#> level: genus
#> parent: none
#>
#> # accepted name:
#> 1 Asplenium L.
#> ------------------------------
As you can see, the inserted genus got the concept ID 1 (see TaxonConceptID in the previous figure). To insert a species of this genus, we again use the function add_concept(), but this time we will also provide the ID of the parent taxon with the argument Parent.
Fern <- add_concept(
  Fern, TaxonName = "Asplenium obliquum", AuthorName = "Forster",
  Level = "species", Parent = 1
)
summary(Fern, "Asplenium obliquum")
#> ------------------------------
#> concept ID: 2
#> view ID: none
#> level: species
#> parent: 1 Asplenium L.
#>
#> # accepted name:
#> 2 Asplenium obliquum Forster
#> ------------------------------
In the same way, we can now add two varieties of the inserted species:
Fern <- add_concept(
  Fern,
  TaxonName = c(
    "Asplenium obliquum var. sphenoides",
    "Asplenium obliquum var. chondrophyllum"
  ),
  AuthorName = c(
    "(Kunze) Espinosa",
    "(Bertero apud Colla) C. Christense & C. Skottsberg"
  ),
  Level = "variety",
  Parent = c(2, 2)
)
You may have noticed that the function summary() serves two purposes: on the one hand, it displays meta-information for the whole taxlist object; on the other hand, it shows details of the taxa included in the object. In the latter case, adding the keyword "all" as second argument makes the summary show detailed information for every taxon included in the object.
Fern
#> object size: 6.2 Kb
#> validation of 'taxlist' object: TRUE
#>
#> number of taxon usage names: 4
#> number of taxon concepts: 4
#> trait entries: 0
#> number of trait variables: 0
#> taxon views: 0
#>
#> concepts with parents: 3
#> concepts with children: 2
#>
#> concepts with rank information: 4
#> concepts without rank information: 0
#>
#> genus: 1
#> species: 1
#> variety: 2

summary(Fern, "all")
#> ------------------------------
#> concept ID: 1
#> view ID: none
#> level: genus
#> parent: none
#>
#> # accepted name:
#> 1 Asplenium L.
#> ------------------------------
#> concept ID: 2
#> view ID: none
#> level: species
#> parent: 1 Asplenium L.
#>
#> # accepted name:
#> 2 Asplenium obliquum Forster
#> ------------------------------
#> concept ID: 3
#> view ID: none
#> level: variety
#> parent: 2 Asplenium obliquum Forster
#>
#> # accepted name:
#> 3 Asplenium obliquum var. sphenoides (Kunze) Espinosa
#> ------------------------------
#> concept ID: 4
#> view ID: none
#> level: variety
#> parent: 2 Asplenium obliquum Forster
#>
#> # accepted name:
#> 4 Asplenium obliquum var. chondrophyllum (Bertero apud Colla) C. Christense & C. Skottsberg
#> ------------------------------
A feature implemented in version 0.2.1 is the function indented_list(), which provides a better display of the hierarchical structure of taxlist objects.
indented_list(Fern)
#> Asplenium L.
#>  Asplenium obliquum Forster
#>   Asplenium obliquum var. sphenoides (Kunze) Espinosa
#>   Asplenium obliquum var. chondrophyllum (Bertero apud Colla) C. Christense & C. Skottsberg
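The idea behind such an indented display can be sketched in base R. This is only an illustration under stated assumptions, not the implementation of indented_list(): the table `fern` and the helper `concept_depth()` are hypothetical names, and the rows are assumed to be already sorted so that every parent precedes its children.

```r
# Toy table with concept IDs, names and parent IDs (NA marks a top-level taxon).
fern <- data.frame(
  TaxonConceptID = 1:4,
  TaxonName = c(
    "Asplenium", "Asplenium obliquum",
    "Asplenium obliquum var. sphenoides",
    "Asplenium obliquum var. chondrophyllum"
  ),
  Parent = c(NA, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of ancestors of each concept (0 for top-level taxa).
# Assumes the parent-child relationships contain no cycles.
concept_depth <- function(id, parent) {
  vapply(id, function(i) {
    depth <- 0L
    p <- parent[match(i, id)]
    while (!is.na(p)) {
      depth <- depth + 1L
      p <- parent[match(p, id)]
    }
    depth
  }, integer(1))
}

depth <- concept_depth(fern$TaxonConceptID, fern$Parent)

# One space of indentation per hierarchical level.
indented <- paste0(strrep(" ", depth), fern$TaxonName)
writeLines(indented)
```

The depth of a taxon is simply the number of steps needed to reach a taxon without a parent, so the same helper could be reused for any table that follows the Parent convention.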
A more convenient way is to create an object from a data frame including both the taxon concepts with their accepted names and the taxonomic ranks with parent-child relationships. For the previous example, the required data frame looks like this:
Fern_df <- data.frame(
  TaxonConceptID = 1:4,
  TaxonUsageID = 1:4,
  TaxonName = c(
    "Asplenium", "Asplenium obliquum",
    "Asplenium obliquum var. sphenoides",
    "Asplenium obliquum var. chondrophyllum"
  ),
  AuthorName = c(
    "L.", "Forster", "(Kunze) Espinosa",
    "(Bertero apud Colla) C. Christense & C. Skottsberg"
  ),
  Level = c("genus", "species", "variety", "variety"),
  Parent = c(NA, 1, 2, 2),
  stringsAsFactors = FALSE
)
Fern_df
#>   TaxonConceptID TaxonUsageID                              TaxonName
#> 1              1            1                              Asplenium
#> 2              2            2                     Asplenium obliquum
#> 3              3            3     Asplenium obliquum var. sphenoides
#> 4              4            4 Asplenium obliquum var. chondrophyllum
#>                                           AuthorName   Level Parent
#> 1                                                 L.   genus     NA
#> 2                                            Forster species      1
#> 3                                   (Kunze) Espinosa variety      2
#> 4 (Bertero apud Colla) C. Christense & C. Skottsberg variety      2
This kind of table can be written in a spreadsheet application and imported into your R session. The first two columns correspond to the IDs of the taxon concept and the respective accepted name. They can be custom IDs but are restricted to integers in taxlist. For the use of the function df2taxlist(), the first two columns are mandatory. Also note that the column Parent points to the concept ID of the respective parent taxon. To get the object, we just use df2taxlist(), indicating the sequence of taxonomic ranks in the argument levels.
Fern2 <- df2taxlist(Fern_df, levels = c("variety", "species", "genus"))
#> No values for 'AcceptedName' in 'x'. all names will be considered as accepted names.
Fern2
#> object size: 6.2 Kb
#> validation of 'taxlist' object: TRUE
#>
#> number of taxon usage names: 4
#> number of taxon concepts: 4
#> trait entries: 0
#> number of trait variables: 0
#> taxon views: 0
#>
#> concepts with parents: 3
#> concepts with children: 2
#>
#> concepts with rank information: 4
#> concepts without rank information: 0
#>
#> genus: 1
#> species: 1
#> variety: 2
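When the table comes from a spreadsheet, it may help to check it before handing it to df2taxlist(). The following is a minimal sketch, assuming the sheet was exported as CSV with the column names used above; the inline text `csv_text` stands in for a real file, and the object name `fern_csv` is made up for this illustration.

```r
# Inline CSV standing in for a spreadsheet export (an empty Parent field means "no parent").
csv_text <- "TaxonConceptID,TaxonUsageID,TaxonName,AuthorName,Level,Parent
1,1,Asplenium,L.,genus,
2,2,Asplenium obliquum,Forster,species,1
3,3,Asplenium obliquum var. sphenoides,(Kunze) Espinosa,variety,2"

fern_csv <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Checks derived from the requirements described above:
# the first two columns are the ID columns, IDs are integers,
# and every non-missing Parent refers to an existing concept ID.
stopifnot(
  identical(names(fern_csv)[1:2], c("TaxonConceptID", "TaxonUsageID")),
  is.integer(fern_csv$TaxonConceptID),
  is.integer(fern_csv$TaxonUsageID),
  all(is.na(fern_csv$Parent) | fern_csv$Parent %in% fern_csv$TaxonConceptID)
)

# With taxlist attached, the table could then be converted as shown above:
# Fern3 <- df2taxlist(fern_csv, levels = c("variety", "species", "genus"))
```

For a real file, `read.csv("path/to/file.csv")` would replace the `text` argument; the checks stay the same.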
The package taxlist shares similar objectives with the package taxa, but uses a different approach to object-oriented programming in R: taxlist applies S4 while taxa uses R6. Additionally, taxa is rather developer-oriented, while taxlist is rather a user-oriented package.
In the following cases you may prefer to use taxlist:
- When you need an automatic check on the consistency of information regarding taxonomic ranks and parent-child relationships (parents have to be of a higher rank than children), as well as non-duplicated combinations of names and authors. Such checks are done by the function validObject().
- When you foresee statistical assessments on taxonomic diversity or taxon properties (chorology, conservation status, functional traits, etc.).
- When you seek to produce documents using rmarkdown, for instance guide books or check-lists. Also in article manuscripts, taxonomic names referring to a taxon concept can easily be formatted by the function print_name().
- When importing taxonomic lists from databases stored in Turboveg 2.
- When you seek to implement the package vegtable for handling and assessing biodiversity records, especially vegetation-plot data. In that case, taxonomic lists will be formatted by taxlist as a slot within a vegtable object.
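To make the first point more concrete, the two rules named there (parents of a higher rank than their children, and no duplicated combinations of name and author) can be sketched on a plain data frame. This is a hedged illustration in base R and not the code run by validObject(); the helper `check_taxa()` and the objects `taxa` and `broken` are hypothetical.

```r
# Ranks from lowest to highest, in the same order as used for levels().
ranks <- c("variety", "species", "genus")

taxa <- data.frame(
  TaxonConceptID = 1:4,
  TaxonName = c(
    "Asplenium", "Asplenium obliquum",
    "Asplenium obliquum var. sphenoides",
    "Asplenium obliquum var. chondrophyllum"
  ),
  AuthorName = c(
    "L.", "Forster", "(Kunze) Espinosa",
    "(Bertero apud Colla) C. Christense & C. Skottsberg"
  ),
  Level = c("genus", "species", "variety", "variety"),
  Parent = c(NA, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper returning one logical value per rule.
check_taxa <- function(taxa, ranks) {
  rank_pos <- match(taxa$Level, ranks)
  parent_pos <- rank_pos[match(taxa$Parent, taxa$TaxonConceptID)]
  list(
    # TRUE only if every taxon with a parent has a parent of strictly higher rank;
    # isTRUE() turns NA (e.g. a parent ID that does not exist) into FALSE.
    ranks_ok = isTRUE(all(is.na(taxa$Parent) | parent_pos > rank_pos)),
    # TRUE if no combination of name and author occurs twice.
    names_ok = !any(duplicated(taxa[, c("TaxonName", "AuthorName")]))
  )
}

valid <- check_taxa(taxa, ranks)

# A deliberately broken copy: the genus now has a species as parent.
broken <- taxa
broken$Parent[1] <- 2
invalid <- check_taxa(broken, ranks)

str(valid)
str(invalid)
```

Encoding the ranks as positions in an ordered vector keeps the comparison independent of the actual rank names, so the same check works for any sequence of levels.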
As mentioned before, taxlist objects can also be used for writing rmarkdown documents (see this poster). For instance, you can load your objects at the beginning of the document with a hidden chunk:
```{r echo=FALSE, message=FALSE, warning=FALSE}
library(taxlist)
data(Easplist)
```
To mention a taxon, you can write in-line code such as `r print_name(Easplist, 206)`, which will insert Cyperus papyrus L. in your document (note that the number is the ID of the taxon concept in Easplist). For a second mention of the same species, you can then use `r print_name(Easplist, 206, second_mention=TRUE)`, which will insert C. papyrus L. in your text.
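The abbreviation used for a second mention can be imitated with a regular expression. The sketch below is only meant to show the transformation; it is not how print_name() is implemented, and `abbreviate_genus()` is a hypothetical helper that assumes the genus is a single capitalized word made of unaccented letters.

```r
# Hypothetical helper: replace the leading genus name by its initial followed by a dot.
# Names consisting of a single word (no space) are returned unchanged.
abbreviate_genus <- function(name) {
  sub("^([A-Z])[a-z]+ ", "\\1. ", name)
}

second_mention <- abbreviate_genus("Cyperus papyrus")
genus_only <- abbreviate_genus("Cyperus")
print(second_mention)
print(genus_only)
```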
Information located in the slot taxonTraits is suitable for statistical assessments. For instance, in the installed object Easplist a column called life_form includes a classification of macrophytes into different life forms. To know the frequency of these life forms in Easplist, we can use the function count_taxa():
# how many taxa in 'Easplist'
count_taxa(Easplist)
#> [1] 3887

# frequency of life forms
count_taxa(~life_form, Easplist)
#>             life_form taxa_count
#> 1    acropleustophyte          8
#> 2         chamaephyte         25
#> 3      climbing_plant         25
#> 4  facultative_annual         20
#> 5     obligate_annual        114
#> 6        phanerophyte         26
#> 7    pleustohelophyte          8
#> 8          reed_plant         14
#> 9       reptant_plant         19
#> 10      tussock_plant         52
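For readers who want to see what such a frequency table amounts to, here is a rough base R equivalent on a toy traits table. It is a sketch under assumptions: the data frame `traits` is invented for this example, and dropping taxa without a life form is a choice made here, not a documented property of count_taxa().

```r
# Toy traits table; the last taxon has no life form recorded.
traits <- data.frame(
  TaxonConceptID = 1:6,
  life_form = c(
    "phanerophyte", "phanerophyte", "facultative_annual",
    "facultative_annual", "obligate_annual", NA
  ),
  stringsAsFactors = FALSE
)

# table() ignores NA by default; as.data.frame() turns the counts into a table
# with one row per life form.
freq <- as.data.frame(
  table(life_form = traits$life_form),
  responseName = "taxa_count",
  stringsAsFactors = FALSE
)
print(freq)
```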
Furthermore, taxonomic information can also be transferred to this slot using the function tax2traits(). In this way, we make taxonomic ranks suitable for frequency calculations.
Easplist <- tax2traits(Easplist, get_names = TRUE)
head(Easplist@taxonTraits)
#>   TaxonConceptID          life_form form variety subspecies
#> 1              7       phanerophyte <NA>    <NA>       <NA>
#> 2              9       phanerophyte <NA>    <NA>       <NA>
#> 3             18 facultative_annual <NA>    <NA>       <NA>
#> 4             20 facultative_annual <NA>    <NA>       <NA>
#> 5             21    obligate_annual <NA>    <NA>       <NA>
#> 6             22        chamaephyte <NA>    <NA>       <NA>
#>                  species complex        genus        family
#> 1        Acacia mearnsii    <NA>       Acacia   Leguminosae
#> 2     Acacia polyacantha    <NA>       Acacia   Leguminosae
#> 3     Achyranthes aspera    <NA>  Achyranthes Amaranthaceae
#> 4     Acmella caulirhiza    <NA>      Acmella    Compositae
#> 5      Acmella uliginosa    <NA>      Acmella    Compositae
#> 6 Aeschynomene schimperi    <NA> Aeschynomene   Leguminosae
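Conceptually, turning taxonomy into traits means looking up, for every taxon, its ancestor at each rank. The following base R sketch shows that lookup on a small table; it is an illustration only and makes no claim about how tax2traits() works internally. The helper `ancestor_at()` and the table `taxa` are hypothetical.

```r
taxa <- data.frame(
  TaxonConceptID = 1:4,
  TaxonName = c(
    "Asplenium", "Asplenium obliquum",
    "Asplenium obliquum var. sphenoides",
    "Asplenium obliquum var. chondrophyllum"
  ),
  Level = c("genus", "species", "variety", "variety"),
  Parent = c(NA, 1, 2, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: name of the taxon itself or of its ancestor at the requested rank.
# Climbs the Parent column until the rank is found; returns NA if it is never reached.
ancestor_at <- function(id, rank, taxa) {
  row <- match(id, taxa$TaxonConceptID)
  while (!is.na(row) && taxa$Level[row] != rank) {
    row <- match(taxa$Parent[row], taxa$TaxonConceptID)
  }
  if (is.na(row)) NA_character_ else taxa$TaxonName[row]
}

# One new column per rank, comparable to the species and genus columns shown above.
genus_col <- vapply(taxa$TaxonConceptID, ancestor_at, character(1),
                    rank = "genus", taxa = taxa)
species_col <- vapply(taxa$TaxonConceptID, ancestor_at, character(1),
                      rank = "species", taxa = taxa)
taxa$genus <- genus_col
taxa$species <- species_col
print(taxa[, c("TaxonName", "species", "genus")])
```

Taxa above the requested rank get NA, which matches the pattern in the table above where ranks that do not apply to a taxon are shown as <NA>.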
Note that the respective parental ranks are inserted in the table taxonTraits, which contains the attributes of the taxa. In the next two command lines, we will produce a subset with only members of the family Cyperaceae and then calculate the frequency of species per genus.
Cype <- subset(Easplist, family == "Cyperaceae", slot = "taxonTraits")
Cype_stat <- count_taxa(species ~ genus, Cype)
Now, we can sort them to produce a nice bar plot.
Cype_stat <- Cype_stat[order(Cype_stat$species_count, decreasing = TRUE), ]
par(las = 2, mar = c(10, 5, 1, 1))
with(Cype_stat, barplot(species_count, names.arg = genus, ylab = "Number of Species"))
The author thanks Stephan Hennekens, developer of Turboveg, for his patience and great support in finding a common language between R and Turboveg, as well as for his advice on formatting our taxonomic list EA-Splist.
Also thanks to Federico Luebert for the fruitful discussions regarding the terminology used in this project.

