R interface to the fishbase.org database
ropensci/rfishbase
Welcome to rfishbase 5! This is the fourth rewrite of the original rfishbase package described in Boettiger et al. (2012).
This release is another streamlined re-design, following new abilities for data hosting and access. It uses Source Cooperative S3 storage for data and metadata hosting in parquet format, providing improved reliability and resolving firewall issues that some users experienced with previous hosting solutions.
Data access is simplified through direct S3 API queries instead of the previous contentid-based resolution. This allows metadata to be defined alongside the data, independently of the R package.
A simplified access protocol relies on duckdbfs for direct reads of tables. Several functions previously used only to manage connections are now deprecated or removed, along with a significant number of dependencies.
Core use still centers around the same package API using the fb_tbl() function. Legacy helper functions for common tables, like species(), are still accessible and can still optionally filter by species name where appropriate. As before, loading the full tables and sub-setting manually is still recommended.
Historic helper functions like load_taxa() (combining the taxonomic classification from Species, Genus, Family and Order tables), validate_names(), common_to_sci() and sci_to_common() should be in working order, all using table-based outputs.
rfishbase 1.0 relied on parsing of XML pages served directly from Fishbase.org.
rfishbase 2.0 relied on calls to a ruby-based API, fishbaseapi, that provided access to SQL snapshots of about 20 of the more popular tables in FishBase or SeaLifeBase.
rfishbase 3.0 side-stepped the API by making queries which directly downloaded compressed csv tables from a static web host. This substantially improved performance and reliability, particularly for large queries. The release largely remained backwards compatible with 2.0, and added more tables.
rfishbase 4.0 extends the static model and interface. Static tables are distributed in parquet and accessed through a provenance-based identifier. While old functions are retained, a new interface is introduced to provide easy access to all fishbase tables.
We welcome any feedback, issues or questions that users may encounter through our issues tracker on GitHub: https://github.com/ropensci/rfishbase/issues
remotes::install_github("ropensci/rfishbase")
library("rfishbase")
library("dplyr") # convenient but not required
All fishbase tables can be accessed by name using the fb_tbl() function:
fb_tbl("ecosystem")
# A tibble: 160,334 × 18
   autoctr E_CODE EcosystemRefno Speccode Stockcode Status CurrentPresence
     <int>  <int>          <int>    <int>     <int> <chr>  <chr>
 1       1      1          50628      549       565 native Present
 2       2      1            189      552       568 native Present
 3       3      1            189      554       570 native Present
 4       4      1          79732      873       889 native Present
 5       5      1           5217      948       964 native Present
 6       7      1          39852      956       972 native Present
 7       8      1          39852      957       973 native Present
 8       9      1          39852      958       974 native Present
 9      10      1            188     1526      1719 native Present
10      11      1            188     1626      1819 native Present
# ℹ 160,324 more rows
# ℹ 11 more variables: Abundance <chr>, LifeStage <chr>, Remarks <chr>,
#   Entered <int>, Dateentered <dttm>, Modified <int>, Datemodified <dttm>,
#   Expert <int>, Datechecked <dttm>, WebURL <chr>, TS <dttm>

Use fb_tables() to see a list of all the table names (specify sealifebase if desired). Careful, there are a lot of them! The fishbase databases have grown a lot over the decades, and were not intended to be used directly by most end-users, so you may have considerable work to determine what's what. Keep in mind that many variables can be estimated in different ways (e.g. trophic level), and thus may report different values in different tables. Also note that species name (or SpecCode) is not always the primary key for a table: many tables are specific to stocks or even individual samples, and some tables are reference lists that are not species focused at all, but meant to be joined to other tables (faoareas, etc). Compare tables against what you see on fishbase.org, or ask on our issues forum for advice!
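A quick way to check whether a table is keyed by species is to count rows per SpecCode. The following is a minimal sketch in base R; the small data frame is a hand-made stand-in for a table returned by fb_tbl(), and its values are illustrative rather than real FishBase records.

```r
# Toy stand-in for a stock-level table (illustrative values only).
tbl <- data.frame(
  autoctr   = 1:5,
  StockCode = c(18, 18, 20, 21, 21),
  SpecCode  = c(12, 12, 14, 15, 15)
)

# SpecCode can serve as a primary key only if no value repeats.
speccode_is_key <- anyDuplicated(tbl$SpecCode) == 0

# Rows per species: more than one row means the table is keyed by
# something finer than species (stocks, samples, references, ...).
rows_per_species <- table(tbl$SpecCode)

print(speccode_is_key)
print(rows_per_species)
```

If speccode_is_key is FALSE, a join on SpecCode will repeat the rows of the other table once per matching record, which is worth knowing before summarising.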
fish <- c("Oreochromis niloticus", "Salmo trutta")
fb_tbl("species") %>%
  mutate(sci_name = paste(Genus, Species)) %>%
  filter(sci_name %in% fish) %>%
  select(sci_name, FBname, Length)
# A tibble: 2 × 3
  sci_name              FBname       Length
  <chr>                 <chr>         <dbl>
1 Oreochromis niloticus Nile tilapia     60
2 Salmo trutta          Sea trout       140

In most tables, species are identified by SpecCode (as per best practices) rather than scientific names. Multiple tables can be joined on the SpecCode to more fully describe a species.
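As a minimal sketch of such a join, the example below uses two small hand-made data frames in place of real FishBase tables. The StockCode values are hypothetical, and base R's merge() is used only to keep the sketch dependency-free; dplyr::left_join(species, stocks, by = "SpecCode") expresses the same join.

```r
# Toy stand-ins for two FishBase tables sharing a SpecCode column.
species <- data.frame(
  SpecCode = c(5650, 5537, 8394),
  sci_name = c("Labroides bicolor", "Bolbometopon muricatum",
               "Chlorurus oedema")
)
stocks <- data.frame(
  SpecCode  = c(5650, 5537, 5537),
  StockCode = c(101, 201, 202)  # hypothetical stock ids
)

# Left join on SpecCode: every species row is kept, and a species
# with several stocks appears once per stock.
joined <- merge(species, stocks, by = "SpecCode", all.x = TRUE)
print(joined)
```

Species without a matching stock record are kept with NA in StockCode, which makes gaps in coverage easy to spot.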
To filter species by taxonomic names, use the taxa table from load_taxa(), which provides a joined table of taxonomy from subspecies up through Class, along with the corresponding FishBase taxon id codes. Here is an example workflow joining two of the spawning tables and filtering to the grouper family, Epinephelidae:
library(rfishbase)
library(dplyr)
## Get the whole spawning and spawn agg table, joined together:
spawn <- left_join(fb_tbl("spawning"),
                   fb_tbl("spawnagg"),
                   relationship = "many-to-many")
# Filter taxa down to the desired species
groupers <- load_taxa() |> filter(Family == "Epinephelidae")
## A "filtering join" (inner join)
spawn |> inner_join(groupers)
# A tibble: 227 × 95
   autoctr StockCode SpecCode SpawningRefNo SourceRef C_Code E_CODE
     <int>     <int>    <int>         <int>     <int> <chr>   <int>
 1      18        18       12          5222      3092 528A       NA
 2      19        18       12         26409      1784 388       145
 3      20        20       14         26409        NA 192        NA
 4    9147        20       14        118249    118249 826E        8
 5      22        21       15          5241      5241 630        NA
 6      23        21       15          5241      6484 388        NA
 7      24        21       15          5241      3095 060        NA
 8      24        21       15          5241      3095 060        NA
 9      24        21       15          5241      3095 060        NA
10      24        21       15          5241      3095 060        NA
# ℹ 217 more rows
# ℹ 88 more variables: SpawningGround <chr>, Spawningarea <chr>, Jan <dbl>,
#   Feb <dbl>, Mar <dbl>, Apr <dbl>, May <dbl>, Jun <dbl>, Jul <dbl>,
#   Aug <dbl>, Sep <dbl>, Oct <dbl>, Nov <dbl>, Dec <dbl>, GSI <int>,
#   PercentFemales <int>, TempLow <dbl>, TempHigh <dbl>, SexRatiomid <dbl>,
#   SexRmodRef <int>, FecundityMin <int>, WeightMin <dbl>,
#   LengthFecunMin <dbl>, LengthTypeFecMin <chr>, FecundityRef <int>, …

Always keep in mind that taxonomy is a dynamic concept. Species can be split or lumped based on new evidence, and naming authorities can disagree over which name is an 'accepted name' or 'synonym' for any given species. When providing your own list of species names, consider first checking that those names are "valid" in the current taxonomy established by FishBase:
validate_names("Abramites ternetzi")
[1] "Abramites hypselonotus"

rfishbase can also provide a table of synonyms(), a table of common_names() in multiple languages, and convert names with common_to_sci() or sci_to_common():
common_to_sci(c("Bicolor cleaner wrasse", "humphead parrotfish"), Language = "English")
# A tibble: 5 × 4
  Species                ComName                     Language SpecCode
  <chr>                  <chr>                       <chr>       <int>
1 Labroides bicolor      Bicolor cleaner wrasse      English      5650
2 Chlorurus cyanescens   Blue humphead parrotfish    English      7909
3 Bolbometopon muricatum Green humphead parrotfish   English      5537
4 Bolbometopon muricatum Humphead parrotfish         English      5537
5 Chlorurus oedema       Uniform humphead parrotfish English      8394

Note that the results are returned as a table, potentially indicating other common names for the same species, as well as potentially different species that match the provided common name! Please always be careful with names, and use unique SpecCodes to refer to unique species.
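To make that advice concrete, the sketch below takes the rows from the common_to_sci() output shown above, typed in by hand as a plain data frame, and reduces them to one row per SpecCode. It also separates exact common-name matches from partial ones. The variable names are illustrative, and only base R is used.

```r
# Rows copied by hand from the common_to_sci() output shown above.
matches <- data.frame(
  Species  = c("Labroides bicolor", "Chlorurus cyanescens",
               "Bolbometopon muricatum", "Bolbometopon muricatum",
               "Chlorurus oedema"),
  ComName  = c("Bicolor cleaner wrasse", "Blue humphead parrotfish",
               "Green humphead parrotfish", "Humphead parrotfish",
               "Uniform humphead parrotfish"),
  SpecCode = c(5650, 7909, 5537, 5537, 8394)
)

# One row per species, identified by SpecCode rather than by name.
species_hits <- unique(matches[, c("SpecCode", "Species")])

# Keep only rows whose common name equals a query term (case-insensitive).
query <- c("Bicolor cleaner wrasse", "humphead parrotfish")
exact <- matches[tolower(matches$ComName) %in% tolower(query), ]

print(species_hits)
print(exact)
```

Here five name matches collapse to four distinct species, and only two of the rows are exact matches for the query terms; the rest are partial matches that deserve a manual check.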
SeaLifeBase.org is maintained by the same organization and largely parallels the database structure of Fishbase. As such, almost all rfishbase functions can instead be instructed to address the SeaLifeBase database:
fb_tbl("species","sealifebase")
# A tibble: 102,464 × 111
   SpecCode Genus   Species Author SpeciesRefNo FBname FamCode Subfamily GenCode
      <int> <chr>   <chr>   <chr>         <int> <chr>    <int> <chr>       <int>
 1    57969 Abdopus horrid… (D'Or…        96968 Red S…    1890 Octopodi…   24384
 2    57836 Abdopus tenebr… (Smit…           19 <NA>      1890 Octopodi…   24384
 3    57142 Abdopus tongan… (Hoyl…           19 <NA>      1890 Octopodi…   24384
 4  2381155 Abdopus undula… Huffa…        84307 <NA>      1890 <NA>        24384
 5    14647 Abebai… troglo… Vande…           19 <NA>       572 <NA>         9260
 6   165283 Aberom… muranoi Baces…       104101 <NA>       616 <NA>        33537
 7   140720 Aberra… banyul… Macki…        85340 <NA>       174 <NA>         9262
 8    40346 Aberra… enigma… unspe…           19 <NA>       174 <NA>         9262
 9    20199 Aberra… aberra… (Barn…           19 <NA>       308 <NA>         9263
10    93706 Aberro… verruc… Kasat…         3696 <NA>       922 <NA>        17969
# ℹ 102,454 more rows
# ℹ 102 more variables: TaxIssue <int>, Remark <chr>, PicPreferredName <chr>,
#   PicPreferredNameM <chr>, PicPreferredNameF <chr>, PicPreferredNameJ <chr>,
#   Source <chr>, AuthorRef <int>, SubGenCode <int>, Fresh <int>, Brack <int>,
#   Saltwater <int>, Land <int>, BodyShapeI <chr>, DemersPelag <chr>,
#   Amphibious <chr>, AmphibiousRef <int>, AnaCat <chr>, MigratRef <int>,
#   DepthRangeShallow <int>, DepthRangeDeep <int>, DepthRangeRef <int>, …

By default, tables are downloaded the first time they are used. rfishbase defaults to downloading the latest available snapshot; be aware that the most recent snapshot may be months behind the latest data on fishbase.org. Check available releases:
available_releases()
[1] "19.04" "21.06" "23.01" "23.05" "24.07"

Please note that this package is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
