Rsearch: An R Interface to VSEARCH supporting visualization and parameter tuning
Rsearch is an R package designed for handling and analyzing targeted sequencing data. The package provides a user-friendly interface for core VSEARCH functions in addition to tools for visualization and parameter optimization.
The core idea behind Rsearch is to retain the output from VSEARCH within R's generic data structures, rather than writing results to files as the original VSEARCH functions do. This lets users choose between working entirely within R and RStudio or exporting results to files as VSEARCH typically does. Keeping all results in R data structures allows users to leverage the standard data wrangling and visualization tools familiar to R users.
Another feature that enhances usability for R users is the consistent return format of the functions. All functions return a single table/data frame unless the user specifies that results should be written to a file. For functions that can return multiple results, such as those handling read pairs with forward and reverse reads, the secondary table is included as an attribute of the primary table. The same approach applies to tables containing statistics from function executions. Because every function returns only one table, navigating and managing results becomes more straightforward. Additionally, since all core functions return data frames or tibbles, they are compatible with piping using the %>% or |> operators.
More information about attributes in R can be found here and here.
Full documentation and tutorials with usage examples are available on the Rsearch website.
Rsearch is available from The Comprehensive R Archive Network (CRAN), with the development version hosted here on GitHub.
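The single-table return convention described above can be illustrated with base R alone. The following is a minimal sketch that does not call Rsearch: the table contents and column names are made up for illustration, while the attribute names "reverse" and "statistics" follow the usage example further down in this README. The native pipe `|>` assumes R 4.1 or newer.

```r
# Mock primary table standing in for a function result (hypothetical data)
primary <- data.frame(
  Header = c("read1", "read2"),
  Sequence = c("ACGT", "GGCA")
)

# Attach a secondary table and a statistics table as attributes
attr(primary, "reverse") <- data.frame(
  Header = c("read1", "read2"),
  Sequence = c("TTGA", "CCAT")
)
attr(primary, "statistics") <- data.frame(input_reads = 5, kept_reads = 2)

# The primary object is still an ordinary data frame
is.data.frame(primary)

# Secondary results are retrieved with attr()
reverse_tbl <- attr(primary, "reverse")
stats_tbl <- attr(primary, "statistics")

# Because the result is a single table, it can be piped onwards
n_kept <- primary |> nrow()
```

Keeping the extra tables as attributes means the object passed along a pipeline is always the primary table, and the secondary tables are only looked up when needed.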
To install the stable CRAN version of Rsearch, run the following command in your R console:
install.packages("Rsearch")
For the Rsearch package to function properly on your computer, VSEARCH must be installed as well (see below). Please ensure that you are using VSEARCH version 2.30.0 or newer.
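One way to check the version requirement from R is to parse the version line that `vsearch --version` prints and compare it with `numeric_version()`. This is a sketch, not part of Rsearch: the helper names `parse_vsearch_version()` and `meets_minimum()` are hypothetical, and the exact format of the version line is an assumption, so the example uses a hard-coded string.

```r
# Extract the first x.y.z version number from a line of text
parse_vsearch_version <- function(version_line) {
  m <- regmatches(version_line, regexpr("[0-9]+\\.[0-9]+\\.[0-9]+", version_line))
  if (length(m) == 0) stop("No version number found in: ", version_line)
  numeric_version(m)
}

# Compare against the minimum version required by Rsearch
meets_minimum <- function(version_line, minimum = "2.30.0") {
  parse_vsearch_version(version_line) >= numeric_version(minimum)
}

# Example with a hard-coded line (the line format is assumed)
meets_minimum("vsearch v2.30.0_linux_x86_64, 15.5GB RAM, 8 cores")

# On a machine where vsearch is on the PATH one could instead run (not executed here):
# out <- system2("vsearch", "--version", stdout = TRUE, stderr = TRUE)
# meets_minimum(out[1])
```

Comparing with `numeric_version()` rather than as plain strings matters here, because a string comparison would rank "2.9.1" above "2.30.0".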
Visit the VSEARCH GitHub site to learn more about VSEARCH.
You typically install VSEARCH by downloading a pre-compiled binary file to your computer (Windows or Mac). The latest release of VSEARCH, with corresponding binaries, can be found under Releases. On a High Performance Computing (HPC) cluster we prefer to use an Apptainer container for VSEARCH. These are freely available from many sites, e.g. https://depot.galaxyproject.org/singularity/
After downloading the binary you may edit your PATH environment variable to tell your operating system where to find the VSEARCH binary. However, this is not required, since the Rsearch package has a function set_vsearch_executable() where you specify where your VSEARCH binary file is found (see Set correct vsearch executable below).
Rsearch also relies on the Bioconductor package phyloseq. Please install it before installing Rsearch if you do not already have it installed:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("phyloseq")
You can install the development version of Rsearch from GitHub by using the devtools package from CRAN:
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("CassandraHjo/Rsearch")
After installation, it is a good idea to restart your R session (in RStudio: Session > Restart R) to make sure everything is properly loaded.
For most of the functions in Rsearch (those starting with vs_) to work, the command to invoke VSEARCH must be set correctly. The default command is simply vsearch, but this will only work if the vsearch binary (vsearch.exe on Windows) is found in a folder that is included in the PATH environment variable.
If this is not the case, you must tell Rsearch explicitly where to find or how to invoke vsearch. The Rsearch function set_vsearch_executable() can be used to set the correct command to invoke VSEARCH on the computer like this:
# Windows example
# If the vsearch binary (vsearch.exe) is copied to C:/Documents/ on the computer
Rsearch::set_vsearch_executable("C:/Documents/vsearch")

# Linux/macOS example
# If the vsearch binary (vsearch) is copied to /usr/local/bin/ on the computer
Rsearch::set_vsearch_executable("/usr/local/bin/vsearch")
This will store the path and use it in future sessions automatically.
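To see whether the default vsearch command is likely to work without setting a path, base R's `Sys.which()` reports where a command is found on the PATH. The sketch below is independent of Rsearch, and the helper name `find_on_path()` is hypothetical.

```r
# Return the full path of a command if it is on the PATH, otherwise ""
find_on_path <- function(command = "vsearch") {
  unname(Sys.which(command))
}

vsearch_path <- find_on_path()
if (nzchar(vsearch_path)) {
  message("vsearch found on PATH at: ", vsearch_path)
} else {
  message("vsearch not found on PATH; set its location with Rsearch::set_vsearch_executable()")
}
```

If the binary is found, the printed path is also a reasonable value to pass to set_vsearch_executable().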
Although Rsearch is primarily intended for local execution (as above), it is also possible to use vsearch packaged in an Apptainer or Singularity .sif container. However, since Rsearch expects a single executable path (not a full shell command), you must create a wrapper script to bridge the container invocation.
Step by step instructions:
1. Create a wrapper script (e.g., vsearch) with the following content:
#!/bin/bash
apptainer exec /path/to/vsearch_container.sif vsearch "$@"
2. Save it to a folder, for example:
/home/youruser/bin/vsearch
3. Make the script executable:
chmod +x /home/youruser/bin/vsearch
4. Point Rsearch to this wrapper script:
Rsearch::set_vsearch_executable("/home/youruser/bin/vsearch")
This will make Rsearch treat the containerized version of vsearch as a regular executable.
You may test if your executable is working properly by running the following command:
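The wrapper script can also be created from within R. The following is a sketch under assumptions: the helper name `write_vsearch_wrapper()` is hypothetical, the container path is the placeholder from the steps above, and the script is written to a temporary folder rather than /home/youruser/bin/ so the example has no side effects.

```r
# Write a bash wrapper that forwards all arguments to vsearch inside a container
write_vsearch_wrapper <- function(sif_path, wrapper_path) {
  script <- c(
    "#!/bin/bash",
    sprintf('apptainer exec %s vsearch "$@"', shQuote(sif_path, type = "sh"))
  )
  dir.create(dirname(wrapper_path), recursive = TRUE, showWarnings = FALSE)
  writeLines(script, wrapper_path)
  Sys.chmod(wrapper_path, mode = "0755")  # same effect as chmod +x on Unix-like systems
  invisible(wrapper_path)
}

wrapper <- write_vsearch_wrapper(
  sif_path = "/path/to/vsearch_container.sif",   # placeholder path
  wrapper_path = file.path(tempdir(), "bin", "vsearch")
)
readLines(wrapper)

# Afterwards one would point Rsearch at the script (not run here):
# Rsearch::set_vsearch_executable(wrapper)
```

Quoting the container path with `shQuote()` keeps the wrapper valid even if the path contains spaces.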
Rsearch::vsearch()
If everything is set up correctly you should see a message like this:
[1] "The VSEARCH executable is: /your/path/vsearch"
[1] "This is a valid command to invoke VSEARCH on this computer!"
Note: For large-scale analyses and computationally intensive workflows, calling vsearch directly from a shell script may be more efficient than using Rsearch through R or RStudio.
Documentation can be accessed directly in the R console. Here are some methods to access help:
- Function-specific help: To get detailed information about a specific function, use the ? operator followed by the function name. For example, to access help for the vs_fastx_trim_filt function:
?vs_fastx_trim_filt
Alternatively, you can use the help() function:
help(vs_fastx_trim_filt)
- Package-wide help: To get an overview of the Rsearch package and its available functions, use:
# library(Rsearch)
help(package = "Rsearch")
Additional usage examples can be found in the documentation for each individual function and on the package website.
library(Rsearch)

# Define input
fastx_input <- "R1_sample1.fq"
reverse <- "R2_sample1.fq"

# Execute filtering, with tibble as output
filt_seqs <- vs_fastx_trim_filt(fastx_input = fastx_input, reverse = reverse)

# Extract tibbles with filtered sequences
R1_filt <- filt_seqs
R2_filt <- attr(filt_seqs, "reverse")

# Extract filtering statistics
statistics <- attr(filt_seqs, "statistics")
The main contributors to Rsearch:
- Cassandra Stamsaas, cassandra.stamsaas@nmbu.no (Coding, testing, documentation, maintaining)
- Lars Snipen, lars.snipen@nmbu.no (Coding, documentation)
- Torbjørn Rognes, torognes@ifi.uio.no (Coding, documentation)
- Hilde Vinje, hilde.vinje@nmbu.no (Coding, documentation)
Please cite the following publication if you use Rsearch:
xxx
Please note that citing any of the underlying algorithms, e.g. VSEARCH, may also be appropriate.
- Rognes T, Flouri T, Nichols B, Quince C, Mahé F (2016) VSEARCH: a versatile open source tool for metagenomics. PeerJ 4:e2584. doi: 10.7717/peerj.2584
- The subplot of the Rsearch logo is created with https://BioRender.com