Rsearch: An R Interface to VSEARCH supporting visualization and parameter tuning

CassandraHjo/Rsearch

Introduction

Rsearch is an R package designed for handling and analyzing targeted sequencing data. The package provides a user-friendly interface for core VSEARCH functions in addition to tools for visualization and parameter optimization.

The core idea behind Rsearch is to retain the output from VSEARCH within R's generic data structures, rather than writing results to files as the original VSEARCH functions do. With this option, users can choose between working entirely within R and RStudio or exporting results to files as VSEARCH typically does. Keeping all results in R data structures allows users to leverage the standard data wrangling and visualization tools familiar to R users.

Another feature that enhances usability for R users is the consistent return format of the functions. All functions return a single table/data frame unless the user specifies that results should be written to a file. For functions that can return multiple results - such as those handling read pairs with forward and reverse reads - the secondary table is included as an attribute of the primary table. The same approach applies to tables containing statistics from function executions. Because every function returns only one table, navigating and managing results becomes more straightforward. Additionally, since all core functions return data frames or tibbles, they are compatible with piping using the %>% or |> operators.

More information about attributes in R can be found here and here.
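To make the attribute mechanism concrete, here is a minimal base R sketch. It does not call Rsearch; the toy tables and their column names are made up for illustration, and only the attribute names "reverse" and "statistics" are taken from the usage example later in this README.

```r
# Toy stand-ins for forward (R1) and reverse (R2) read tables.
# Column names and values are hypothetical.
r1 <- data.frame(Header = c("read1", "read2"), Sequence = c("ACGT", "GGCA"))
r2 <- data.frame(Header = c("read1", "read2"), Sequence = c("TTGA", "CCAT"))
stats <- data.frame(input_reads = 5L, kept_reads = 2L)

# Attach the secondary tables as attributes of the primary table.
result <- r1
attr(result, "reverse") <- r2
attr(result, "statistics") <- stats

# The primary table still behaves like an ordinary data frame ...
nrow(result)

# ... and the secondary tables are retrieved by attribute name.
reverse_reads <- attr(result, "reverse")
run_stats <- attr(result, "statistics")
```

A single returned object can therefore be passed along a pipe as a table, while the extra tables remain reachable through attr().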

Full documentation and tutorials with usage examples are available on the Rsearch website.

Installation

Rsearch is available from The Comprehensive R Archive Network (CRAN), with the development version hosted here on GitHub.

To install the stable CRAN version of Rsearch, run the following command in your R console:

install.packages("Rsearch")

Installing the development version of Rsearch

Prerequisites

For the Rsearch package to function properly on your computer, VSEARCH must be installed as well (see below). Please ensure that you are using VSEARCH version 2.30.0 or newer.
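One way to check the version requirement from R is sketched below. The helper vsearch_version_ok() is hypothetical and not part of Rsearch. It assumes that the version banner contains a version number of the form x.y.z, as in the sample strings used here, and it compares versions with numeric_version() so that 2.9.1 is correctly treated as older than 2.30.0.

```r
# Hypothetical helper (not part of Rsearch): extract the first "x.y.z"
# from a version banner and compare it with the required minimum.
vsearch_version_ok <- function(banner, minimum = "2.30.0") {
  m <- regmatches(banner, regexpr("[0-9]+\\.[0-9]+\\.[0-9]+", banner))
  if (length(m) == 0) return(FALSE)
  numeric_version(m[1]) >= numeric_version(minimum)
}

# Sample banners (assumed format, for illustration only).
vsearch_version_ok("vsearch v2.30.0_linux_x86_64")  # TRUE
vsearch_version_ok("vsearch v2.9.1_linux_x86_64")   # FALSE

# With VSEARCH installed, the banner could be captured like this:
# banner <- system2("vsearch", "--version", stdout = TRUE, stderr = TRUE)
# vsearch_version_ok(banner)
```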

Visit the VSEARCH GitHub site to learn more about VSEARCH.

Installing VSEARCH

You typically install VSEARCH by downloading a pre-compiled binary file to your computer (Windows or Mac). The latest release of VSEARCH, with corresponding binaries, can be found under Releases. On a High Performance Computing (HPC) cluster we prefer to use an apptainer container for VSEARCH. These are freely available from many sites, e.g. https://depot.galaxyproject.org/singularity/

After downloading the binary you may edit your PATH environment variable to tell your operating system where to find the VSEARCH binary. However, this is not required, since the Rsearch package has a function set_vsearch_executable() where you specify where your VSEARCH binary file is found (see Set correct VSEARCH executable below).

Installing Rsearch

Bioconductor dependency

Rsearch also relies on the Bioconductor package phyloseq. Please install it before installing Rsearch if you do not already have it installed:

if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("phyloseq")

You can install the development version of Rsearch from GitHub by using the devtools package from CRAN:

if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
devtools::install_github("CassandraHjo/Rsearch")

After installation, it is a good idea to restart your R session (in RStudio: Session > Restart R) to make sure everything is properly loaded.

Set correct VSEARCH executable

In order for most of the functions in Rsearch (those starting with vs_) to work, the command to invoke VSEARCH must be set correctly. The default command is simply vsearch, but this will only work if the VSEARCH binary (vsearch.exe on Windows) is found in a folder that is included in the PATH environment variable.
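Whether the default command can work on a given machine can be checked with base R before calling any Rsearch function. The sketch below uses Sys.which(), which typically returns an empty string when a command is not found on PATH. The helper name on_path() is made up for this example.

```r
# Hypothetical helper: TRUE if `command` resolves to an executable on PATH.
# Sys.which() returns the full path, or (typically) "" when nothing is found.
on_path <- function(command = "vsearch") {
  unname(nzchar(Sys.which(command)))
}

if (on_path("vsearch")) {
  message("vsearch found at: ", Sys.which("vsearch"))
} else {
  message("vsearch is not on PATH; set its location explicitly.")
}
```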

If this is not the case, you must tell Rsearch explicitly where to find or how to invoke vsearch. The Rsearch function set_vsearch_executable() can be used to set the correct command to invoke VSEARCH on the computer like this:

# Windows example
Rsearch::set_vsearch_executable("C:/Documents/vsearch")
# If the vsearch binary (vsearch.exe) is copied to C:/Documents/ on the computer

# Linux/macOS example
Rsearch::set_vsearch_executable("/usr/local/bin/vsearch")
# If the vsearch binary (vsearch) is copied to /usr/local/bin/ on the computer

This will store the path and use it in future sessions automatically.

Using an Apptainer/Singularity Container

Although Rsearch is primarily intended for local execution (as above), it is also possible to use vsearch packaged in an Apptainer or Singularity .sif container. However, since Rsearch expects a single executable path (not a full shell command), you must create a wrapper script to bridge the container invocation.

Step by step instructions:

1. Create a wrapper script (e.g., vsearch) with the following content:

#!/bin/bash
apptainer exec /path/to/vsearch_container.sif vsearch "$@"

2. Save it to a folder, for example:

/home/youruser/bin/vsearch

3. Make the script executable:

chmod +x /home/youruser/bin/vsearch

4. Point Rsearch to this wrapper script:

Rsearch::set_vsearch_executable("/home/youruser/bin/vsearch")

This will make Rsearch treat the containerized version of vsearch as a regular executable.
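Steps 1 to 3 can also be carried out from within R. The following is a minimal sketch using only base R. The helper write_vsearch_wrapper() is hypothetical, the container path is the placeholder from step 1, and the wrapper is written to a temporary folder here purely so that the example can run anywhere.

```r
# Hypothetical helper (not part of Rsearch): write the wrapper script
# from step 1 and make it executable, as in step 3.
write_vsearch_wrapper <- function(sif_path, wrapper_path) {
  script <- c(
    "#!/bin/bash",
    sprintf('apptainer exec %s vsearch "$@"', shQuote(sif_path, type = "sh"))
  )
  dir.create(dirname(wrapper_path), recursive = TRUE, showWarnings = FALSE)
  writeLines(script, wrapper_path)
  Sys.chmod(wrapper_path, mode = "0755")
  invisible(wrapper_path)
}

wrapper <- write_vsearch_wrapper(
  sif_path = "/path/to/vsearch_container.sif",       # placeholder path
  wrapper_path = file.path(tempdir(), "bin", "vsearch")
)
readLines(wrapper)

# Step 4 would then be:
# Rsearch::set_vsearch_executable(wrapper)
```

In real use the wrapper should live in a permanent location such as /home/youruser/bin/vsearch, since files in tempdir() disappear when the R session ends.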

Test that it works

You may test if your executable is working properly by running the following command:

Rsearch::vsearch()

If everything is set up correctly you should see a message like this:

[1] "The VSEARCH executable is: /your/path/vsearch"
[1] "This is a valid command to invoke VSEARCH on this computer!"

Note: For large-scale analyses and computationally intensive workflows, calling vsearch directly from a shell script may be more efficient than using Rsearch through R or RStudio.

Documentation

Accessing help within R

Documentation can be accessed directly in the R console. Here are some methods to access help:

  • Function-specific help: To get detailed information about a specific function, use the ? operator followed by the function name. For example, to access help for the vs_fastx_trim_filt function:

?vs_fastx_trim_filt

Alternatively, you can use the help() function:

help(vs_fastx_trim_filt)

  • Package-wide help: To get an overview of the Rsearch package and its available functions, use:

# library(Rsearch)
help(package = "Rsearch")

Usage

Additional usage examples can be found in the documentation for each individual function and on the package website.

Example: Filter paired-end reads based on quality

library(Rsearch)

# Define input
fastx_input <- "R1_sample1.fq"
reverse <- "R2_sample1.fq"

# Execute filtering, with tibble as output
filt_seqs <- vs_fastx_trim_filt(fastx_input = fastx_input, reverse = reverse)

# Extract tibbles with filtered sequences
R1_filt <- filt_seqs
R2_filt <- attr(filt_seqs, "reverse")

# Extract filtering statistics
statistics <- attr(filt_seqs, "statistics")

Contributors

The main contributors to Rsearch:

Citing Rsearch

Please cite the following publication if you use Rsearch:

xxx

Please note that citing any of the underlying algorithms, e.g. VSEARCH, may also be appropriate.

References

  • Rognes T, Flouri T, Nichols B, Quince C, Mahé F (2016) VSEARCH: a versatile open source tool for metagenomics. PeerJ 4:e2584. doi:10.7717/peerj.2584
  • The subplot of the Rsearch logo is created with https://BioRender.com
