Execute command line tool from R
The goal of `blit` is to make it easy to execute command-line tools from R.
You can install `blit` from CRAN using:

    install.packages("blit")

Alternatively, install the development version from GitHub with:

    # install.packages("remotes")
    remotes::install_github("WangLabCSU/blit")

    library(blit)

To build a `command`, simply use `exec()`. The first argument is the command name, and you can also provide the full path. After that, pass the command parameters. This will create a `command` object:

    exec("echo", "$PATH")
    #> <Execute: echo>

To run the command, just pass the `command` object to `cmd_run()`. (Note: `stdout = "|"` is always used in the vignette to ensure that the standard output can be captured by knitr.)

    Sys.setenv(TEST = "blit is awesome")
    exec("echo", "$TEST") |> cmd_run(stdout = "|")
    #> Running command (2025-04-08 05:58:19): echo $TEST
    #>
    #> blit is awesome
    #> Running scheduled exit task
    #> Command process finished
    #> System command succeed

Alternatively, you can run it in the background. In this case, a `process` object will be returned. For more information, refer to the official site:

    proc <- exec("echo", "$TEST") |> cmd_background(stdout = "")
    proc$kill()
    Sys.unsetenv("TEST")

We use some tricks to capture the output from the background process. The actual implementation in the `README.Rmd` differs, but the output remains the same.

    #> Running command (2025-04-08 05:58:19): echo $TEST
    #> blit is awesome

`cmd_background()` is provided for completeness. Instead of using this function, we recommend using `cmd_parallel()`, which can run multiple commands in the background while ensuring that all processes are properly cleaned up when the process exits.

    # IP addresses are copied from Quora <What are some famous IP addresses?>: https://qr.ae/pYlnbQ
    address <- c("localhost", "208.67.222.222", "8.8.8.8", "8.8.4.4")
    cmd_parallel(
      !!!lapply(address, function(ip) exec("ping", ip)),
      stdouts = TRUE,
      stdout_callbacks = lapply(
        seq_len(4),
        function(i) {
          force(i)
          function(text, proc) {
            sprintf("Connection %d: %s", i, text)
          }
        }
      ),
      timeouts = 4, # terminate after 4s
      threads = 4
    )
    #> Running command (2025-04-08 05:58:19): ping localhost
    #> Running command (2025-04-08 05:58:19): ping 208.67.222.222
    #> Running command (2025-04-08 05:58:19): ping 8.8.8.8
    #> Running command (2025-04-08 05:58:19): ping 8.8.4.4
    #>
    #> Connection 1: PING localhost (::1) 56 data bytes
    #> Connection 1: 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.017 ms
    #> ⠙ 0/4 [0/s] [elapsed in 76ms] @ 2025-04-08 05:58:19
    #> ⠹ 0/4 [0/s] [elapsed in 290ms] @ 2025-04-08 05:58:20
    #> ⠸ 0/4 [0/s] [elapsed in 500ms] @ 2025-04-08 05:58:20
    #> ⠼ 0/4 [0/s] [elapsed in 710ms] @ 2025-04-08 05:58:20
    #> ⠴ 0/4 [0/s] [elapsed in 929ms] @ 2025-04-08 05:58:20
    #> ⠦ 0/4 [0/s] [elapsed in 1.1s] @ 2025-04-08 05:58:20
    #> ⠧ 0/4 [0/s] [elapsed in 1.4s] @ 2025-04-08 05:58:21
    #> ⠇ 0/4 [0/s] [elapsed in 1.6s] @ 2025-04-08 05:58:21
    #> Connection 1: 64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.042 ms
    #> ⠏ 0/4 [0/s] [elapsed in 1.6s] @ 2025-04-08 05:58:21
    #> ⠋ 0/4 [0/s] [elapsed in 1.8s] @ 2025-04-08 05:58:21
    #> ⠙ 0/4 [0/s] [elapsed in 2s] @ 2025-04-08 05:58:21
    #> ⠹ 0/4 [0/s] [elapsed in 2.2s] @ 2025-04-08 05:58:22
    #> Connection 1: 64 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.040 ms
    #> ⠸ 0/4 [0/s] [elapsed in 2.2s] @ 2025-04-08 05:58:22
    #> ⠼ 0/4 [0/s] [elapsed in 2.4s] @ 2025-04-08 05:58:22
    #> ⠴ 0/4 [0/s] [elapsed in 2.6s] @ 2025-04-08 05:58:22
    #> ⠦ 0/4 [0/s] [elapsed in 2.8s] @ 2025-04-08 05:58:22
    #> ⠧ 0/4 [0/s] [elapsed in 3s] @ 2025-04-08 05:58:22
    #> ⠇ 0/4 [0/s] [elapsed in 3.3s] @ 2025-04-08 05:58:23
    #> ⠏ 0/4 [0/s] [elapsed in 3.5s] @ 2025-04-08 05:58:23
    #> ⠋ 0/4 [0/s] [elapsed in 3.7s] @ 2025-04-08 05:58:23
    #> Connection 1: 64 bytes from localhost (::1): icmp_seq=4 ttl=64 time=0.039 ms
    #> ⠙ 0/4 [0/s] [elapsed in 3.7s] @ 2025-04-08 05:58:23
    #> ⠹ 0/4 [0/s] [elapsed in 3.9s] @ 2025-04-08 05:58:23
    #> ⠸ 0/4 [0/s] [elapsed in 4.1s] @ 2025-04-08 05:58:23
    #> ⠼ 0/4 [0/s] [elapsed in 4.1s] @ 2025-04-08 05:58:23
    #> Running scheduled exit task
    #> Command process finished
    #> Running scheduled exit task
    #> Command process finished
    #> Running scheduled exit task
    #> Command process finished
    #> Running scheduled exit task
    #> Command process finished
    #> ⠼ 4/4 [0.96/s] [elapsed in 4.2s] @ 2025-04-08 05:58:23
    #> Warning: [Command: 1] System command timed out in 4 secs (status: -9)
    #> Warning: [Command: 2] System command timed out in 4.1 secs (status: -9)
    #> Warning: [Command: 3] System command timed out in 4.1 secs (status: -9)
    #> Warning: [Command: 4] System command timed out in 4.1 secs (status: -9)

The `blit` package provides several functions to manage and control the environment context:

- `cmd_wd`: define the working directory.
- `cmd_envvar`: define the environment variables.
- `cmd_envpath`: define the `PATH`-like environment variables.
- `cmd_condaenv`: define the `PATH` environment variables with a conda environment.

    exec("echo", "$(pwd)") |> cmd_wd(tempdir()) |> cmd_run(stdout = "|")
    #> Working Directory: '/tmp/Rtmp2bxDJx'
    #> Running command (2025-04-08 05:58:24): echo $(pwd)
    #>
    #> /tmp/Rtmp2bxDJx
    #> Running scheduled exit task
    #> Command process finished
    #> System command succeed

    exec("echo", "$TEST") |>
      cmd_envvar(TEST = "blit is very awesome") |>
      cmd_run(stdout = "|")
    #> Setting environment variables: TEST
    #> Running command (2025-04-08 05:58:24): echo $TEST
    #>
    #> blit is very awesome
    #> Running scheduled exit task
    #> Command process finished
    #> System command succeed

    exec("echo", "$PATH") |>
      cmd_envpath("PATH_IS_HERE", action = "replace") |>
      cmd_run(stdout = "|")
    #> Setting environment variables: PATH
    #> Running command (2025-04-08 05:58:24): echo $PATH
    #>
    #> PATH_IS_HERE
    #> Running scheduled exit task
    #> Command process finished
    #> System command succeed

Note: `echo` is a built-in command of the Linux shell, so it remains available even after modifying the `PATH` environment variable.
`cmd_condaenv()` can add a conda/mamba environment prefix to the `PATH` environment variable.
Conda/mamba are open-source package and environment management systems that facilitate the installation of multiple software versions and their dependencies. They allow easy switching between environments and are compatible with Linux, macOS, and Windows.
The `cmd_condaenv()` function accepts multiple conda/mamba environment prefixes and an optional `root` argument specifying the path to the conda/mamba root prefix. If `root` is not provided, the function searches for the root in the following order:
- the option: `blit.conda.root`.
- the environment variable: `BLIT_CONDA_ROOT`.
- the root prefix of `appmamba()` (please see the Software management section for details).
The `cmd_condaenv()` function searches for the specified environment prefix within the provided `root` path.
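The lookup order for the root prefix can be pictured with a small base R sketch. This is only an illustration of the documented order, not `blit`'s actual implementation: the helper name `resolve_conda_root()` and its `fallback` argument (standing in for the `appmamba()` root prefix) are hypothetical.

```r
# Hypothetical helper, NOT part of blit: it only mirrors the documented
# search order for the conda/mamba root prefix.
resolve_conda_root <- function(root = NULL, fallback = NULL) {
  # 0. An explicitly supplied `root` always wins
  if (!is.null(root)) {
    return(root)
  }
  # 1. The R option `blit.conda.root`
  opt <- getOption("blit.conda.root")
  if (!is.null(opt)) {
    return(opt)
  }
  # 2. The environment variable `BLIT_CONDA_ROOT`
  env <- Sys.getenv("BLIT_CONDA_ROOT", unset = NA_character_)
  if (!is.na(env) && nzchar(env)) {
    return(env)
  }
  # 3. Otherwise fall back (here: a stand-in for the appmamba() root prefix)
  fallback
}
```

With this ordering, setting `options(blit.conda.root = ...)` in a session would take precedence over an exported `BLIT_CONDA_ROOT`, which is a common way to let per-session settings override machine-wide ones.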
The `blit` package integrates with `micromamba`, a lightweight version of the mamba package manager, for efficient software environment management.
You can install `micromamba` with `install_appmamba()`.

    install_appmamba()
    #> Installing appmamba
    #> Downloading from 'https://micro.mamba.pm/api/micromamba/linux-64/latest'
    #> Install appmamba successfully!

The `appmamba()` function executes the specified `micromamba` commands. Running it without arguments shows the help document:

    appmamba()
    #> Running command (2025-04-08 05:58:27):
    #> /home/runner/.local/share/R/blit/apps/appmamba/bin/micromamba --root-prefix
    #> /home/runner/.local/share/R/blit/appmamba --help

To create a new environment named `samtools` and install `samtools` from Bioconda, use:

    appmamba("create", "--yes", "--name samtools", "bioconda::samtools")
    #> Running command (2025-04-08 05:58:27):
    #> /home/runner/.local/share/R/blit/apps/appmamba/bin/micromamba --root-prefix
    #> /home/runner/.local/share/R/blit/appmamba create --yes --name samtools
    #> bioconda::samtools

Once the environment is created, you can execute commands within it. The following example locates the samtools binary within the specified environment:

    exec("which", "samtools") |> cmd_condaenv("samtools") |> cmd_run()
    #> Setting environment variables: PATH
    #> Running command (2025-04-08 05:58:39): which samtools
    #> Running scheduled exit task
    #> Command process finished
    #> System command succeed

You may want to remove the `samtools` environment once you no longer need it:

    appmamba("env", "remove", "--yes", "--name samtools")
    #> Running command (2025-04-08 05:58:39):
    #> /home/runner/.local/share/R/blit/apps/appmamba/bin/micromamba --root-prefix
    #> /home/runner/.local/share/R/blit/appmamba env remove --yes --name samtools

For more details, please see https://mamba.readthedocs.io/en/latest/user_guide/micromamba.html.
Several functions allow you to schedule expressions:
- `cmd_on_start`/`cmd_on_exit`: define code to run when the command starts or exits.
- `cmd_on_succeed`/`cmd_on_fail`: define code to run when the command succeeds or fails.

    file <- tempfile()
    file.create(file)
    #> [1] TRUE
    file.exists(file)
    #> [1] TRUE
    exec("ping", "localhost") |>
      cmd_on_exit(file.remove(file)) |>
      cmd_run(timeout = 5, stdout = "|") # terminate it after 5s
    #> Running command (2025-04-08 05:58:40): ping localhost
    #>
    #> PING localhost (::1) 56 data bytes
    #> 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.016 ms
    #> 64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.028 ms
    #> 64 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.029 ms
    #> 64 bytes from localhost (::1): icmp_seq=4 ttl=64 time=0.030 ms
    #> 64 bytes from localhost (::1): icmp_seq=5 ttl=64 time=0.030 ms
    #> Running scheduled exit task
    #> Command process finished
    #> Warning: System command timed out in 5 secs (status: -9)
    file.exists(file)
    #> [1] FALSE

We can also register code to run only when the command succeeds or only when it fails (a timeout counts as a failure):

    file <- tempfile()
    file.create(file)
    #> [1] TRUE
    file.exists(file)
    #> [1] TRUE
    exec("ping", "localhost") |>
      cmd_on_fail(file.remove(file)) |>
      cmd_run(timeout = 5, stdout = "|") # terminate it after 5s
    #> Running command (2025-04-08 05:58:45): ping localhost
    #>
    #> PING localhost (::1) 56 data bytes
    #> 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.017 ms
    #> 64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.029 ms
    #> 64 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.030 ms
    #> 64 bytes from localhost (::1): icmp_seq=4 ttl=64 time=0.035 ms
    #> 64 bytes from localhost (::1): icmp_seq=5 ttl=64 time=0.027 ms
    #> Running the scheduled failed task
    #> Running scheduled exit task
    #> Command process finished
    #> Warning: System command timed out in 5 secs (status: -9)
    file.exists(file)
    #> [1] FALSE

    file <- tempfile()
    file.create(file)
    #> [1] TRUE
    file.exists(file)
    #> [1] TRUE
    exec("ping", "localhost") |>
      cmd_on_succeed(file.remove(file)) |>
      cmd_run(timeout = 5, stdout = "|") # terminate it after 5s
    #> Running command (2025-04-08 05:58:50): ping localhost
    #>
    #> PING localhost (::1) 56 data bytes
    #> 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.017 ms
    #> 64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.029 ms
    #> 64 bytes from localhost (::1): icmp_seq=3 ttl=64 time=0.029 ms
    #> 64 bytes from localhost (::1): icmp_seq=4 ttl=64 time=0.034 ms
    #> 64 bytes from localhost (::1): icmp_seq=5 ttl=64 time=0.029 ms
    #> Running scheduled exit task
    #> Command process finished
    #> Warning: System command timed out in 5 secs (status: -9)
    file.exists(file) # the file still exists because a timeout means the command failed
    #> [1] TRUE
    file.remove(file)
    #> [1] TRUE

`blit` provides several built-in functions for directly executing specific commands. These include: `samtools`, `alleleCounter`, `cellranger`, `fastq_pair`, `gistic2`, `KrakenTools`, `kraken2`, `perl`, `pySCENIC`, `python`, `seqkit`, and `trust4`.
For these commands, you can also use `cmd_help()` to print the help document.

    python() |> cmd_help(stdout = "|")
    #> Running command (2025-04-08 05:58:55): /usr/bin/python --help
    #>
    #> usage: /usr/bin/python [option] ... [-c cmd | -m mod | file | -] [arg] ...
    #> Options (and corresponding environment variables):
    #> -b : issue warnings about converting bytes/bytearray to str and comparing
    #> bytes/bytearray with str or bytes with int. (-bb: issue errors)
    #> -B : don't write .pyc files on import; also PYTHONDONTWRITEBYTECODE=x
    #> -c cmd : program passed in as string (terminates option list)
    #> -d : turn on parser debugging output (for experts only, only works on
    #> debug builds); also PYTHONDEBUG=x
    #> -E : ignore PYTHON* environment variables (such as PYTHONPATH)
    #> -h : print this help message and exit (also -? or --help)
    #> -i : inspect interactively after running script; forces a prompt even
    #> if stdin does not appear to be a terminal; also PYTHONINSPECT=x
    #> -I : isolate Python from the user's environment (implies -E and -s)
    #> -m mod : run library module as a script (terminates option list)
    #> -O : remove assert and __debug__-dependent statements; add .opt-1 before
    #> .pyc extension; also PYTHONOPTIMIZE=x
    #> -OO : do -O changes and also discard docstrings; add .opt-2 before
    #> .pyc extension
    #> -P : don't prepend a potentially unsafe path to sys.path; also
    #> PYTHONSAFEPATH
    #> -q : don't print version and copyright messages on interactive startup
    #> -s : don't add user site directory to sys.path; also PYTHONNOUSERSITE=x
    #> -S : don't imply 'import site' on initialization
    #> -u : force the stdout and stderr streams to be unbuffered;
    #> this option has no effect on stdin; also PYTHONUNBUFFERED=x
    #> -v : verbose (trace import statements); also PYTHONVERBOSE=x
    #> can be supplied multiple times to increase verbosity
    #> -V : print the Python version number and exit (also --version)
    #> when given twice, print more information about the build
    #> -W arg : warning control; arg is action:message:category:module:lineno
    #> also PYTHONWARNINGS=arg
    #> -x : skip first line of source, allowing use of non-Unix forms of #!cmd
    #> -X opt : set implementation-specific option
    #> --check-hash-based-pycs always|default|never:
    #> control how Python invalidates hash-based .pyc files
    #> --help-env: print help about Python environment variables and exit
    #> --help-xoptions: print help about implementation-specific -X options and exit
    #> --help-all: print complete help information and exit
    #>
    #> Arguments:
    #> file : program read from script file
    #> - : program read from stdin (default; interactive mode if a tty)
    #> arg ...: arguments passed to program in sys.argv[1:]
    #> Running scheduled exit task
    #> Command process finished


    perl() |> cmd_help(stdout = "|")
    #> Running command (2025-04-08 05:58:55): /usr/bin/perl --help
    #>
    #>
    #> Usage: /usr/bin/perl [switches] [--] [programfile] [arguments]
    #> -0[octal/hexadecimal] specify record separator (\0, if no argument)
    #> -a autosplit mode with -n or -p (splits $_ into @F)
    #> -C[number/list] enables the listed Unicode features
    #> -c check syntax only (runs BEGIN and CHECK blocks)
    #> -d[t][:MOD] run program under debugger or module Devel::MOD
    #> -D[number/letters] set debugging flags (argument is a bit mask or alphabets)
    #> -e commandline one line of program (several -e's allowed, omit programfile)
    #> -E commandline like -e, but enables all optional features
    #> -f don't do $sitelib/sitecustomize.pl at startup
    #> -F/pattern/ split() pattern for -a switch (//'s are optional)
    #> -g read all input in one go (slurp), rather than line-by-line (alias for -0777)
    #> -i[extension] edit <> files in place (makes backup if extension supplied)
    #> -Idirectory specify @INC/#include directory (several -I's allowed)
    #> -l[octnum] enable line ending processing, specifies line terminator
    #> -[mM][-]module execute "use/no module..." before executing program
    #> -n assume "while (<>) { ... }" loop around program
    #> -p assume loop like -n but print line also, like sed
    #> -s enable rudimentary parsing for switches after programfile
    #> -S look for programfile using PATH environment variable
    #> -t enable tainting warnings
    #> -T enable tainting checks
    #> -u dump core after parsing program
    #> -U allow unsafe operations
    #> -v print version, patchlevel and license
    #> -V[:configvar] print configuration summary (or a single Config.pm variable)
    #> -w enable many useful warnings
    #> -W enable all warnings
    #> -x[directory] ignore text before #!perl line (optionally cd to directory)
    #> -X disable all warnings
    #>
    #> Run 'perldoc perl' for more help with Perl.
    #> Running scheduled exit task
    #> Command process finished

It is also easy to extend `blit` to other commands.
One of the great features of `blit` is its ability to translate the R pipe (`%>%` or `|>`) into the Linux pipe (`|`). All functions used to create a `command` object can accept another `command` object. Internally, the first unnamed input value is captured. If it is a `command` object, it is removed from the call and saved. When the `command` object is run, the saved command is passed through the pipe (`|`) to the command. Here we take the `gzip` command as an example (assuming you're using a Linux system).

    tmpdir <- tempdir()
    file <- tempfile(tmpdir = tmpdir)
    writeLines(letters, con = file)
    file2 <- tempfile()
    exec("gzip", "-c", file) |>
      exec("gzip", "-d", ">", file2) |>
      cmd_run(stdout = "|")
    #> Running command (2025-04-08 05:58:55): gzip -c /tmp/Rtmp2bxDJx/file1db163a56a5d
    #> | gzip -d > /tmp/Rtmp2bxDJx/file1db14f8f6795
    #> Running scheduled exit task
    #> Command process finished
    #> System command succeed
    identical(readLines(file), readLines(file2))
    #> [1] TRUE

Finally, we clean up the temporary files.

    file.remove(file)
    #> [1] TRUE
    file.remove(file2)
    #> [1] TRUE

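To make the pipe translation described above more concrete, here is a toy model in base R. It is a sketch under loud assumptions: `toy_exec()` and `toy_render()` are hypothetical names, and real `command` objects in `blit` carry much more state (environment, callbacks, R6 `Command` objects). The sketch only shows the idea of capturing an upstream command passed as the first unnamed argument and joining the steps with `|`.

```r
# Toy model only (hypothetical names, not blit's internals).
# Requires R >= 4.1 for the native pipe `|>`.
toy_exec <- function(...) {
  args <- list(...)
  upstream <- NULL
  # If the first unnamed argument is itself a toy command, remove it from
  # the argument list and remember its steps as the upstream pipeline
  if (length(args) > 0L && inherits(args[[1L]], "toy_command")) {
    upstream <- args[[1L]]$steps
    args <- args[-1L]
  }
  step <- paste(unlist(args), collapse = " ")
  structure(list(steps = c(upstream, step)), class = "toy_command")
}

# Join the saved steps with the shell pipe
toy_render <- function(x) paste(x$steps, collapse = " | ")

cmd <- toy_exec("gzip", "-c", "input.txt") |> toy_exec("gzip", "-d")
toy_render(cmd)
#> [1] "gzip -c input.txt | gzip -d"
```

Because the R pipe inserts its left-hand side as the first argument of the next call, the second `toy_exec()` receives the first command object and can prepend its steps, which is why no special pipe operator is needed.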
To add a new command, use the `make_command` function. This helper function is designed to assist developers in creating functions that initialize new `command` objects. A `command` object is a bundle of multiple `Command` R6 objects (note the uppercase "C" in `Command`, which distinguishes it from the `command` object) and the associated running environment (including the working directory and environment variables).
The `make_command` function accepts a function that initializes a new `Command` object and, when necessary, validates the input arguments. The core purpose is to create a new `Command` R6 object, so familiarity with the R6 class system is essential.
There are several private methods or fields you may want to override when creating a new `Command` R6 object. The first method is `command_locate`, which determines how to locate the command path. By default, it will attempt to use the `cmd` argument provided by the user. If no `cmd` argument is supplied, it will try to locate the command using the `alias` method. In most cases, you will only need to provide values for the `alias` method, rather than overriding the `command_locate` method.
For example, consider the `ping` command. Here is how you can define it:

    Ping <- R6::R6Class(
      "Ping",
      inherit = Command,
      private = list(alias = function() "ping")
    )
    ping <- make_command("ping", function(..., ping = NULL) {
      Ping$new(cmd = ping, ...)
    })
    ping("8.8.8.8") |> cmd_run(timeout = 5, stdout = "|") # terminate it after 5s
    #> Running command (2025-04-08 05:58:55): /usr/bin/ping 8.8.8.8
    #> Running scheduled exit task
    #> Command process finished
    #> Warning: System command timed out in 5 secs (status: -9)

For command-line tools, the input parameters should always be characters. The core principle of the `Command` object is to convert all R objects (such as data frames) into characters, typically the file paths of R objects that have been saved to disk.
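As a rough illustration of that principle, the sketch below saves a data frame to a temporary file and hands the resulting path, a plain character string, to a command. The data frame and the choice of `wc -l` are made up for the example, and the `blit` call is guarded so the snippet still runs where `blit` or `wc` is unavailable.

```r
# A data frame cannot be passed to a command-line tool directly, so save it
# to disk and pass the file path (a character string) instead.
df <- data.frame(id = 1:3, value = c("a", "b", "c"))
path <- tempfile(fileext = ".csv")
write.csv(df, path, row.names = FALSE)

# The argument handed to the tool is now an ordinary character value
stopifnot(is.character(path), file.exists(path))

# Optional: runs only when blit is installed and `wc` is on the PATH
if (requireNamespace("blit", quietly = TRUE) && nzchar(Sys.which("wc"))) {
  blit::exec("wc", "-l", path) |> blit::cmd_run()
}
```

Writing to `tempfile()` keeps the intermediate file out of the working directory; remove it with `file.remove(path)` once the command has finished.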

    sessionInfo()
    #> R version 4.4.3 (2025-02-28)
    #> Platform: x86_64-pc-linux-gnu
    #> Running under: Ubuntu 24.04.2 LTS
    #>
    #> Matrix products: default
    #> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
    #> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
    #>
    #> locale:
    #> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
    #> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
    #> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
    #> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
    #>
    #> time zone: UTC
    #> tzcode source: system (glibc)
    #>
    #> attached base packages:
    #> [1] stats graphics grDevices utils datasets methods base
    #>
    #> other attached packages:
    #> [1] blit_0.2.0.9000
    #>
    #> loaded via a namespace (and not attached):
    #> [1] digest_0.6.37 R6_2.6.1 fastmap_1.2.0 xfun_0.52
    #> [5] knitr_1.50 parallel_4.4.3 htmltools_0.5.8.1 rmarkdown_2.29
    #> [9] ps_1.9.0 cli_3.6.4 processx_3.8.6 data.table_1.17.0
    #> [13] compiler_4.4.3 tools_4.4.3 evaluate_1.0.3 yaml_2.3.10
    #> [17] rlang_1.1.5
