
bakR (Bayesian analysis of the kinetics of RNA) is an R package for performing differential kinetics analysis with nucleotide recoding high-throughput RNA sequencing (NR-seq) data. Kinetic parameter estimation and statistical testing are compatible with mutational data from any enrichment-free NR-seq method (e.g., TimeLapse-seq, SLAM-seq, TUC-seq, etc.).
bakR v1.0.0 is now available for installation on CRAN! It is also currently available for installation from GitHub, as described below. A lot of functionality has been added, and I highly recommend that all bakR users update to this version. There are also many new vignettes that discuss the new features. Two major new additions are:
Differential expression analysis of RNA sequencing (RNA-seq) data can identify changes in cellular RNA levels, but cannot determine the kinetic mechanism underlying such changes. Previously, our lab and others addressed this shortcoming by developing nucleotide-recoding RNA-seq methods (NR-seq; e.g., TimeLapse-seq) to quantify changes in RNA synthesis and degradation kinetics. While advanced statistical models implemented in user-friendly software (e.g., DESeq2) have ensured the statistical rigor of differential expression analyses, no comparable tools exist to facilitate differential kinetic analysis with NR-seq. To address this need, we developed bakR, an R package that analyzes and compares NR-seq datasets. Differential kinetics analysis with bakR relies on a Bayesian hierarchical model of NR-seq data to increase statistical power by sharing information across transcripts. bakR outperforms attempts to use single-sample analysis tools (e.g., pulseR and GRAND-SLAM) for differential kinetics analysis. Check out our manuscript in RNA to learn more about the model and its extensive validation!
bakR is now available on CRAN! If you are using a Mac or Windows OS, you don’t need to configure a C++ compiler to install and use bakR. Those not on a Mac or Windows OS will first need to properly configure a C++ compiler; see the next paragraph for details and links describing how to do that. In either case, once you (and your compiler, if necessary) are ready, bakR can be installed as follows:

    install.packages("bakR")

To install the newest version of bakR from GitHub, you need a C++ compiler configured to work with rstan, the R interface to the probabilistic programming language Stan, which bakR uses on the backend. The best way to do this is to follow the Stan team’s helpful documentation on installing rstan for your operating system. Once that is complete, you can install bakR as follows:

    install.packages("devtools") # if you haven't installed devtools already
    devtools::install_github("simonlabcode/bakR")

There are currently seven vignettes to help get you up to speed with using bakR:
DissectMechanism.

All vignettes are available on the bakR website under the Articles section. Here is the link to the bakR GitHub as well, in case you need help getting back to GitHub from the website.
As discussed in the introductory vignette, bakR requires data in the form of a so-called “cB”, or counts binomial, data frame. Each row of the cB data frame corresponds to a group of reads with identical mutational data, and the columns denote the sample from which the reads came, the feature the reads aligned to, the number of mutations of interest in the reads (e.g., T-to-C mutations), the number of mutable positions (e.g., Ts), and the number of such reads. It is reasonable to wonder, “where am I supposed to get this information?” While there are a couple of possibilities, perhaps the easiest and most widely applicable is bam2bakR, a Snakemake implementation of the TimeLapse pipeline developed by the Simon lab. bam2bakR takes aligned bam files as input and produces, among other things, the cB file required by bakR. Extensive documentation describing how to get bam2bakR up and running is available on its GitHub repo. Snakemake greatly facilitates running this pipeline on almost any computational infrastructure, and bam2bakR uses the conda/mamba package manager to make setting up the necessary dependencies a breeze.
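
To make the row and column layout described above concrete, here is a minimal sketch of a toy cB data frame built in base R. The column names (`sample`, `XF`, `TC`, `nT`, `n`) reflect my understanding of the convention used in the bakR vignettes, so check the introductory vignette before relying on them. The sample names, feature names, and all counts are made up for illustration.

```r
# Toy cB data frame: one row per group of reads with identical mutational data.
# Column names are an assumption based on the bakR vignettes; values are invented.
cB <- data.frame(
  sample = c("WT_1", "WT_1", "WT_1", "KO_1", "KO_1"),     # sample the reads came from
  XF     = c("geneA", "geneA", "geneB", "geneA", "geneB"), # feature the reads aligned to
  TC     = c(0L, 2L, 1L, 0L, 3L),                          # T-to-C mutations per read
  nT     = c(25L, 25L, 30L, 22L, 28L),                     # mutable positions (Ts) per read
  n      = c(120L, 8L, 15L, 95L, 4L),                      # number of reads in this group
  stringsAsFactors = FALSE
)

# Sanity check: a read cannot have more mutations than mutable positions.
stopifnot(all(cB$TC <= cB$nT))

# Total read count for each sample and feature combination.
reads_per_feature <- aggregate(n ~ sample + XF, data = cB, FUN = sum)

# Crude per-sample mutation rate: total mutations over total mutable positions,
# with each row weighted by the number of reads it represents.
mut_rate <- with(cB, tapply(TC * n, sample, sum) / tapply(nT * n, sample, sum))

print(reads_per_feature)
print(mut_rate)
```

Grouping identical reads into one row with a count `n` keeps the table compact, which is why the per-sample summaries above weight each row by `n` rather than counting rows.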
As of version 1.0.0, bakR can also take as input fraction new (sometimes referred to as new-to-total ratio, or NTR) estimates. These are obtainable via tools like GRAND-SLAM, or perhaps a custom analysis pipeline that you developed while working with NR-seq datasets!
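
As a rough illustration of what fraction new estimates encode, the sketch below converts a toy table of fraction new values into degradation rate constants. It is not bakR's input format or its model: the column names, the label time `tl`, and all values are hypothetical, and the conversion assumes steady-state kinetics, under which fraction new = 1 - exp(-kdeg * tl).

```r
# Hypothetical fraction new (NTR) estimates; column names and values are illustrative only.
fns <- data.frame(
  sample  = c("WT_1", "WT_1", "KO_1", "KO_1"),
  feature = c("geneA", "geneB", "geneA", "geneB"),
  fn      = c(0.25, 0.60, 0.40, 0.55),
  stringsAsFactors = FALSE
)

# Assumed metabolic label time in hours (hypothetical value).
tl <- 2

# The log transform below is only defined for fraction new strictly between 0 and 1.
stopifnot(all(fns$fn > 0 & fns$fn < 1))

# Steady-state assumption: fn = 1 - exp(-kdeg * tl), so kdeg = -log(1 - fn) / tl.
fns$kdeg <- -log(1 - fns$fn) / tl

# Half-life implied by each degradation rate constant, in the same time units as tl.
fns$half_life <- log(2) / fns$kdeg

print(fns)
```

A higher fraction new at a fixed label time implies faster turnover, so comparing `kdeg` between conditions is the kind of question differential kinetics analysis is meant to answer with proper statistics.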
Post descriptions of bugs, along with a simple reproducible example (if possible), in the Issues section of this repo. In fact, you should go to the Issues section with any question you have about bakR; there are even helpful labels that you can append to your posts to make the nature of your request clear. If you email me (Isaac Vock) with a question/concern/suggestion, I will direct you to the Issues section. If you have basic use questions, I would suggest going through the vignettes linked above. If these do not answer your question, then post your question to Issues.