Generally no. We wrote the bap-barcode module to de-barcode data from the dscATAC and dsciATAC-seq platforms, which can be explored here. Otherwise, the input to bap should generally be BAM files that contain a SAM tag specifying the cell / barcode, with all reads present (i.e. duplicate reads should ideally be retained). For this, we recommend running the pre-processing tools from other workflows (e.g. CellRanger-ATAC).
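To make the input requirement concrete, here is a minimal sketch of how one might check that a read carries a barcode tag. It parses a single text-format SAM record using only the standard library; the helper name `get_sam_tag` and the example read are hypothetical and are not part of bap.

```python
from typing import Optional


def get_sam_tag(sam_line: str, tag: str) -> Optional[str]:
    """Return the value of an optional SAM tag (e.g. CB), or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # The first 11 columns of a SAM record are mandatory; optional
    # TAG:TYPE:VALUE fields start at column 12.
    for field in fields[11:]:
        parts = field.split(":", 2)
        if len(parts) == 3 and parts[0] == tag:
            return parts[2]
    return None


# Hypothetical read for illustration only (not taken from a real dataset).
example_read = "\t".join([
    "read1", "99", "chr1", "10000", "60", "50M", "=", "10100", "150",
    "*", "*", "NM:i:0", "CB:Z:AAACGAAAGTAATGTG-1",
])

barcode = get_sam_tag(example_read, "CB")
print(barcode if barcode is not None else "no barcode tag on this read")
```

Reads for which this kind of lookup returns nothing have no barcode in that tag, so the alignment would likely need to be re-processed upstream before being passed on.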
In terms of user experience, very little is different, but we absolutely recommend running bap2 once this module is installed. Both modules perform the same essential steps: nominate abundant barcodes, identify barcode multiplets, and then merge the corresponding barcodes. The majority of the differences are in internal data structures. These updates enable bap2 to 1) run significantly faster, 2) handle 10X scATAC data as input, and 3) automatically produce fragment files for the resulting datasets.
Unless you have a very specific use case of needing to reproduce results from this paper, use bap2.
In other words, when you execute bap, an older version of the software (which we kept around for legacy reasons) will run, but it will be sub-optimal in terms of time, memory requirements, and output files. When you execute bap2, the most recent CLI of the software will be used and will provide the best user experience. We are also actively maintaining bap2 and will respond to user issues in this module.
The actual Python package name is bap-atac, which was established for hosting on PyPI since bap is a common name and was already taken.
This is due to philosophical differences in terms of what an oligonucleotide barcode represents. In the 10X standard, CB is generally equated with a cell barcode. Since a major component of bap is that we've now shown that 1 barcode =/= 1 cell, we adopt a different tag logic. Namely, XB = 1 observed oligonucleotide barcode and DB = 1 inferred droplet barcode. In other words, DB represents the set of barcodes that were merged after being identified as barcode multiplets.
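The relationship between the two tags can be sketched as a many-to-one mapping from observed barcodes (XB) to inferred droplet barcodes (DB). The example below is only an illustration of that tag logic: the barcode names, droplet names, and helper functions are hypothetical, and it does not reproduce how bap actually identifies multiplets.

```python
from collections import defaultdict

# Hypothetical result of a multiplet-calling step: each observed
# oligonucleotide barcode (XB) is assigned to one inferred droplet (DB).
xb_to_db = {
    "BC_AAAC": "Drop_1",
    "BC_GGTT": "Drop_1",  # merged with BC_AAAC as a barcode multiplet
    "BC_CCAG": "Drop_2",
}


def group_by_droplet(mapping):
    """Invert the XB -> DB mapping to list the barcodes merged into each droplet."""
    droplets = defaultdict(list)
    for xb, db in mapping.items():
        droplets[db].append(xb)
    return dict(droplets)


def retag(read_xb_tags, mapping):
    """Translate per-read XB values into DB values (None if the barcode was not kept)."""
    return [mapping.get(xb) for xb in read_xb_tags]


print(group_by_droplet(xb_to_db))
print(retag(["BC_GGTT", "BC_CCAG", "BC_TTTT"], xb_to_db))
```

In this sketch, two observed barcodes collapse into a single droplet, which is the sense in which one barcode does not equal one cell.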
In order to make the software work with 10X scATAC-seq data (see here), one must minimally specify -bt CB, which indicates that the oligonucleotide barcode is stored in the CB SAM tag (assuming a default execution of the CellRanger pipeline).
Please raise an issue here.