DGEobj: An S3 data object to capture results from Differential Gene Expression analysis


DGEobj is an S3 data class that provides a flexible container for Differential Gene Expression (DGE) analysis results. The DGEobj class is designed to be extensible: although it was designed with RNA-Seq analysis workflows in mind, the data structure allows new data types to be defined as needed. A set of accessory functions to deposit, query and retrieve subsets of a data workflow has been provided. Attributes are used to capture metadata such as species and gene model, including reproducibility information such that a third party can access a DGEobj history to see how each data object was created or modified.

Operationally, the DGEobj is styled after the RangedSummarizedExperiment (RSE). The DGEobj has data slots for row (gene), col (samples), assays (anything with n-rows by m-samples dimensions) and metadata (anything that can’t be keyed to row, col or assay). The key motivation for creating the DGEobj data structure is that the RSE only allows one data item each in the row and col slots and thus is unsuitable for capturing the plethora of data objects created during a typical DGE workflow. The DGEobj data structure can hold any number of row and col data objects and thus is engineered for capturing the multiple steps of a downstream analysis.

Certain object types, primarily the count matrix and associated row and column info, are defined as unique, which means only one instance of that type may be added to the DGEobj.

When multiple objects of one type are included in a DGEobj (e.g. two different fits), the concept of parent attributes is used to associate downstream data objects (e.g. contrasts) with the appropriate data object they are derived from.
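The parent idea can be pictured with plain R lists. This is only a sketch of the concept, not DGEobj code: the attribute name "parent", the item names, and the placeholder contents are all assumptions made up for illustration.

```r
# Two fits of the same type, stored under unique names (placeholder contents).
fits <- list(
  fit_simple = list(model = "~ group"),
  fit_batch  = list(model = "~ group + batch")
)

# Each downstream contrast records, in a "parent" attribute (assumed name),
# the name of the fit it was derived from.
contrasts <- list(
  trt_vs_ctrl_simple = structure(list(note = "placeholder"), parent = "fit_simple"),
  trt_vs_ctrl_batch  = structure(list(note = "placeholder"), parent = "fit_batch")
)

# Look up the parent of every contrast, then fetch one parent fit by name.
parent_of <- vapply(contrasts, attr, character(1), "parent")
parent_fit <- fits[[parent_of[["trt_vs_ctrl_batch"]]]]
print(parent_fit$model)
```

Storing the parent's item name rather than a copy of the parent keeps the link cheap and unambiguous, provided item names are unique.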

Structure of a DGEobj

A DGEobj is fundamentally a list of data objects. Each data object deposited in a DGEobj is accompanied by attributes including a Type, a baseType, a dateCreated and a funArgs text field. Data objects that are useful to capture as documentation of an analysis include design matrices, fit objects, topTable output etc.
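To make the "list of data objects with attributes" idea concrete, here is a minimal base R sketch. It does not use the DGEobj package; the helper tag_item(), the exact attribute names, and the toy data are assumptions for illustration only.

```r
# Hypothetical helper (not part of DGEobj): attach the descriptive
# attributes mentioned above to any R object.
tag_item <- function(item, type, baseType, funArgs = "") {
  attr(item, "type") <- type
  attr(item, "baseType") <- baseType
  attr(item, "dateCreated") <- Sys.time()
  attr(item, "funArgs") <- funArgs
  item
}

# Toy data: 3 genes by 2 samples.
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))
design <- data.frame(group = c("ctrl", "trt"), row.names = c("s1", "s2"))

# The container is just a named list of tagged items.
obj <- list(
  counts = tag_item(counts, "counts", "assay", "toy count matrix"),
  design = tag_item(design, "design", "col", "toy sample annotation")
)

# Summarize what the container holds by reading the attributes back.
item_summary <- data.frame(
  itemName = names(obj),
  type     = unname(vapply(obj, attr, character(1), "type")),
  baseType = unname(vapply(obj, attr, character(1), "baseType"))
)
print(item_summary)
```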

There are four fundamental and immutable baseTypes: row, col, assay and meta. These are used under the hood to define how to subset each data element.
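One way to picture how a baseType can drive subsetting is the base R sketch below. It is not the package's implementation; subset_item() and the toy objects are hypothetical, and the rule assumed here is that row items are indexed by gene, col items by sample, assay items by both, and meta items are left untouched.

```r
# Hypothetical dispatcher (not DGEobj code): pick the subsetting rule
# from the item's baseType.
subset_item <- function(item, baseType, rows, cols) {
  switch(baseType,
    row   = item[rows, , drop = FALSE],     # one row per gene
    col   = item[cols, , drop = FALSE],     # one row per sample
    assay = item[rows, cols, drop = FALSE], # genes by samples
    meta  = item,                           # not keyed to genes or samples
    stop("unknown baseType: ", baseType)
  )
}

genes   <- paste0("gene", 1:3)
samples <- c("s1", "s2")
counts   <- matrix(1:6, nrow = 3, dimnames = list(genes, samples))
geneData <- data.frame(symbol = c("A", "B", "C"), row.names = genes)
design   <- data.frame(group = c("ctrl", "trt"), row.names = samples)

keep_rows <- c("gene1", "gene3")
keep_cols <- "s2"

counts_sub   <- subset_item(counts,   "assay", keep_rows, keep_cols)
geneData_sub <- subset_item(geneData, "row",   keep_rows, keep_cols)
design_sub   <- subset_item(design,   "col",   keep_rows, keep_cols)
```

Because every item carries a baseType, a single row and column selection can be applied consistently to every item in the container.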

To provide flexibility there are a number of predefined types (use showTypes()), and the newType function provides extensibility to create new data types as needed. Each type is associated with a baseType and, except for the unique fields described above, you can have multiple instances of any type as long as each instance is given a unique name. A data structure defining the customized structure of the DGEobj is stored as the “objDef” attribute on the DGEobj.

The funArgs text field is intended to hold details from creating the object. Passing funArgs = match.call() is a convenient way to automate capture of the calling arguments of the current function when that best describes how an object was created. The user can also supply a custom user-authored text comment for this purpose.
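As a small illustration of what match.call() captures, consider the base R sketch below. The function normalize_counts() and its arguments are hypothetical stand-ins, and the result is stored in a plain attribute rather than passed to any DGEobj function.

```r
# Hypothetical analysis step: records how it was called.
normalize_counts <- function(counts, method = "TMM", prior = 0.5) {
  call_record <- match.call()   # the call as written by the caller
  result <- counts + prior      # placeholder for a real normalization
  attr(result, "funArgs") <- call_record
  result
}

m <- matrix(1:4, nrow = 2)
out <- normalize_counts(m, method = "RLE")

# The stored call can be inspected or turned into text later.
recorded <- attr(out, "funArgs")
print(recorded)
```

The recorded call contains only the arguments the caller actually supplied, with their names filled in; defaults that were not overridden (here, prior) do not appear, so a free-text comment may still be the better record when defaults matter.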

Supporting functions include:

Manipulate DGEObj

Query Functions

Data retrieval

Conversion

GRanges Data

If the gene data object (row annotation) contains chromosome position information (name, start, end, strand), a GRanges object will also be created.

Original Data

During initialization, a copy of the counts, gene annotation and sample annotation is stored in the meta slot with an “_orig” suffix on the itemName. This preserves the original data if you later subset the DGEobj. To restore the primary parts of the originally initialized DGEobj, along with select attributes, you can reset the object.
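The keep-a-copy-and-reset pattern described above can be sketched in base R as follows. This is a conceptual illustration only: the plain list, the reset_from_orig() helper, and the item names are assumptions, not the package's own reset logic.

```r
counts <- matrix(1:6, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), c("s1", "s2")))

# At "initialization", keep a second copy under an "_orig" name.
obj <- list(counts = counts, counts_orig = counts)

# Subsetting touches the working copy only.
obj$counts <- obj$counts[1:2, , drop = FALSE]

# Hypothetical reset: copy every "*_orig" item back over its working item.
reset_from_orig <- function(obj) {
  orig_names <- grep("_orig$", names(obj), value = TRUE)
  for (nm in orig_names) {
    obj[[sub("_orig$", "", nm)]] <- obj[[nm]]
  }
  obj
}

restored <- reset_from_orig(obj)
print(dim(obj$counts))       # working copy after subsetting
print(dim(restored$counts))  # dimensions after the reset
```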

