AlineTalhouk/diceR

Diverse Cluster Ensemble in R

License

Unknown, MIT licenses found

Folders and files

History: 1,160 commits

.github
R
data-raw
data
docs
man
pkgdown/favicon
revdep
src
tests
vignettes
.Rbuildignore
.covrignore
.gitignore
DESCRIPTION
LICENSE
LICENSE.md
NAMESPACE
NEWS.md
README.Rmd
README.md
_pkgdown.yml
codecov.yml
cran-comments.md
diceR.Rproj

diceR

Overview

The goal of diceR is to provide a systematic framework for generating diverse cluster ensembles in R. There are many nuances to consider in cluster analysis. We provide a process and a suite of functions and tools to implement a systematic framework for cluster discovery, guiding the user through the generation of diverse clustering solutions from data, ensemble formation, algorithm selection, and the arrival at a final consensus solution. We have additionally developed visual and analytical validation tools to help with the assessment of the final result. We implemented a wrapper function, dice(), that allows the user to easily obtain results and assess them. Thus, the package is accessible to end users with limited statistical knowledge, while informaticians and statisticians retain full access to the underlying functions, which are easily extended. More details can be found in our companion paper published in BMC Bioinformatics.

Installation

You can install diceR from CRAN with:

install.packages("diceR")

Or get the latest development version from GitHub:

# install.packages("devtools")
devtools::install_github("AlineTalhouk/diceR")

Example

The following example shows how to use the main function of the package, dice(). A data matrix hgsc contains a subset of gene expression measurements of High Grade Serous Carcinoma Ovarian cancer patients from the Cancer Genome Atlas publicly available datasets. Samples are in rows and features are in columns. The function below runs the package through the dice() function. We specify (a range of) nk clusters over reps subsamples of the data, each containing 80% of the full samples. We also specify the clustering algorithms to be used and, in cons.funs, the ensemble functions used to aggregate them.

library(diceR)
data(hgsc)
obj <- dice(
  hgsc,
  nk = 4,
  reps = 5,
  algorithms = c("hc", "diana"),
  cons.funs = c("kmodes", "majority"),
  progress = FALSE,
  verbose = FALSE
)
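To make the two ideas in the paragraph above more concrete (repeated 80% subsampling, and aggregating several clusterings into one consensus), here is a minimal base R sketch. It is a toy illustration and not diceR's implementation: the sample counts, the `frac` variable, the `labels` matrix, and the `majority_vote()` helper are all hypothetical, sampling without replacement is an assumption, and the labels are assumed to be already aligned across clusterings.

```r
# Toy illustration only; this is NOT how diceR is implemented internally.
set.seed(1)

n_samples <- 10   # hypothetical number of samples (rows of the data)
reps      <- 5    # number of subsamples, mirroring the `reps` argument
frac      <- 0.8  # fraction of samples kept per subsample (assumed name)

# Each column holds the row indices drawn for one subsample
# (drawn without replacement, which is an assumption here).
subsamples <- replicate(
  reps,
  sort(sample(n_samples, size = floor(frac * n_samples)))
)

# Hypothetical labels from three base clusterings of six samples,
# assumed to share a common labelling.
labels <- cbind(
  a = c(1, 1, 2, 2, 3, 3),
  b = c(1, 1, 2, 3, 3, 3),
  c = c(1, 2, 2, 2, 3, 1)
)

# Majority vote per sample: the most frequent label in each row
# (ties go to the smallest label, an arbitrary choice).
majority_vote <- function(x) {
  counts <- table(x)
  as.integer(names(counts)[which.max(counts)])
}

consensus <- apply(labels, 1, majority_vote)
print(dim(subsamples))
print(consensus)
```

In practice, labels produced by different algorithms usually need to be matched to each other before voting, which this sketch skips.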

The first few cluster assignments are shown below:

knitr::kable(head(obj$clusters))

	kmodes	majority
TCGA.04.1331_PRO.C5	2	2
TCGA.04.1332_MES.C1	2	2
TCGA.04.1336_DIF.C4	4	2
TCGA.04.1337_MES.C1	2	2
TCGA.04.1338_MES.C1	2	2
TCGA.04.1341_PRO.C5	2	2
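As a rough way to read the table above, the following sketch rebuilds those six rows by hand and checks where the two consensus functions agree. The matrix is typed in from the table rather than taken from `obj$clusters`, and the comparison assumes the cluster labels are comparable across the two columns.

```r
# The six rows shown above, entered by hand (column-wise fill).
clusters <- matrix(
  c(2, 2, 4, 2, 2, 2,   # kmodes
    2, 2, 2, 2, 2, 2),  # majority
  ncol = 2,
  dimnames = list(
    c("TCGA.04.1331_PRO.C5", "TCGA.04.1332_MES.C1", "TCGA.04.1336_DIF.C4",
      "TCGA.04.1337_MES.C1", "TCGA.04.1338_MES.C1", "TCGA.04.1341_PRO.C5"),
    c("kmodes", "majority")
  )
)

# Per-sample agreement between the two consensus functions.
agree <- clusters[, "kmodes"] == clusters[, "majority"]

# Samples on which the two functions disagree.
disagree <- names(agree)[!agree]
print(disagree)

# Cross-tabulation of the two assignments.
print(table(kmodes = clusters[, "kmodes"], majority = clusters[, "majority"]))
```

With the full result, the same comparison could presumably be run on `obj$clusters` directly.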

You can also compare the base algorithms with the cons.funs using internal evaluation indices:

knitr::kable(obj$indices$ii$`4`)

	Algorithms	calinski_harabasz	dunn	pbm	gamma	c_index	davies_bouldin	mcclain_rao	sd_dis	ray_turi	silhouette	s_dbw	Compactness	Connectivity
HC_Euclidean	HC_Euclidean	3.104106	0.2608547	59.73711	0.4285714	0.2844073	1.839182	0.8009149	0.1306062	1.4765665	NaN	NaN	24.83225	41.62183
DIANA_Euclidean	DIANA_Euclidean	53.647400	0.3348103	33.87817	-1.8750000	0.1589442	2.824201	0.8051915	0.2119281	3.2978986	0.0692233	NaN	21.93396	241.66310
kmodes	kmodes	55.138600	0.3396909	50.51722	-0.6822430	0.1453599	2.006752	0.7972999	0.1170829	1.1408258	0.1253664	NaN	21.91494	201.42540
majority	majority	19.373248	0.3544371	85.05173	-1.1651376	0.2102487	1.622799	0.8019453	0.1108674	0.9200511	0.1884934	NaN	23.85408	64.04921
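When reading a table like this, note that the indices point in different directions: a higher Calinski-Harabasz value indicates better separation, whereas a lower Davies-Bouldin value does. The sketch below copies those two columns from the table above into a small data frame by hand (it does not call diceR) and picks the best row under each index.

```r
# Two columns copied by hand from the table above.
ii <- data.frame(
  Algorithms        = c("HC_Euclidean", "DIANA_Euclidean", "kmodes", "majority"),
  calinski_harabasz = c(3.104106, 53.647400, 55.138600, 19.373248),
  davies_bouldin    = c(1.839182, 2.824201, 2.006752, 1.622799),
  stringsAsFactors  = FALSE
)

# Calinski-Harabasz: larger is better.
best_ch <- ii$Algorithms[which.max(ii$calinski_harabasz)]

# Davies-Bouldin: smaller is better.
best_db <- ii$Algorithms[which.min(ii$davies_bouldin)]

print(c(calinski_harabasz = best_ch, davies_bouldin = best_db))
```

The two indices favor different rows here, which is one reason to look at several indices together rather than relying on a single one.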

Pipeline

This figure is a visual schematic of the pipeline that dice() implements.

Ensemble Clustering pipeline.

Please visit the overview page for more detail.

About

Diverse Cluster Ensemble in R

alinetalhouk.github.io/diceR/

Releases

24 releases; latest: diceR 3.1.0 (Jun 20, 2025)
