
FCPS

Fundamental Clustering Problems Suite

The package provides over sixty state-of-the-art clustering algorithms for unsupervised machine learning published in [Thrun and Stier 2021].

Table of contents

  1. Description
  2. Installation
  3. Tutorial Examples
  4. Manual
  5. Use cases
  6. Additional information
  7. References

Description

The Fundamental Clustering Problems Suite (FCPS) collects over sixty state-of-the-art clustering algorithms available in the R language. An important advantage is that the input and output of the clustering algorithms are simplified and consistent, which lets users run a cluster analysis quickly. By combining mirrored-density plots (MD plots) with statistical testing, FCPS provides a tool to investigate the cluster tendency quickly, prior to the cluster analysis itself [Thrun 2020]. Common clustering challenges can be generated with arbitrary sample size [Thrun and Ultsch 2020a]. Additionally, FCPS gathers 26 indicators for estimating the number of clusters and provides an appropriate implementation of the clustering accuracy for more than two clusters [Thrun and Ultsch 2021]. A subset of the methods was used in a benchmarking of algorithms published in [Thrun and Ultsch 2020b].

Installation

Installation using CRAN

Install automatically with all dependencies via

install.packages("FCPS", dependencies = TRUE)

# Optionally, for the automatic installation
# of all suggested packages:
Suggested = c("kernlab", "cclust", "dbscan", "kohonen", "MCL", "ADPclust",
              "cluster", "DatabionicSwarm", "orclus", "subspace", "flexclust",
              "ABCanalysis", "apcluster", "pracma", "EMCluster", "pdfCluster",
              "parallelDist", "plotly", "ProjectionBasedClustering",
              "GeneralizedUmatrix", "mstknnclust", "densityClust", "parallel",
              "energy", "R.utils", "tclust", "Spectrum", "genie", "protoclust",
              "fastcluster", "clusterability", "signal", "reshape2", "PPCI",
              "clustrd", "smacof", "rgl", "prclust", "dendextend", "moments",
              "prabclus", "VarSelLCM", "sparcl", "mixtools", "HDclassif",
              "clustvarsel", "knitr", "rmarkdown")

for (i in seq_along(Suggested)) {
  # Install only the packages that are not available yet
  if (!requireNamespace(Suggested[i], quietly = TRUE)) {
    message(paste("Installing the package", Suggested[i]))
    install.packages(Suggested[i], dependencies = TRUE)
  }
}
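Before installing anything, it can be useful to see which of the suggested packages are missing. The following is a minimal sketch in base R, not part of FCPS; the three package names are an arbitrary subset of the list above chosen for illustration, and `is_installed` and `missing_pkgs` are hypothetical variable names.

```r
# Illustrative subset of the suggested packages (assumption: any subset works).
Suggested <- c("cluster", "kernlab", "dbscan")

# TRUE for every package that can be loaded from the current library paths.
is_installed <- vapply(Suggested, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- Suggested[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install all missing packages in a single call:
  # install.packages(missing_pkgs, dependencies = TRUE)
} else {
  message("All listed packages are installed.")
}
```

Passing the whole vector to install.packages() in one call avoids resolving shared dependencies repeatedly, at the cost of less fine-grained progress messages than the loop above.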

Installation using Github

Please note that dependencies have to be installed manually.

remotes::install_github("Mthrun/FCPS")

Installation using RStudio

Please note that dependencies have to be installed manually.

Tools -> Install Packages -> Repository (CRAN) -> FCPS

Tutorial Examples

The tutorial with several examples can be found in the vignette on CRAN:

https://cran.r-project.org/web/packages/FCPS/vignettes/FCPS.html

Manual

The full manual for users or developers is available here: https://cran.r-project.org/web/packages/FCPS/FCPS.pdf

Use Cases

Cluster Analysis of High-dimensional Data

The package FCPS provides clear and consistent access to state-of-the-art clustering algorithms:

library(FCPS)
data("Leukemia")
Data = Leukemia$Distance
Classification = Leukemia$Cls
ClusterNo = 6
CA = ADPclustering(Leukemia$DistanceMatrix, ClusterNo)
Cls = ClusterRenameDescendingSize(CA$Cls)
ClusterPlotMDS(Data, Cls, main = 'Leukemia', Plotter3D = 'plotly')
ClusterAccuracy(Cls, Classification)
[1] 0.9963899
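With more than two clusters, the cluster labels returned by an algorithm are arbitrary, so an accuracy has to be computed under some matching between cluster labels and class labels. The following base R sketch illustrates one common definition, the share of correctly assigned points under the best one-to-one label mapping, found here by brute force over all permutations. It is an illustration only and is not claimed to be how ClusterAccuracy is implemented; `all_permutations` and `accuracy_best_matching` are hypothetical helper names, and the sketch assumes the same number of clusters and classes.

```r
# Hypothetical helper, not part of FCPS: all permutations of a vector.
all_permutations <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_permutations(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Hypothetical helper: accuracy under the best one-to-one label mapping.
# Brute force, so only suitable for a small number of clusters.
accuracy_best_matching <- function(cls, truth) {
  stopifnot(length(cls) == length(truth))
  pred_labels <- sort(unique(cls))
  true_labels <- sort(unique(truth))
  stopifnot(length(pred_labels) == length(true_labels))
  # Contingency table: rows are predicted clusters, columns are true classes.
  tab <- table(factor(cls, levels = pred_labels),
               factor(truth, levels = true_labels))
  best <- 0
  for (p in all_permutations(seq_along(true_labels))) {
    # Points that agree when predicted cluster r is mapped to true class p[r].
    hits <- sum(tab[cbind(seq_along(pred_labels), p)])
    best <- max(best, hits)
  }
  best / length(cls)
}

truth <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
cls   <- c(2, 2, 2, 3, 3, 1, 1, 1, 1)  # same structure, permuted labels, one error
acc <- accuracy_best_matching(cls, truth)
print(acc)  # 8 of 9 points match under the best mapping
```

The number of permutations grows factorially with the number of clusters, so for larger problems an assignment solver would be the more practical choice.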

Generating Typical Challenges for Clustering Algorithms

Several clustering challenges can be generated with an arbitrary sample size:

set.seed(600)
library(FCPS)
DataList = ClusterChallenge("Chainlink", SampleSize = 750, PlotIt = TRUE)
Data = DataList$Chainlink
Cls = DataList$Cls
> ClusterCount(Cls)
$CountPerCluster
[1] 377 373

$NumberOfClusters
[1] 2

$ClusterPercentages
[1] 50.26667 49.73333
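The three quantities in this output can be cross-checked with base R alone. The sketch below does not call FCPS; it rebuilds a label vector with the cluster sizes shown above (377 and 373, an assumption taken from that output) and derives the count, the number of clusters, and the percentages with table() and prop.table().

```r
# Stand-in label vector with the cluster sizes from the output above.
Cls <- rep(c(1, 2), times = c(377, 373))

counts <- table(Cls)                      # points per cluster
n_clusters <- length(counts)              # number of distinct labels
percentages <- 100 * prop.table(counts)   # share of each cluster in percent

print(counts)
print(n_clusters)
print(round(percentages, 5))
```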

Cluster-Tendency

For many applications, it is crucial to decide if a dataset possesses cluster structures:

library(FCPS)
set.seed(600)
DataList = ClusterChallenge("Chainlink", SampleSize = 750)
Data = DataList$Chainlink
Cls = DataList$Cls
library(ggplot2)
ClusterabilityMDplot(Data) + theme_bw()

Estimation of Number of Clusters

The “FCPS” package provides up to 26 indicators to determine the number of clusters:

library(FCPS)
set.seed(135)
DataList = ClusterChallenge("Chainlink", SampleSize = 900)
Data = DataList$Chainlink
Cls = DataList$Cls
Tree = HierarchicalClustering(Data, 0, "SingleL")[[3]]
ClusterDendrogram(Tree, 4, main = "Single Linkage")

MaximumNumber = 7
# One column per candidate number of clusters (2 to MaximumNumber + 1)
clsm <- matrix(data = 0, nrow = dim(Data)[1], ncol = MaximumNumber)
for (i in 2:(MaximumNumber + 1)) {
  clsm[, i - 1] <- cutree(Tree, i)
}
out = ClusterNoEstimation(Data, ClsMatrix = clsm, MaxClusterNo = MaximumNumber, PlotIt = TRUE)
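The matrix of candidate clusterings can also be built without a loop, because cutree() from the stats package accepts a vector of cluster numbers and then returns one column per value. The sketch below uses base R only: two synthetic Gaussian blobs and stats::hclust() serve as stand-ins for the Chainlink data and the FCPS tree, so the data and its parameters are assumptions made for illustration.

```r
set.seed(135)

# Synthetic stand-in data: two well-separated blobs of 50 points each.
Data <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
              matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2))

# Single-linkage tree from base R as a stand-in for the FCPS tree.
Tree <- hclust(dist(Data), method = "single")

MaximumNumber <- 7
# One call yields a matrix with one column per candidate cluster number.
clsm <- cutree(Tree, k = 2:(MaximumNumber + 1))

print(dim(clsm))        # rows: data points, columns: candidate cluster numbers
print(table(clsm[, 1])) # cluster sizes for the two-cluster cut
```

The column names of the result are the cluster numbers passed in k, which makes it easy to look up a particular cut later.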

Additional information

Authors website: http://www.deepbionics.org/
License: GPL-3
Dependencies: R (>= 3.5.0)
Bug reports: https://github.com/Mthrun/FCPS/issues

References

  1. [Thrun/Stier, 2021] Thrun, M. C., & Stier, Q.: Fundamental Clustering Algorithms Suite, SoftwareX, Vol. 13(C), pp. 100642, DOI 10.1016/j.softx.2020.100642, 2021.
  2. [Thrun, 2020] Thrun, M. C.: Improving the Sensitivity of Statistical Testing for Clusterability with Mirrored-Density Plot, in Archambault, D., Nabney, I. & Peltonen, J. (eds.), Machine Learning Methods in Visualisation for Big Data, DOI 10.2312/mlvis.20201102, The Eurographics Association, Norrköping, Sweden, May, 2020.
  3. [Thrun/Ultsch, 2020a] Thrun, M. C., & Ultsch, A.: Clustering Benchmark Datasets Exploiting the Fundamental Clustering Problems, Data in Brief, Vol. 30(C), pp. 105501, DOI 10.1016/j.dib.2020.105501, 2020.
  4. [Thrun/Ultsch, 2021] Thrun, M. C., & Ultsch, A.: Swarm Intelligence for Self-Organized Clustering, Artificial Intelligence, Vol. 290, pp. 103237, 2021.
  5. [Thrun/Ultsch, 2020b] Thrun, M. C., & Ultsch, A.: Using Projection based Clustering to Find Distance and Density based Clusters in High-Dimensional Data, Journal of Classification, Springer, 2020.
