
IBclust is an R package for clustering datasets using the Information Bottleneck (IB) method and its variants. The package supports datasets with mixed-type variables (nominal, ordinal, and continuous), as well as datasets that are purely continuous or purely categorical. The IB approach preserves the most relevant information while forming concise and interpretable clusters, guided by principles from information theory.

## Installation

You can install the latest version of the package directly from GitHub using devtools:

    # Install devtools if it is not already installed
    if (!requireNamespace("devtools", quietly = TRUE)) {
      install.packages("devtools")
    }

    # Install IBclust from GitHub
    devtools::install_github("amarkos/IBclust")

## Getting Started

Below is a comprehensive example demonstrating how to use the package to cluster mixed-type, continuous, and categorical datasets and to display the results. The examples use the Deterministic Information Bottleneck (DIB) method; other options include the Agglomerative IB for hierarchical clustering, and the Generalised IB and the standard IB for fuzzy clustering.

    library(IBclust)

    # Example Mixed-Type Data
    data <- data.frame(
      cat_var = factor(sample(letters[1:3], 100, replace = TRUE)),  # Nominal categorical variable
      ord_var = factor(sample(c("low", "medium", "high"), 100, replace = TRUE),
                       levels = c("low", "medium", "high"),
                       ordered = TRUE),                              # Ordinal variable
      cont_var1 = rnorm(100),                                        # Continuous variable 1
      cont_var2 = runif(100)                                         # Continuous variable 2
    )

    # Perform Mixed-Type Clustering using the Deterministic variant
    # and automatic bandwidth selection
    result_mix <- DIBmix(X = data, ncl = 3)
    cat("Mixed-Type Clustering Results:\n")
    print(result_mix$Cluster)
    print(result_mix$Entropy)
    print(result_mix$MutualInfo)

    # Example Continuous Data
    X_cont <- as.data.frame(matrix(rnorm(1000), ncol = 5))  # 200 observations, 5 features

    # Perform Continuous Data Clustering
    result_cont <- DIBmix(X = X_cont, ncl = 3, s = -1, nstart = 50)
    cat("Continuous Clustering Results:\n")
    print(result_cont$Cluster)
    print(result_cont$Entropy)
    print(result_cont$MutualInfo)

    # Example Categorical Data
    X_cat <- data.frame(
      Var1 = factor(sample(letters[1:3], 200, replace = TRUE)),  # Nominal variable
      Var2 = factor(sample(letters[4:6], 200, replace = TRUE)),  # Nominal variable
      Var3 = factor(sample(c("low", "medium", "high"), 200, replace = TRUE),
                    levels = c("low", "medium", "high"),
                    ordered = TRUE)                               # Ordinal variable
    )

    # Perform Categorical Data Clustering
    result_cat <- DIBmix(X = X_cat, ncl = 3, lambda = -1, nstart = 50)
    cat("Categorical Clustering Results:\n")
    print(result_cat$Cluster)
    print(result_cat$Entropy)
    print(result_cat$MutualInfo)
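The examples print an `Entropy` and a `MutualInfo` value for each result. As background on those two quantities, here is a minimal base-R sketch of Shannon entropy and mutual information for discrete labels. The helper names `label_entropy` and `label_mutual_info`, and the toy vectors, are hypothetical and are not part of IBclust; the package's own estimates may well be computed differently, so treat this only as an illustration of the concepts.

```r
# Shannon entropy (in bits) of a vector of discrete labels.
# Hypothetical helper, not an IBclust function.
label_entropy <- function(labels) {
  p <- table(labels) / length(labels)  # empirical label frequencies
  p <- p[p > 0]                        # drop empty levels to avoid log2(0)
  -sum(p * log2(p))
}

# Mutual information (in bits) between two discrete label vectors,
# using the identity I(A; B) = H(A) + H(B) - H(A, B).
# Hypothetical helper, not an IBclust function.
label_mutual_info <- function(a, b) {
  joint <- paste(a, b, sep = "\r")     # one label per (a, b) pair
  label_entropy(a) + label_entropy(b) - label_entropy(joint)
}

# Toy data: cluster labels that line up perfectly with a categorical variable
clusters <- c(1, 1, 2, 2, 3, 3)
cat_var  <- c("a", "a", "b", "b", "c", "c")

print(label_entropy(clusters))
print(label_mutual_info(clusters, cat_var))
```

In this toy case the clusters are balanced and fully determined by `cat_var`, so both values equal `log2(3)` bits: the partition is as spread out as three clusters can be, and it retains all of the information in the categorical variable.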

## Contributing

Contributions are welcome! If you encounter issues, have suggestions, or would like to enhance the package, please feel free to submit an issue or a pull request on the GitHub repository.
