An R package for Nominal Data Mining Analysis

License

GPL-3.0 license


NIMAA

The NIMAA package [@nimaa] provides a comprehensive set of methods for performing nominal data mining.

It employs bipartite networks to demonstrate how two nominal variables are linked, and then places them in the incidence matrix to proceed with network analysis. NIMAA aids in characterizing the pattern of missing values in a dataset, locating large submatrices with non-missing values, and predicting edges within nominal variable labels. Then, given a submatrix, two unipartite networks are constructed using various network projection methods. NIMAA provides a variety of choices for clustering projected networks and selecting the best one. The best clustering results can also be used as a benchmark for imputation analysis in weighted bipartite networks.
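To make the incidence-matrix idea concrete, here is a minimal base R sketch. It does not use NIMAA itself; the toy data frame and the column names `drug`, `patient`, and `score` are hypothetical and chosen only for illustration. It shows how two nominal columns plus one numeric column can be arranged into a weighted incidence matrix in which unobserved pairs remain `NA`.

```r
# Toy edge list (hypothetical data, not shipped with the package):
# two nominal columns and one numeric weight column.
edges <- data.frame(
  drug    = c("A", "A", "B", "C"),
  patient = c("p1", "p2", "p1", "p3"),
  score   = c(0.5, 1.2, 0.8, 2.0),
  stringsAsFactors = FALSE
)

# Rows hold the labels of one nominal variable, columns the other.
# Every cell starts as NA, so pairs that were never observed stay missing.
inc_mat <- matrix(
  NA_real_,
  nrow = length(unique(edges$patient)),
  ncol = length(unique(edges$drug)),
  dimnames = list(sort(unique(edges$patient)), sort(unique(edges$drug)))
)

# Fill the observed (patient, drug) pairs with their numeric weights.
inc_mat[cbind(edges$patient, edges$drug)] <- edges$score

# Proportion of missing cells in the incidence matrix.
missing_prop <- mean(is.na(inc_mat))
```

In this toy case four of the nine cells are observed, so `missing_prop` is 5/9.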

Installation

You can install the released version of NIMAA from CRAN with:

install.packages("NIMAA")

And the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("jafarilab/NIMAA")

Example

Plotting the original data

library(NIMAA)
## load the beatAML data
beatAML_data <- NIMAA::beatAML

# plot the original data
beatAML_incidence_matrix <- plotIncMatrix(
  x = beatAML_data, # original data with 3 columns
  index_nominal = c(2, 1), # the first two columns are nominal data
  index_numeric = 3, # the third column is numeric data
  print_skim = FALSE, # if you want to check the skim output, set this as TRUE
  plot_weight = TRUE, # when plotting the weighted incidence matrix
  verbose = FALSE # NOT save the figures to local folder
)
#>
#> Na/missing values Proportion:     0.2603

Plotting the bipartite network of the original data

plotBipartite(inc_mat = beatAML_incidence_matrix, vertex.label.display = T)
#> IGRAPH 7cf38ef UNWB 650 47636 --
#> + attr: name (v/c), type (v/l), shape (v/c), color (v/c), weight (e/n)
#> + edges from 7cf38ef (vertex names):
#>  [1] Alisertib (MLN8237)      --11-00261 Barasertib (AZD1152-HQPA)--11-00261
#>  [3] Bortezomib (Velcade)     --11-00261 Canertinib (CI-1033)     --11-00261
#>  [5] Crenolanib               --11-00261 CYT387                   --11-00261
#>  [7] Dasatinib                --11-00261 Doramapimod (BIRB 796)   --11-00261
#>  [9] Dovitinib (CHIR-258)     --11-00261 Erlotinib                --11-00261
#> [11] Flavopiridol             --11-00261 GDC-0941                 --11-00261
#> [13] Gefitinib                --11-00261 Go6976                   --11-00261
#> [15] GW-2580                  --11-00261 Idelalisib               --11-00261
#> + ... omitted several edges

Extracting large submatrices without missing values

The extractSubMatrix() function extracts the submatrices that have non-missing values or have a certain percentage of missing values inside (not for elements-max matrix), depending on the argument's input. The package vignette and help manual contain more details.

sub_matrices <- extractSubMatrix(
  x = beatAML_incidence_matrix,
  shape = c("Square", "Rectangular_element_max"), # the selected shapes of submatrices
  row.vars = "patient_id",
  col.vars = "inhibitor",
  plot_weight = TRUE,
  print_skim = FALSE
)
#> binmatnest2.temperature
#>                20.12539
#> Size of Square:   96 rows x  96 columns
#> Size of Rectangular_element_max:      87 rows x  140 columns

Cluster finding analysis of projected unipartite networks

The findCluster() function implements seven widely used network clustering algorithms, with the option of preprocessing the input incidence matrix after projecting the bipartite network into unipartite networks. Internal and external measures can also be used to compare the clustering algorithms. Details can be found in the package vignette and help manual.

cls <- findCluster(
  sub_matrices$Rectangular_element_max,
  part = 1,
  method = "all", # all available clustering methods
  normalization = TRUE, # normalize the input matrix
  rm_weak_edges = TRUE, # remove the weak edges in the network
  rm_method = 'delete', # delete the weak edges instead of lowering their weights to 0.
  threshold = 'median', # Use median of edges' weights as threshold
  set_remaining_to_1 = TRUE # set the weights of remaining edges to 1
)
#> Warning in findCluster(sub_matrices$Rectangular_element_max, part = 1, method =
#> "all", : cluster_spinglass cannot work with unconnected graph
#>
#>
#> |             |  walktrap|   louvain|   infomap| label_prop| leading_eigen| fast_greedy|
#> |:------------|---------:|---------:|---------:|----------:|-------------:|-----------:|
#> |modularity   | 0.0125994| 0.0825865| 0.0000000|  0.0000000|     0.0806766|   0.0825865|
#> |avg.silwidth | 0.2109092| 0.1134990| 0.9785714|  0.9785714|     0.1001961|   0.1134990|
#> |coverage     | 0.9200411| 0.5866393| 1.0000000|  1.0000000|     0.5806783|   0.5866393|
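Projecting a bipartite network into a unipartite one can be illustrated with a small base R sketch. This is one simple projection (counting shared neighbours through a matrix product), shown under the assumption of a binary incidence matrix; it is not necessarily the projection method that findCluster() applies internally, and the toy matrix `B` is hypothetical.

```r
# Hypothetical binary incidence matrix: rows are patients, columns are drugs.
# A 1 means the (patient, drug) pair is linked in the bipartite network.
B <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("p1", "p2", "p3"), c("A", "B", "C"))
)

# Row projection: entry [i, j] counts the drugs shared by patients i and j.
proj_rows <- B %*% t(B)
diag(proj_rows) <- 0 # drop self-loops

# Column projection: entry [i, j] counts the patients shared by drugs i and j.
proj_cols <- t(B) %*% B
diag(proj_cols) <- 0
```

The resulting symmetric matrices can be read as weighted adjacency matrices of the two unipartite networks, which is the kind of input a community detection algorithm works on.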

Edge predicting in weighted bipartite networks

The predictEdge() function predicts new edges between nominal variables' labels or imputes missing values in the input data matrix using several imputation methods. We can compare the imputation results using the validateEdgePrediction() function to choose the best method based on a predefined benchmark. The package vignette and help manual contain more details.

imputations <- predictEdge(
  inc_mat = beatAML_incidence_matrix,
  method = c('svd', 'median', 'als', 'CA')
)

validateEdgePrediction(
  imputation = imputations,
  refer_community = cls$fast_greedy,
  clustering_args = cls$clustering_args
)
#>
#>
#> |       | Jaccard_similarity| Dice_similarity_coefficient| Rand_index| Minkowski (inversed)| Fowlkes_Mallows_index|
#> |:------|------------------:|---------------------------:|----------:|--------------------:|---------------------:|
#> |median |          0.7476353|                   0.8555964|  0.8628983|             1.870228|             0.8556407|
#> |svd    |          0.7224792|                   0.8388829|  0.8458376|             1.763708|             0.8388853|
#> |als    |          0.7599244|                   0.8635875|  0.8694758|             1.916772|             0.8635900|
#> |CA     |          0.6935897|                   0.8190765|  0.8280576|             1.670030|             0.8191111|
#>   imputation_method Jaccard_similarity Dice_similarity_coefficient Rand_index
#> 1            median          0.7476353                   0.8555964  0.8628983
#> 2               svd          0.7224792                   0.8388829  0.8458376
#> 3               als          0.7599244                   0.8635875  0.8694758
#> 4                CA          0.6935897                   0.8190765  0.8280576
#>   Minkowski (inversed) Fowlkes_Mallows_index
#> 1             1.870228             0.8556407
#> 2             1.763708             0.8388853
#> 3             1.916772             0.8635900
#> 4             1.670030             0.8191111
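For readers unfamiliar with the similarity indices in the table above, the following base R sketch computes two of them, the Jaccard similarity and the Rand index, from the usual pair-counting definitions. The helper `pair_counts()` and the toy membership vectors are hypothetical; whether validateEdgePrediction() computes its values in exactly this way is an assumption, not something stated in this README.

```r
# Count how pairs of items are grouped by two clusterings a and b
# (membership vectors of equal length).
pair_counts <- function(a, b) {
  stopifnot(length(a) == length(b), length(a) >= 2)
  idx <- utils::combn(length(a), 2) # all unordered pairs of items
  same_a <- a[idx[1, ]] == a[idx[2, ]]
  same_b <- b[idx[1, ]] == b[idx[2, ]]
  c(
    n11 = sum(same_a & same_b),   # together in both clusterings
    n10 = sum(same_a & !same_b),  # together only in a
    n01 = sum(!same_a & same_b),  # together only in b
    n00 = sum(!same_a & !same_b)  # separated in both
  )
}

# Toy reference clustering and a clustering obtained after imputation.
reference <- c(1, 1, 2, 2)
candidate <- c(1, 1, 1, 2)

n <- pair_counts(reference, candidate)
jaccard <- unname(n["n11"] / (n["n11"] + n["n10"] + n["n01"]))
rand    <- unname((n["n11"] + n["n00"]) / sum(n))
```

For these toy vectors, one of the six pairs is grouped together in both clusterings and two are separated in both, giving a Jaccard similarity of 0.25 and a Rand index of 0.5. Higher values indicate that the clustering after imputation stays closer to the reference.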


Releases (3)

v0.2.1 (latest), Apr 13, 2022
