cranly provides core visualizations and summaries for the CRAN package database. It is intended mainly as an analytics tool for developers to keep track of their CRAN packages and profiles, as well as those of others, which, at least for me, is proving harder and harder as the CRAN ecosystem grows.
The package provides comprehensive methods for cleaning up and organizing the information in the CRAN package database, for building package directives networks (depends, imports, suggests, enhances) and collaboration networks, and for computing summaries and producing interactive visualizations from the resulting networks. Network visualization is through the visNetwork package. The package also provides functions to coerce the networks to igraph (https://CRAN.R-project.org/package=igraph) objects for further analyses and modelling.
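To make the idea of a package directives network concrete, here is a minimal base R sketch that turns a toy package table into an edge list. The data, the `directive_edges` helper and the edge direction (from the dependency to the package that declares it) are assumptions made for illustration only; this is not how cranly builds its networks internally.

```r
# Toy package database (hypothetical data, not taken from CRAN).
toy_db <- data.frame(
  package  = c("pkgA", "pkgB", "pkgC"),
  imports  = c("pkgB, pkgC", "pkgC", ""),
  suggests = c("", "pkgA", ""),
  stringsAsFactors = FALSE
)

# Split a comma-separated directive field into one edge per dependency.
# `directive_edges` is a hypothetical helper, not part of cranly.
directive_edges <- function(db, field) {
  deps <- strsplit(db[[field]], ",\\s*")
  n <- lengths(deps)
  data.frame(
    from = unlist(deps),          # the package being imported/suggested
    to   = rep(db$package, n),    # the package declaring the directive
    type = rep(field, sum(n)),
    stringsAsFactors = FALSE
  )
}

edges <- rbind(directive_edges(toy_db, "imports"),
               directive_edges(toy_db, "suggests"))

# How many packages import each package (an "imported by" count).
imported_by <- table(edges$from[edges$type == "imports"])
```

Counting edges by type and endpoint, as in the last line, is the kind of per-package summary that the statistics later in this vignette report for the real CRAN data.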
This vignette is a tour of the current capabilities in cranly.
Let’s attach cranly
library("cranly")
and use an instance of the cleaned CRAN package database
cran_db <- readRDS(url("https://raw.githubusercontent.com/ikosmidis/cranly/develop/inst/extdata/cran_db.rds"))
as of 2022-08-26 14:43:43 BST.
Alternatively, today’s package directives and author collaboration networks can be constructed by doing
p_db <- tools::CRAN_package_db()
and then we need to clean and organize author names, depends, imports, suggests, enhances
cran_db <- clean_CRAN_db(p_db)
The resulting dataset carries the timestamp of when it was put together, which helps keep track of when the data import has taken place and will be helpful in future versions when dynamic analyses and visualization methods are implemented.
attr(cran_db, "timestamp")
#> [1] "2022-08-26 14:43:43 BST"
We can now extract edges and nodes for the CRAN package directives network by simply doing
package_network <- build_network(cran_db)
and compute various statistics for the package network
## Global package network statistics
package_summaries <- summary(package_network)
The package_summaries object can now be used for finding the top-20 packages according to various statistics
plot(package_summaries, according_to = "n_authors", top = 20)
plot(package_summaries, according_to = "n_imports", top = 20)
plot(package_summaries, according_to = "n_imported_by", top = 20)
The names of the available statistics are
names(package_summaries)
#> [1] "package" "n_authors" "n_imports" "n_imported_by"
#> [5] "n_suggests" "n_suggested_by" "n_depends" "n_depended_by"
#> [9] "n_enhances" "n_enhanced_by" "n_linking_to" "n_linked_by"
#> [13] "betweenness" "closeness" "page_rank" "degree"
#> [17] "eigen_centrality"
The sub-network for my packages can be found using the extractor function package_by, which uses exact matching by default
my_packages <- package_by(package_network, "Ioannis Kosmidis")
my_packages
#> [1] "PlackettLuce" "betareg" "brglm" "brglm2"
#> [5] "detectseparation" "enrichwith" "profileModel" "trackeR"
#> [9] "trackeRapp"
We can now get an interactive visualization of the sub-network for my packages using
plot(package_network, package = my_packages, title = TRUE, legend = TRUE)
You can hover over the nodes and the edges to get package-specific information and links to the package pages.
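The sub-network above depends on which packages the author-name lookup returns, so the difference between exact and partial matching matters. The following base R sketch illustrates that difference on a hypothetical author list; `toy_package_by` and its `exact` argument are illustrative names chosen here and say nothing about how cranly implements its extractor functions.

```r
# Hypothetical package-to-authors map (not CRAN data).
toy_pkg_authors <- list(
  pkgA = c("Ioannis Kosmidis", "A. Author"),
  pkgB = c("Ioannis Kosmidis Jr"),
  pkgC = c("B. Author")
)

# Return the names of packages whose author list matches `name`.
# exact = TRUE requires a full-string match; otherwise a substring match is enough.
toy_package_by <- function(authors, name, exact = TRUE) {
  hit <- vapply(authors, function(a) {
    if (exact) name %in% a else any(grepl(name, a, fixed = TRUE))
  }, logical(1))
  names(authors)[hit]
}

exact_hits   <- toy_package_by(toy_pkg_authors, "Ioannis Kosmidis")
partial_hits <- toy_package_by(toy_pkg_authors, "Ioannis Kosmidis", exact = FALSE)
```

With exact matching only pkgA is returned, while the substring match also picks up pkgB, whose author string merely contains the query.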
To focus only on optional packages (i.e. exclude base and recommended packages), we do
optional_packages <- subset(package_network, recommended = FALSE, base = FALSE)
optional_summary <- summary(optional_packages)
plot(optional_summary, top = 30, according_to = "n_imported_by")
Next let’s build the CRAN collaboration network
author_network <- build_network(object = cran_db, perspective = "author")
Statistics for the collaboration network can be computed using the summary method as we did for package directives.
author_summaries <- summary(author_network)
The top-20 collaborators according to various network statistics are
plot(author_summaries, according_to = "n_packages", top = 20)
plot(author_summaries, according_to = "page_rank", top = 20)
plot(author_summaries, according_to = "betweenness", top = 20)
The R Core’s collaboration sub-network is
plot(author_network, author = "R Core")
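For readers who want to see what a collaboration network amounts to, here is a minimal base R sketch in which two authors are linked whenever they appear on the same package. The author lists and the `pair_up` helper are hypothetical and chosen for illustration; cranly's own construction and its treatment of author names may well differ.

```r
# Toy author lists per package (hypothetical data, not taken from CRAN).
toy_coauthors <- list(
  pkgA = c("Ann", "Bob"),
  pkgB = c("Bob", "Cy", "Ann"),
  pkgC = c("Dee")
)

# One undirected edge per pair of co-authors of a package.
# `pair_up` is a hypothetical helper, not part of cranly.
pair_up <- function(authors, pkg) {
  if (length(authors) < 2) return(NULL)   # single-author packages add no edges
  pairs <- t(combn(sort(authors), 2))
  data.frame(from = pairs[, 1], to = pairs[, 2], package = pkg,
             stringsAsFactors = FALSE)
}

collab_edges <- do.call(
  rbind,
  unname(Map(pair_up, toy_coauthors, names(toy_coauthors)))
)
rownames(collab_edges) <- NULL

# Number of distinct collaborators per author (a simple degree count).
ends <- unique(collab_edges[, c("from", "to")])
collab_degree <- table(c(ends$from, ends$to))
```

Authors who only ever publish alone, such as "Dee" in the toy data, contribute no edges and therefore do not appear in the degree table.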