Positional (Role) Analysis

Positional analysis groups nodes that have similar relational characteristics, rather than similar individual characteristics. There are many approaches to clustering in social networks based on modularity maximization (e.g., Louvain, SLM, hierarchical clustering) or principles of information theory (e.g., Infomap). ideanet’s role_analysis function currently offers workflows for two common methods of positional analysis: CONCOR and hierarchical clustering.

Note: for the sake of simplicity, we will not directly display all visualizations generated by role_analysis in this vignette. However, we provide code for displaying the omitted visualizations and encourage you to paste this code into your R console as you follow along.

Getting Started

To illustrate how to use the role_analysis function, we’ll use a multirelational network of business and marriage relationships between families in Renaissance-era Florence. This network is frequently used to demonstrate role detection methods and is included natively in ideanet.

library(ideanet)
head(florentine_nodes)
head(florentine_edges)

id  family
0   ACCIAIUOL
1   ALBIZZI
2   BARBADORI
3   BISCHERI
4   CASTELLAN
5   GINORI

source  target  weight  type
0       8       1       marriage
1       5       1       marriage
1       6       1       marriage
1       8       1       marriage
2       4       1       marriage
2       8       1       marriage

The first step in our positional analysis workflow is to process this network using the netwrite function, as one generally does when using ideanet to work with sociocentric data:

nw_flor <- netwrite(nodelist = florentine_nodes,
                    node_id = "id",
                    i_elements = florentine_edges$source,
                    j_elements = florentine_edges$target,
                    type = florentine_edges$type,
                    directed = FALSE,
                    net_name = "florentine")

We’ll pass the resulting igraph_list and node_measures objects to the role_analysis function.

Function Arguments

As with all other tools in ideanet, the role_analysis function asks users to specify several arguments ahead of execution. Some of these arguments are specific to the positional analysis method being used and are only required when the user selects that method:

General Arguments

Arguments Specific to Hierarchical Clustering

Arguments Specific to CONCOR

Hierarchical Clustering

For our first example, let’s look at how to identify role positions using the hierarchical clustering method. Although role_analysis takes the many arguments listed above, in practice we only need to specify a fraction of them:

flor_cluster <- role_analysis(method = "cluster",
                              graph = nw_flor$igraph_list,
                              nodes = nw_flor$node_measures,
                              directed = FALSE,
                              min_partitions = 2,
                              max_partitions = 7,
                              viz = TRUE,
                              cluster_summaries = TRUE,
                              fast_triad = TRUE)

Note that we’ve set fast_triad to TRUE here to expedite counting the number of triad positions, or motifs, that each node occupies in the network. This is acceptable for the current network given its small size; however, as stated earlier, setting fast_triad to TRUE may cause memory issues if the network is too large. Should this occur, we recommend setting fast_triad to FALSE and trying again.

role_analysis is similar to netwrite in that it simultaneously creates several outputs stored in a single list object. In the following section, we’ll examine each of the outputs within this list and what they contain.

Cluster Memberships

Depending on the amount of partitioning applied during clustering, individual nodes may vary in terms of cluster membership. Users can inspect the cluster membership of individual nodes at each level of partitioning using the cluster_assignments object:

head(flor_cluster$cluster_assignments)

id  cut_1  cut_2  cut_3  cut_4  cut_5  cut_6  cut_7  max_mod  best_fit
0   1      1      1      1      1      1      1      1        1
1   1      1      1      2      2      2      2      2        2
2   1      2      2      3      3      3      3      3        3
3   1      2      2      3      3      3      3      3        3
4   1      2      2      3      3      3      3      3        3
5   1      1      1      1      1      1      4      1        1

Here id contains each node’s simplified identifier as it appears in the node_measures data frame produced by netwrite. Columns beginning with the cut_ prefix indicate a specific level of partitioning. In most cases, we are interested in finding a single solution that best categorizes nodes into different types (“roles”) according to their relational characteristics. role_analysis determines the optimal level of partitioning by taking the distance matrix used in the clustering process and converting it into a similarity matrix. This similarity matrix is then treated as a dense network whose modularity varies according to the membership of nodes within derived clusters. Finally, role_analysis designates the level of partitioning whose cluster assignments produce the highest modularity score as the best fit. In effect, this converts a multirelational role problem into a single-relation community detection problem in a dense network.
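The selection logic described above can be sketched in base R. This is a minimal illustration on hypothetical toy data, not ideanet’s internal code: the feature values, the Ward linkage, the rule for turning distances into similarities (maximum distance minus distance), and the helper weighted_modularity are all assumptions made for the example.

```r
# Toy node-level measures (hypothetical values): rows are nodes, columns are
# relational features such as degree and betweenness.
features <- rbind(
  c(1.0, 0.1), c(1.1, 0.2), c(0.9, 0.1),   # low-centrality nodes
  c(5.0, 2.0), c(5.2, 2.1), c(4.9, 1.9)    # high-centrality nodes
)

# Hierarchical clustering on the distances between nodes' feature profiles.
d <- dist(scale(features))
tree <- hclust(d, method = "ward.D2")

# Assumption: convert distances to similarities by subtracting from the maximum.
sim <- max(d) - as.matrix(d)
diag(sim) <- 0

# Weighted modularity of a partition, treating `sim` as a dense weighted network.
weighted_modularity <- function(w, membership) {
  two_m <- sum(w)                               # each tie is counted twice
  strength <- rowSums(w)                        # weighted degree of each node
  same <- outer(membership, membership, "==")   # TRUE where nodes share a cluster
  sum((w - outer(strength, strength) / two_m)[same]) / two_m
}

# Score each candidate number of clusters and keep the best one.
ks <- 2:4
scores <- sapply(ks, function(k) weighted_modularity(sim, cutree(tree, k = k)))
best_k <- ks[which.max(scores)]
best_k
```

In this toy case the two well-separated groups of nodes yield the highest modularity, so the two-cluster cut is selected.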

Cluster assignments at this identified optimal level are stored in the max_mod column, and values in this column are generally those that users will want to use. However, if users require clusters to have a minimum size as specified by the min_partition_size argument, they will want smaller clusters identified in max_mod to be subsumed into a parent cluster. When this is the case, the best_fit column will contain the closest compromise between max_mod and the user’s specifications.
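As a quick check on these two columns, one can tabulate cluster sizes and list any nodes whose best_fit assignment departs from max_mod. The sketch below uses a hand-typed copy of the six rows displayed above (the data frame name assignments is hypothetical); with the full output, the same calls would presumably be applied to flor_cluster$cluster_assignments.

```r
# First six rows of cluster_assignments as displayed above (toy copy).
assignments <- data.frame(
  id       = 0:5,
  max_mod  = c(1, 2, 3, 3, 3, 1),
  best_fit = c(1, 2, 3, 3, 3, 1)
)

# Number of clusters at the modularity-maximizing level (these six rows only).
n_clusters <- max(assignments$max_mod, na.rm = TRUE)

# Cluster sizes, useful for spotting clusters below a desired minimum size.
cluster_sizes <- table(assignments$max_mod)

# Nodes whose best_fit assignment departs from max_mod (none in this excerpt).
changed <- assignments$id[assignments$max_mod != assignments$best_fit]

n_clusters
cluster_sizes
changed
```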

Cluster Dendrogram

To determine the number of clusters produced at the optimal level of partitioning, you can simply identify the maximum value contained in max_mod. However, role_analysis generates two diagnostic visualizations that provide a faster way of interpreting clustering output. The cluster_dendrogram visualization illustrates the cluster membership of nodes at each level of partitioning while also indicating membership of nodes at the optimal partitioning level:

flor_cluster$cluster_dendrogram

Modularity Plot

While cluster_dendrogram shows where nodes fall at each level of partitioning, cluster_modularity shows how the modularity score of the similarity matrix changes at each level of partitioning:

flor_cluster$cluster_modularity

Note: this plot may not appear in R Markdown documents, but will appear in a plot window if called in the R console.

Looking at this plot and the dendrogram together, we see that nodes in the network have been assigned to one of seven different clusters (including one isolate node; isolates are assigned their own cluster in our approach), and that this partitioning produces the best fit as determined by modularity score. We also see that while most clusters contain about 2-4 nodes, node 8 appears to be unique enough in its relational position to constitute its own cluster.

Cluster Summaries

We now know that nodes in this network fall into one of seven positions or “roles.” A proper understanding of these results requires more, however. If clusters are supposed to represent different kinds of roles that nodes occupy in the network, we’ll want to know why certain nodes are placed in one cluster over another and how these clusters differ from one another. The cluster_summaries data frame provides a numerical overview of differences between inferred clusters, allowing us to make progress to this end.

flor_cluster$cluster_summaries
(Output shown transposed for readability: one row per measure, one column per cluster. Raw cluster means come first, followed by the standardized _std means.)

cluster                             1           2           3           4           5           6           7
size                                3           3           3           3           2           1           1
mean_total_degree                   2.000000    3.333333    4.000000    3.000000    4.500000    8.000000    NA
mean_weighted_degree                2.000000    3.333333    6.000000    4.333333    6.000000    11.000000   NA
mean_norm_weighted_degree           0.0285714   0.0476190   0.0857143   0.0619048   0.0857143   0.1571429   NA
mean_marriage_total_degree          1.000000    3.333333    2.666667    3.000000    2.000000    6.000000    NA
mean_marriage_weighted_degree       1.000000    3.333333    2.666667    3.000000    2.000000    6.000000    NA
mean_marriage_norm_weighted_degree  0.0250000   0.0833333   0.0666667   0.0750000   0.0500000   0.1500000   NA
mean_business_total_degree          1.000000    0.000000    3.333333    1.333333    4.000000    5.000000    NA
mean_business_weighted_degree       1.000000    0.000000    3.333333    1.333333    4.000000    5.000000    NA
mean_business_norm_weighted_degree  0.0333333   0.0000000   0.1111111   0.0444444   0.1333333   0.1666667   NA
mean_betweenness                    0.0023913   0.0483957   0.0816639   0.0610019   0.0427601   0.4198356   NA
mean_marriage_betweenness           0.0000000   0.1238095   0.0730159   0.1412698   0.0095238   0.4523810   NA
mean_business_betweenness           0.0000000   0.0000000   0.1031746   0.0000000   0.0928571   0.2285714   NA
mean_bonpow                         0.4279204   0.6575333   1.2805915   0.8589091   1.2488985   1.6192207   NA
mean_bonpow_negative                0.1649395   0.4435213   0.8223691   0.5129390   0.9740216   2.8578544   NA
mean_marriage_bonpow                0.3436182   1.1948522   0.9525905   1.0059946   0.6967201   1.7458177   NA
mean_marriage_bonpow_negative       0.2496124   0.7712099   0.6033423   0.7996485   0.4871883   2.6653876   NA
mean_business_bonpow                0.3586379   0.0000000   1.2037240   0.4643523   1.4002756   1.1316370   NA
mean_business_bonpow_negative       0.0925887   0.0000000   0.8869239   0.1793993   1.2350687   2.2727995   NA
mean_eigen_centrality               0.2454682   0.4099270   0.8666517   0.5159179   0.8696674   0.8597697   NA
mean_marriage_eigen_centrality      0.1951241   0.7293140   0.5837147   0.5894034   0.4235596   1.0000000   NA
mean_business_eigen_centrality      0.1863881   0.0000000   0.7968545   0.2694346   0.9617895   0.5120959   NA
mean_closeness                      0.4703704   0.5444444   0.5537037   0.5185185   0.5472222   0.7111111   NA
mean_marriage_closeness             0.3559259   0.5259259   0.4711111   0.5000000   0.4050000   0.6333333   NA
mean_business_closeness             0.2174074   0.0000000   0.4055556   0.2925926   0.4138889   0.4611111   NA
mean_isolate                        0           0           0           0           0           0           NA
mean_marriage_isolate               0           0           0           0           0           0           NA
mean_business_isolate               0.3333333   1.0000000   0.0000000   0.0000000   0.0000000   0.0000000   NA
mean_cor_marriage_summary_graph     0.7402046   1.0000000   0.8885364   0.9353949   0.7928724   0.8194652   NA
mean_cor_business_summary_graph     0.4899753   0.0000000   0.9054561   0.8598075   0.8997065   0.8010379   NA
mean_cor_business_marriage          -0.0547522  0.0000000   0.6113803   0.6250892   0.4547319   0.3133398   NA
mean_summary_graph_201_s            5.000000    5.000000    8.333333    1.666667    3.500000    2.000000    NA
mean_summary_graph_201_b            0.0000000   2.3333333   7.6666667   0.6666667   0.0000000   10.0000000  NA
mean_summary_graph_300              0.0000000   0.3333333   2.3333333   0.0000000   0.0000000   2.0000000   NA
mean_marriage_201_s                 2.0000000   4.3333333   5.3333333   0.6666667   1.5000000   3.0000000   NA
mean_marriage_201_b                 0.0000000   2.6666667   2.6666667   0.6666667   0.0000000   5.0000000   NA
mean_marriage_300                   0.0000000   0.0000000   0.6666667   0.0000000   0.0000000   1.0000000   NA
mean_business_201_b                 0           0           5           0           0           6           NA
mean_business_201_s                 1.333333    0.000000    4.333333    1.000000    1.500000    0.000000    NA
mean_business_300                   0.000000    0.000000    1.666667    0.000000    0.000000    0.000000    NA

cluster                                 1           2           3           4           5           6           7
mean_total_degree_std                   -1.0033651  -0.1672275  0.2508413   -0.3762619  0.5643929   2.7592541   NA
mean_weighted_degree_std                -1.0801234  -0.5400617  0.5400617   -0.1350154  0.5400617   2.5652932   NA
mean_norm_weighted_degree_std           -1.0801234  -0.5400617  0.5400617   -0.1350154  0.5400617   2.5652932   NA
mean_marriage_total_degree_std          -1.1927968  0.4771187   0.0000000   0.2385594   -0.4771187  2.3855936   NA
mean_marriage_weighted_degree_std       -1.1927968  0.4771187   0.0000000   0.2385594   -0.4771187  2.3855936   NA
mean_marriage_norm_weighted_degree_std  -1.1927968  0.4771187   0.0000000   0.2385594   -0.4771187  2.3855936   NA
mean_business_total_degree_std          -0.5773503  -1.1547005  0.7698004   -0.3849002  1.1547005   1.7320508   NA
mean_business_weighted_degree_std       -0.5773503  -1.1547005  0.7698004   -0.3849002  1.1547005   1.7320508   NA
mean_business_norm_weighted_degree_std  -0.5773503  -1.1547005  0.7698004   -0.3849002  1.1547005   1.7320508   NA
mean_betweenness_std                    -0.6591202  -0.2258785  0.0874211   -0.1071611  -0.2789514  3.2721188   NA
mean_marriage_betweenness_std           -0.8357033  0.2089258   -0.2196400  0.3562453   -0.7553473  2.9812110   NA
mean_business_betweenness_std           -0.5807955  -0.5807955  0.6610369   -0.5807955  0.5368537   2.1703409   NA
mean_bonpow_std                         -1.2077500  -0.6435723  0.8873329   -0.1487753  0.8094606   1.7193727   NA
mean_bonpow_negative_std                -0.7456867  -0.3639669  0.1551398   -0.2688490  0.3629378   2.9442127   NA
mean_marriage_bonpow_std                -1.3077063  0.6622280   0.1015834   0.2251719   -0.4905545  1.9372779   NA
mean_marriage_bonpow_negative_std       -0.6726904  0.0616603   -0.1746784  0.1016986   -0.3382102  2.7284504   NA
mean_business_bonpow_std                -0.5561957  -1.2020508  0.9656823   -0.3658194  1.3196435   0.8358641   NA
mean_business_bonpow_negative_std       -0.6685451  -0.8044721  0.4975971   -0.5411008  1.0086992   2.5321641   NA
mean_eigen_centrality_std               -1.2055180  -0.6144057  1.0271936   -0.2334442  1.0380326   1.0024577   NA
mean_marriage_eigen_centrality_std      -1.3111876  0.7042539   0.1549233   0.1763859   -0.4493248  1.7255231   NA
mean_business_eigen_centrality_std      -0.6063701  -1.1052988  1.0277429   -0.3840690  1.4692456   0.2654936   NA
mean_closeness_std                      -1.0090112  0.0997923   0.2383928   -0.2882889  0.1413724   2.5946002   NA
mean_marriage_closeness_std             -1.3153252  0.7011553   0.0509612   0.3936311   -0.7332257  1.9751844   NA
mean_business_closeness_std             -0.2911651  -1.5172336  0.7698958   0.1328415   0.8168916   1.0832012   NA
mean_isolate_std                        0           0           0           0           0           0           NA
mean_marriage_isolate_std               0           0           0           0           0           0           NA
mean_business_isolate_std               0.1456438   1.6020820   -0.5825753  -0.5825753  -0.5825753  -0.5825753  NA
mean_cor_marriage_summary_graph_std     -1.0041260  0.9577265   0.1160057   0.4698589   -0.6064038  -0.4055876  NA
mean_cor_business_summary_graph_std     -0.3405152  -1.5815831  0.7118641   0.5962399   0.6973010   0.4473812   NA
mean_cor_business_marriage_std          -1.2141592  -1.0357505  0.9564162   1.0010863   0.4459814   -0.0147410  NA
mean_summary_graph_201_s_std            0.1086938   0.1086938   1.0144756   -0.7970880  -0.2989080  -0.7065098  NA
mean_summary_graph_201_b_std            -0.7188608  -0.1198101  1.2494485   -0.5477035  -0.7188608  1.8484992   NA
mean_summary_graph_300_std              -0.6370221  -0.3185110  1.5925551   -0.6370221  -0.6370221  1.2740441   NA
mean_marriage_201_s_std                 -0.3124216  0.5287135   0.8892000   -0.7930702  -0.4926648  0.0480649   NA
mean_marriage_201_b_std                 -0.7186497  0.5311759   0.5311759   -0.4061933  -0.7186497  1.6247733   NA
mean_marriage_300_std                   -0.4830459  -0.4830459  1.1271071   -0.4830459  -0.4830459  1.9321836   NA
mean_business_201_b_std                 -0.4711756  -0.4711756  1.2115945   -0.4711756  -0.4711756  1.5481485   NA
mean_business_201_s_std                 -0.0968246  -0.7423218  1.3555442   -0.2581989  -0.0161374  -0.7423218  NA
mean_business_300_std                   -0.4605662  -0.4605662  1.8422647   -0.4605662  -0.4605662  -0.4605662  NA

cluster_summaries provides both raw and standardized averages of the relational measures used to determine cluster membership. These include various measures of network centrality, as well as the frequency with which nodes occupy specific positions in different kinds of triads that appear in the network (motifs). Right away, we see that the single node in cluster 6 differs from its counterparts in other clusters: it has considerably higher degree, betweenness, and closeness centrality, among other measures. We also see that our cluster of isolates (cluster 7) appears at the end of this data frame, with all of its values set to NA given isolates’ lack of connection to other nodes in the network.
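To make the distinction between raw and standardized averages concrete, here is a minimal base-R sketch on hypothetical toy data. It assumes that standardization means converting each measure to z-scores across nodes before averaging within clusters; how ideanet computes its _std columns internally is not shown in this vignette, so treat this only as an illustration of the idea.

```r
# Toy node-level data (hypothetical values): one centrality measure and a
# cluster label per node.
nodes <- data.frame(
  cluster      = c(1, 1, 2, 2, 3),
  total_degree = c(1, 3, 4, 6, 11)
)

# Assumption: standardize the measure across all nodes (z-scores).
nodes$total_degree_std <- as.numeric(scale(nodes$total_degree))

# Average the raw and standardized values within each cluster.
summaries <- aggregate(
  cbind(total_degree, total_degree_std) ~ cluster,
  data = nodes,
  FUN = mean
)
summaries
```

A standardized mean near zero marks a cluster that is typical of the network on that measure, while large positive or negative values flag clusters that stand out.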

These differences are also visualized in the cluster_summaries_cent object. Because the network examined here is multirelational, cluster_summaries_cent plots these differences for each unique relationship type in the network, as well as for the overall network:

flor_cluster$cluster_summaries_cent$marriage

flor_cluster$cluster_summaries_cent$business

flor_cluster$cluster_summaries_cent$summary_graph

Those familiar with positions and motifs in networks know that as many as 36 types of positions can exist in a network, which can be unwieldy to inspect alongside other measures. Consequently, differences in triad positions are visualized separately in cluster_summaries_triad:

flor_cluster$cluster_summaries_triad$marriage
flor_cluster$cluster_summaries_triad$business
flor_cluster$cluster_summaries_triad$summary_graph

Overall, the node in cluster 6 tends to have the highest values on most measures used to identify roles in the network. Those familiar with the substantive setting of this network will not be surprised to learn that this node represents the Medici family, which was known for its power and influence in Renaissance Florence. Additionally, nodes in cluster 2 tend to appear in more clustered parts of this network due to their business ties. If one is curious to see where the Medici and families in other role positions appear relative to one another in the network, one can quickly take the information contained in cluster_assignments and assign it as a node-level attribute in an igraph object for visualization:

igraph::V(nw_flor$florentine)$role <- flor_cluster$cluster_assignments$best_fit
plot(nw_flor$florentine,
     vertex.color = as.factor(igraph::V(nw_flor$florentine)$role),
     vertex.label = NA)

Heatmaps

A final point of consideration in positional analysis involves knowing whether nodes in a particular role tend to form ties among themselves or with nodes in other roles. When using hierarchical clustering, role_analysis generates a series of heatmaps, contained in a list, to visualize the frequency of tie formation within and between clusters. Each heatmap measures connections across clusters using different measures, and the names of these measures are used to extract their corresponding plot from the list:

flor_cluster$cluster_relations_heatmaps$chisq # Chi-squared

flor_cluster$cluster_relations_heatmaps$density # Density
flor_cluster$cluster_relations_heatmaps$density_std # Density (Standardized)

flor_cluster$cluster_relations_heatmaps$density_centered # Density (Zero-floored)

Looking at the density-based heatmaps here, one finds a high level of connection between the Medici family and families belonging to cluster 4. One can also see that families in cluster 2 have a high propensity to be tied to families in cluster 5.
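For readers who want to see what a density heatmap summarizes, the following base-R sketch computes the share of possible ties that are present within and between roles. The adjacency matrix, the role labels, and the object name block_density are hypothetical, and the calculation is a generic block density rather than a reproduction of ideanet’s own code.

```r
# Toy undirected network (hypothetical): 5 nodes, symmetric adjacency matrix.
adj <- matrix(0, nrow = 5, ncol = 5)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4), c(4, 5))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1

# Role assignment for each node (hypothetical).
role <- c(1, 1, 1, 2, 2)

# Density of ties between every pair of roles: observed ties / possible ties.
roles <- sort(unique(role))
block_density <- sapply(roles, function(b) {
  sapply(roles, function(a) {
    rows <- which(role == a)
    cols <- which(role == b)
    observed <- sum(adj[rows, cols])
    # Within a role, a node cannot tie to itself, so exclude the diagonal.
    possible <- if (a == b) length(rows) * (length(rows) - 1) else length(rows) * length(cols)
    if (possible == 0) NA else observed / possible
  })
})
dimnames(block_density) <- list(from = roles, to = roles)
block_density
```

Here both roles are fully connected internally, while only one of the six possible ties between them is present.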

CONCOR

Alongside hierarchical clustering, the CONvergence of iterated CORrelations (CONCOR) algorithm is a popular method for conducting positional analysis in networks. Those wishing to use this algorithm instead of hierarchical clustering can easily do so using the role_analysis function. As stated before, setup for using CONCOR is similar to that for using hierarchical clustering, with users only having to specify a few different arguments:

flor_concor <- role_analysis(method = "concor",
                             graph = nw_flor$igraph_list,
                             nodes = nw_flor$node_measures,
                             directed = FALSE,
                             min_partitions = 1,
                             max_partitions = 4,
                             viz = TRUE)
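For intuition about what the algorithm does, here is a minimal base-R sketch of a single CONCOR split on a hypothetical toy network. It is a simplified illustration, not ideanet’s implementation: the function name concor_split, the tolerance, and the iteration cap are assumptions, and the diagonal is left in place for brevity.

```r
# Toy undirected network (hypothetical): nodes 1-3 tie mostly to nodes 4-6,
# so the two sets occupy similar positions.
adj <- matrix(0, 6, 6)
adj[1:3, 4:6] <- 1
adj[4:6, 1:3] <- 1
adj[3, 6] <- 0
adj[6, 3] <- 0

# One CONCOR split: correlate columns repeatedly until every entry is
# (approximately) +1 or -1, then divide nodes by the sign of the correlation.
concor_split <- function(m, max_iter = 100, tol = 1e-8) {
  r <- cor(m)
  for (i in seq_len(max_iter)) {
    if (all(abs(abs(r) - 1) < tol)) break
    r <- cor(r)
  }
  # Nodes positively correlated with the first node form block 1, the rest block 2.
  ifelse(r[, 1] > 0, 1, 2)
}

blocks <- concor_split(adj)
blocks
```

Repeating the split within each resulting block produces finer partitions. A node with no ties has a constant column, so its correlations are undefined and it has to be handled separately.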

Using CONCOR in role_analysis produces fewer outputs, but those that are produced resemble select items produced using hierarchical clustering. concor_assignments, for example, appends “block” assignments to the end of the node_measures data frame that the user feeds into the role_analysis function:

Block Memberships

flor_concor$concor_assignments %>%
  dplyr::select(id, family, dplyr::starts_with("block"), best_fit)

id  family     block_1  block_2  block_3  block_4  best_fit
0   ACCIAIUOL  2        4        8        13       2
1   ALBIZZI    2        4        7        11       2
2   BARBADORI  2        4        8        12       2
3   BISCHERI   1        2        4        6        1
4   CASTELLAN  1        1        2        3        1
5   GINORI     2        3        6        9        2
6   GUADAGNI   1        2        3        4        1
7   LAMBERTES  1        2        4        5        1
8   MEDICI     2        3        6        8        2
9   PAZZI      2        3        5        7        2
10  PERUZZI    1        1        2        2        1
11  PUCCI      NA       NA       NA       NA       3
12  RIDOLFI    2        4        7        10       2
13  SALVIATI   2        4        8        13       2
14  STROZZI    1        1        1        1        1
15  TORNABUON  2        4        7        11       2
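To get a feel for how the blocks divide at successive levels, one can tabulate block sizes. The sketch below uses a hand-typed copy of the first two block columns from the table above (the data frame name concor_blocks is hypothetical); with the real output, the same calls would presumably be applied to flor_concor$concor_assignments.

```r
# Block assignments copied from the table above.
concor_blocks <- data.frame(
  family   = c("ACCIAIUOL", "ALBIZZI", "BARBADORI", "BISCHERI", "CASTELLAN",
               "GINORI", "GUADAGNI", "LAMBERTES", "MEDICI", "PAZZI",
               "PERUZZI", "PUCCI", "RIDOLFI", "SALVIATI", "STROZZI", "TORNABUON"),
  block_1  = c(2, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, NA, 2, 2, 1, 2),
  block_2  = c(4, 4, 4, 2, 1, 3, 2, 2, 3, 3, 1, NA, 4, 4, 1, 4),
  best_fit = c(2, 2, 2, 1, 1, 2, 1, 1, 2, 2, 1, 3, 2, 2, 1, 2)
)

# Block sizes at the first two levels of partitioning (NA marks the isolate).
sizes_1 <- table(concor_blocks$block_1, useNA = "ifany")
sizes_2 <- table(concor_blocks$block_2, useNA = "ifany")
sizes_1
sizes_2

# Families grouped by their second-level block.
members_2 <- split(concor_blocks$family, concor_blocks$block_2)
members_2
```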

Modularity Plot

As with the hierarchical clustering method, the optimal level of partitioning for CONCOR is determined according to the maximization of modularity in a similarity matrix. One can inspect how modularity changes at different levels of partitioning using the concor_modularity visualization:

flor_concor$concor_modularity

Visualizing CONCOR assignments in a conventional network visualization entails a similar process to that used for hierarchical clustering.

igraph::V(nw_flor$florentine)$concor <- flor_concor$concor_assignments$best_fit
plot(nw_flor$florentine,
     vertex.color = as.factor(igraph::V(nw_flor$florentine)$concor),
     vertex.label = NA)

Block Tree

In lieu of a dendrogram, users can see how smaller partitions branch off of larger parents with the concor_block_tree visualization. Like cluster_dendrogram, this visualization allows users to quickly gauge the relative size of blocks inferred by CONCOR:

flor_concor$concor_block_tree

Heatmaps

Finally, users can also assess the level of connection across CONCOR blocks using the concor_relations_heatmaps object:

flor_concor$concor_relations_heatmaps$chisq

flor_concor$concor_relations_heatmaps$density
flor_concor$concor_relations_heatmaps$density_std

flor_concor$concor_relations_heatmaps$density_centered

On the whole, CONCOR tells us that nodes in the Florentine network fall into one of only two blocks (plus a third block for our isolate), and that nodes within each block tend to interact among themselves rather than with nodes in the other block. These simpler results are less informative than those produced by hierarchical clustering, but this does not make CONCOR an inferior approach to positional analysis. Interpreting results from positional analysis often entails more subjectivity than other network analysis methods. Although two blocks may maximize modularity, users may find that a higher level of partitioning produces blocks with important substantive differences. Were we to accept four blocks as a more appropriate fit than two, our inferred blocks would start to resemble the groups we inferred using hierarchical clustering. Moreover, this resemblance comes with only a small drop in modularity:

igraph::V(nw_flor$florentine)$concor2 <- flor_concor$concor_assignments$block_2
plot(nw_flor$florentine,
     vertex.color = as.factor(igraph::V(nw_flor$florentine)$concor2),
     vertex.label = NA)
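One way to put a number on how closely two sets of role assignments resemble each other is to cross-tabulate them and compute a simple pairwise agreement score (the Rand index). The sketch below uses hypothetical label vectors and a hypothetical helper, rand_index; with the objects from this vignette one would presumably compare flor_cluster$cluster_assignments$best_fit with flor_concor$concor_assignments$block_2, assuming both are in the same node order and that the isolate’s NA is handled first.

```r
# Hypothetical role labels for eight nodes from two different methods.
hier_cluster <- c(1, 1, 2, 2, 3, 3, 4, 4)
concor_block <- c(1, 1, 2, 2, 3, 3, 3, 4)

# Cross-tabulation: rows are hierarchical clusters, columns are CONCOR blocks.
agreement <- table(cluster = hier_cluster, block = concor_block)
agreement

# Rand index: share of node pairs that both partitions treat the same way
# (either together in both, or apart in both).
rand_index <- function(a, b) {
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  pairs <- upper.tri(same_a)
  mean(same_a[pairs] == same_b[pairs])
}

rand_index(hier_cluster, concor_block)
```

Values close to 1 indicate that the two methods group nodes in nearly the same way, though the substantive reading of each block still rests with the analyst.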

With this in mind, we encourage users to thoroughly consider how they treat their data when using role_analysis and to use their best judgment when interpreting its output.

