Movatterモバイル変換


[0]ホーム

URL:


Title:Sample Generalized Random Dot Product Graphs in Linear Time
Version:0.4.0
Description:Samples generalized random product graphs, a generalization of a broad class of network models. Given matrices X, S, and Y with with non-negative entries, samples a matrix with expectation X S Y^T and independent Poisson or Bernoulli entries using the fastRG algorithm of Rohe et al. (2017)https://www.jmlr.org/papers/v19/17-128.html. The algorithm first samples the number of edges and then puts them down one-by-one. As a result it is O(m) where m is the number of edges, a dramatic improvement over element-wise algorithms that which require O(n^2) operations to sample a random graph, where n is the number of nodes.
License:MIT + file LICENSE
URL:https://rohelab.github.io/fastRG/,https://github.com/RoheLab/fastRG
BugReports:https://github.com/RoheLab/fastRG/issues
Depends:Matrix
Imports:dplyr, ggplot2, glue, igraph, methods, rlang (≥ 1.0.0),RSpectra, stats, tibble, tidygraph, tidyr
Suggests:covr, irlba, knitr, magrittr, purrr, rmarkdown, scales,testthat (≥ 3.0.0)
Config/testthat/edition:3
Encoding:UTF-8
RoxygenNote:7.3.2
VignetteBuilder:knitr
NeedsCompilation:no
Packaged:2025-12-05 20:20:25 UTC; alex
Author:Alex HayesORCID iD [aut, cre, cph], Karl Rohe [aut, cph], Jun Tao [aut], Xintian Han [aut], Norbert Binkiewicz [aut]
Maintainer:Alex Hayes <alexpghayes@gmail.com>
Repository:CRAN
Date/Publication:2025-12-06 15:20:02 UTC

Create an undirected Chung-Lu object

Description

To specify a Chung-Lu graph, you must specifythe degree-heterogeneity parameters (vian ortheta).We provide reasonable defaults to enable rapid explorationor you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_degree orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

chung_lu(  n = NULL,  theta = NULL,  ...,  sort_nodes = TRUE,  poisson_edges = TRUE,  allow_self_loops = TRUE,  force_identifiability = FALSE)

Arguments

n

(degree heterogeneity) The number of nodes in the graph.Use when you don't want to specify the degree-heterogeneityparameterstheta by hand. Whenn is specified,thetais randomly generated from aLogNormal(2, 1) distribution.This is subject to change, and may not be reproducible.n defaults toNULL. You must specify eithernortheta, but not both.

theta

(degree heterogeneity) A numeric vectorexplicitly specifying the degree heterogeneityparameters. This implicitly determines the number of nodesin the resulting graph, i.e. it will havelength(theta) nodes.Must be positive. Setting to a vector of ones recoversan erdos renyi graph.Defaults toNULL. You must specify eithern ortheta,but not both.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block and bytheta. Useful for plotting.Defaults toTRUE.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

force_identifiability

Logical indicating whether or not tonormalizetheta such that it sums to one within each block. DefaultstoFALSE, since this behavior can be surprise whentheta is setto a vector of all ones to recover the SBM case.

Value

Anundirected_chung_lu S3 object, a subclass ofdcsbm().

See Also

Other undirected graphs:dcsbm(),erdos_renyi(),mmsbm(),overlapping_sbm(),planted_partition(),sbm()

Examples

set.seed(27)cl <- chung_lu(n = 100, expected_density = 0.01)cltheta <- round(stats::rlnorm(100, 2))cl2 <- chung_lu(  theta = theta,  expected_degree = 5)cl2edgelist <- sample_edgelist(cl)edgelist

Create an undirected degree corrected stochastic blockmodel object

Description

To specify a degree-corrected stochastic blockmodel, you must specifythe degree-heterogeneity parameters (vian ortheta),the mixing matrix (viak orB), and the relative blockprobabilities (optional, viapi). We provide defaults for most of theseoptions to enable rapid exploration, or you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_degree orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

dcsbm(  n = NULL,  theta = NULL,  k = NULL,  B = NULL,  ...,  block_sizes = NULL,  pi = NULL,  sort_nodes = TRUE,  force_identifiability = FALSE,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

(degree heterogeneity) The number of nodes in the blockmodel.Use when you don't want to specify the degree-heterogeneityparameterstheta by hand. Whenn is specified,thetais randomly generated from aLogNormal(2, 1) distribution.This is subject to change, and may not be reproducible.n defaults toNULL. You must specify eithernortheta, but not both.

theta

(degree heterogeneity) A numeric vectorexplicitly specifying the degree heterogeneityparameters. This implicitly determines the number of nodesin the resulting graph, i.e. it will havelength(theta) nodes.Must be positive. Setting to a vector of ones recoversa stochastic blockmodel without degree correction.Defaults toNULL. You must specify eithern ortheta,but not both.

k

(mixing matrix) The number of blocks in the blockmodel.Use when you don't want to specify the mixing-matrix by hand.Whenk is specified, the elements ofB are drawnrandomly from aUniform(0, 1) distribution.This is subject to change, and may not be reproducible.k defaults toNULL. You must specify eitherkorB, but not both.

B

(mixing matrix) Ak byk matrix of block connectionprobabilities. The probability that a node in blocki connectsto a node in communityj isPoisson(B[i, j]). Must bea square matrix.matrix andMatrix objects are bothacceptable. IfB is not symmetric, it will besymmetrized via the updateB := B + t(B) / 2. Defaults toNULL.You must specify eitherk orB, but not both.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

block_sizes

(block sizes) Number of nodes in each block,as a vector of integers. Must match the dimensions ofB, ork andmust sum ton. Defaults toNULL, in which case blocks are madeto be as balanced as possible. You can specify eitherpi orblock_sizes, but not both.

pi

(block sizes) Relative blockprobabilities. Must be positive, but do not need to sumto one, as they will be normalized internally.Must match the dimensions ofB ork. Defaults toNULL, in whichcase theblock_sizes argument will take precedence. Note thatyou can specify eitherpi orblock_sizes, but should notspecify both.

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block and bytheta. Useful for plotting.Defaults toTRUE. WhenTRUE, nodes are first sorted by blockmembership, and then by degree-correction parameters within each block.Additionally,pi is sorted in increasing order, and the columnsof theB matrix are permuted to match the new order ofpi.

force_identifiability

Logical indicating whether or not tonormalizetheta such that it sums to one within each block. DefaultstoFALSE, since this behavior can be surprise whentheta is setto a vector of all ones to recover the SBM case.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Anundirected_dcsbm S3 object, a subclass of theundirected_factor_model() with the following additionalfields:

Generative Model

There are two levels of randomness in a degree-correctedstochastic blockmodel. First, we randomly chose a blockmembership for each node in the blockmodel. This ishandled bydcsbm(). Then, given these block memberships,we randomly sample edges between nodes. This secondoperation is handled bysample_edgelist(),sample_sparse(),sample_igraph() andsample_tidygraph(), depending depending on your desiredgraph representation.

Block memberships

Letz_i represent the block membership of nodei.To generatez_i we sample from a categoricaldistribution (note that this is a special case of amultinomial) with parameter\pi, such that\pi_i represents the probability of ending up inthe ith block. Block memberships for each node are independent.

Degree heterogeneity

In addition to block membership, the DCSBM also allowsnodes to have different propensities for edge formation.We represent this propensity for nodei by a positivenumber\theta_i. Typically the\theta_i areconstrained to sum to one for identifiability purposes,but this doesn't really matter during sampling (i.e.without the sum constraint scalingB and\thetahas the same effect on edge probabilities, but whetherB or\theta is responsible for this changeis uncertain).

Edge formulation

Once we know the block membershipsz and the degreeheterogeneity parameterstheta, we need one moreingredient, which is the baseline intensity of connectionsbetween nodes in blocki and blockj. Then each edgeA_{i,j} is Poisson distributed with parameter

\lambda[i, j] = \theta_i \cdot B_{z_i, z_j} \cdot \theta_j.

See Also

Other stochastic block models:directed_dcsbm(),mmsbm(),overlapping_sbm(),planted_partition(),sbm()

Other undirected graphs:chung_lu(),erdos_renyi(),mmsbm(),overlapping_sbm(),planted_partition(),sbm()

Examples

set.seed(27)lazy_dcsbm <- dcsbm(n = 100, k = 5, expected_density = 0.01)lazy_dcsbm# sometimes you gotta let the world burn and# sample a wildly dense graphdense_lazy_dcsbm <- dcsbm(n = 50, k = 3, expected_density = 0.8)dense_lazy_dcsbm# explicitly setting the degree heterogeneity parameter,# mixing matrix, and relative community sizes rather# than using randomly generated defaultsk <- 5n <- 100B <- matrix(stats::runif(k * k), nrow = k, ncol = k)theta <- round(stats::rlnorm(n, 2))pi <- c(1, 2, 4, 1, 1)custom_dcsbm <- dcsbm(  theta = theta,  B = B,  pi = pi,  expected_degree = 50)custom_dcsbmedgelist <- sample_edgelist(custom_dcsbm)edgelistdcsbm_explicit_block_sizes <- dcsbm(  theta = rexp(100, 1 / 3) + 1,  B = B,  block_sizes = c(13, 17, 40, 14, 16),  expected_degree = 5)# respects block sizessummary(dcsbm_explicit_block_sizes$z)# efficient eigendecompostion that leverages low-rank structure in# E(A) so that you don't have to form E(A) to find eigenvectors,# as E(A) is typically dense. computation is# handled via RSpectrapopulation_eigs <- eigs_sym(custom_dcsbm)

Create a directed degree corrected stochastic blockmodel object

Description

To specify a degree-corrected stochastic blockmodel, you must specifythe degree-heterogeneity parameters (vian ortheta_out andtheta_in), the mixing matrix(viak_out andk_in, orB), and the relative blockprobabilities (optional, viap_out andpi_in).We provide defaults for most of theseoptions to enable rapid exploration, or you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_out_degree,expected_in_degree,orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

directed_dcsbm(  n = NULL,  theta_out = NULL,  theta_in = NULL,  k_out = NULL,  k_in = NULL,  B = NULL,  ...,  pi_out = rep(1/k_out, k_out),  pi_in = rep(1/k_in, k_in),  sort_nodes = TRUE,  force_identifiability = TRUE,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

(degree heterogeneity) The number of nodes in the blockmodel.Use when you don't want to specify the degree-heterogeneityparameterstheta_out andtheta_in by hand. Whenn is specified,theta_out andtheta_in are randomly generated fromaLogNormal(2, 1) distribution.This is subject to change, and may not be reproducible.n defaults toNULL. You must specify eithernortheta_out andtheta_in together, but not both.

theta_out

(degree heterogeneity) A numeric vectorexplicitly specifying the degree heterogeneityparameters. This implicitly determines the number of nodesin the resulting graph, i.e. it will havelength(theta_out) nodes.Must be positive. Setting to a vector of ones recoversa stochastic blockmodel without degree correction.Defaults toNULL. You must specify eithernortheta_out andtheta_in together, but not both.theta_outcontrols outgoing degree propensity, or, equivalently,row sums of the adjacency matrix.

theta_in

(degree heterogeneity) A numeric vectorexplicitly specifying the degree heterogeneityparameters. This implicitly determines the number of nodesin the resulting graph, i.e. it will havelength(theta) nodes.Must be positive. Setting to a vector of ones recoversa stochastic blockmodel without degree correction.Defaults toNULL. You must specify eithernortheta_out andtheta_in together, but not both.theta_incontrols incoming degree propensity, or, equivalently, columnsums of the adjacency matrix.

k_out

(mixing matrix) The number of outgoing blocks in the blockmodel.Use when you don't want to specify the mixing-matrix by hand.Whenk_out is specified, the elements ofB are drawnrandomly from aUniform(0, 1) distribution.This is subject to change, and may not be reproducible.k_out defaults toNULL. You must specify eitherk_out andk_in together, orB. You may specify all three at once, in whichcasek_out is only used to setpi_out (whenpi_out isleft at its default argument value).

k_in

(mixing matrix) The number of incoming blocks in the blockmodel.Use when you don't want to specify the mixing-matrix by hand.Whenk_in is specified, the elements ofB are drawnrandomly from aUniform(0, 1) distribution.This is subject to change, and may not be reproducible.k_in defaults toNULL. You may specify all three at once, in whichcasek_in is only used to setpi_in (whenpi_in isleft at its default argument value).

B

(mixing matrix) Ak_out byk_in matrix of block connectionprobabilities. The probability that a node in blocki connectsto a node in communityj isPoisson(B[i, j]).matrix andMatrix objects are both acceptable.Defaults toNULL. You must specify eitherk_out andk_in together, orB, but not both.

...

Arguments passed on todirected_factor_model

expected_in_degree

If specified, the desired expected in degreeof the graph. Specifyingexpected_in_degree simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

expected_out_degree

If specified, the desired expected out degreeof the graph. Specifyingexpected_out_degree simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

pi_out

(relative block probabilities) Relative blockprobabilities. Must be positive, but do not need to sumto one, as they will be normalized internally.Must match the rows ofB, ork_out. Defaults torep(1 / k_out, k_out), or a balanced outgoing blocks.

pi_in

(relative block probabilities) Relative blockprobabilities. Must be positive, but do not need to sumto one, as they will be normalized internally.Must match the columns ofB, ork_in. Defaults torep(1 / k_in, k_in), or a balanced incoming blocks.

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block. Useful for plotting.Defaults toTRUE. WhenTRUE, rows of the expected adjacency matrixare first sorted by outgoing block membership, and then byincoming degree-correction parameters within each incoming block.A similar sorting procedure occurs independently from the columns,according to the incoming blocks.Additionally,pi_out andpi_in are sorted in increasing order,and the columns of theB matrix are permuted to match thenew orderings.

force_identifiability

Logical indicating whether or not tonormalizetheta_out such that it sums to one within each incomingblock andtheta_in such that it sums to one within each outgoingblock. Defaults toTRUE.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Adirected_dcsbm S3 object, a subclass of thedirected_factor_model() with the following additionalfields:

Generative Model

There are two levels of randomness in a directed degree-correctedstochastic blockmodel. First, we randomly chose a incomingblock membership and an outgoing block membershipfor each node in the blockmodel. This ishandled bydirected_dcsbm(). Then, given these block memberships,we randomly sample edges between nodes. This secondoperation is handled bysample_edgelist(),sample_sparse(),sample_igraph() andsample_tidygraph(), depending on your desiredgraph representation.

Block memberships

Letx represent the incoming block membership of a nodeandy represent the outgoing block membership of a node.To generatex we sample from a categoricaldistribution with parameter\pi_out.To generatey we sample from a categoricaldistribution with parameter\pi_in.Block memberships are independent across nodes. Incoming and outgoingblock memberships of the same node are also independent.

Degree heterogeneity

In addition to block membership, the DCSBM alsonodes to have different propensities for incoming andoutgoing edge formation.We represent the propensity to form incoming edges for agiven node by a positive number\theta_out.We represent the propensity to form outgoing edges for agiven node by a positive number\theta_in.Typically the\theta_out (andtheta_in) across all nodes areconstrained to sum to one for identifiability purposes,but this doesn't really matter during sampling.

Edge formulation

Once we know the block membershipsx andyand the degree heterogeneity parameters\theta_{in} and\theta_{out}, we need one moreingredient, which is the baseline intensity of connectionsbetween nodes in blocki and blockj. Then each edge formsindependently according to a Poisson distribution withparameters

\lambda = \theta_{in} * B_{x, y} * \theta_{out}.

See Also

Other stochastic block models:dcsbm(),mmsbm(),overlapping_sbm(),planted_partition(),sbm()

Other directed graphs:directed_erdos_renyi()

Examples

set.seed(27)B <- matrix(0.2, nrow = 5, ncol = 8)diag(B) <- 0.9ddcsbm <- directed_dcsbm(  n = 100,  B = B,  k_out = 5,  k_in = 8,  expected_density = 0.01)ddcsbmpopulation_svd <- svds(ddcsbm)

Create an directed erdos renyi object

Description

Create an directed erdos renyi object

Usage

directed_erdos_renyi(  n,  ...,  p = NULL,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

Number of nodes in graph.

...

Arguments passed on todirected_factor_model

expected_in_degree

If specified, the desired expected in degreeof the graph. Specifyingexpected_in_degree simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

expected_out_degree

If specified, the desired expected out degreeof the graph. Specifyingexpected_out_degree simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

p

Probability of an edge between any two nodes. You must specifyeitherp,expected_in_degree, orexpected_out_degree.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Adirected_factor_model S3 class based on a listwith the following elements:

See Also

Other erdos renyi:erdos_renyi()

Other directed graphs:directed_dcsbm()

Examples

set.seed(87)er <- directed_erdos_renyi(n = 10, p = 0.1)erbig_er <- directed_erdos_renyi(n = 1000, expected_in_degree = 5)big_erA <- sample_sparse(er)A

Create a directed factor model graph

Description

A directed factor model graph is a directedgeneralized Poisson random dot product graph. The edgesin this graph are assumpted to be independent and Poissondistributed. The graph is parameterized by its expectedadjacency matrix, with is⁠E[A] = X S Y'⁠. We do not recommendthat causal users use this function, see insteaddirected_dcsbm()and related functions, which will formulate common variantsof the stochastic blockmodels as undirected factor modelswith lots of helpful input validation.

Usage

directed_factor_model(  X,  S,  Y,  ...,  expected_in_degree = NULL,  expected_out_degree = NULL,  expected_density = NULL,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

X

Amatrix() orMatrix::Matrix() representing real-valuedlatent node positions encoding community structure ofincoming edges. Entries must be positive.

S

Amatrix() orMatrix::Matrix() mixing matrix. Entriesmust be positive.

Y

Amatrix() orMatrix::Matrix() representing real-valuedlatent node positions encoding community structure ofoutgoing edges. Entries must be positive.

...

Ignored. For internal developer use only.

expected_in_degree

If specified, the desired expected in degreeof the graph. Specifyingexpected_in_degree simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

expected_out_degree

If specified, the desired expected out degreeof the graph. Specifyingexpected_out_degree simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Specify only one ofexpected_in_degree,expected_out_degree, andexpected_density.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Adirected_factor_model S3 class based on a listwith the following elements:

Examples

n <- 1000k1 <- 5k2 <- 3d <- 500X <- matrix(rpois(n = n * k1, 1), nrow = n)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1, ncol = k2)Y <- matrix(rexp(n = k2 * d, 1), nrow = d)fm <- directed_factor_model(X, S, Y)fmfm2 <- directed_factor_model(X, S, Y, expected_in_degree = 50)fm2

Compute the eigendecomposition of the expected adjacency matrix of an undirected factor model

Description

Compute the eigendecomposition of the expected adjacency matrix of an undirected factor model

Usage

## S3 method for class 'undirected_factor_model'eigs_sym(A, k = A$k, which = "LM", sigma = NULL, opts = list(), ...)

Arguments

A

Anundirected_factor_model().

k

Desired rank of decomposition.

which

Selection criterion. SeeDetails below.

sigma

Shift parameter. See sectionShift-And-Invert Mode.

opts

Control parameters related to the computingalgorithm. SeeDetails below.

...

Unused, included only for consistency with generic signature.

Details

Thewhich argument is a character stringthat specifies the type of eigenvalues to be computed.Possible values are:

"LM" Thek eigenvalues with largest magnitude. Here the magnitude means the Euclidean norm of complex numbers.
"SM" Thek eigenvalues with smallest magnitude.
"LR" Thek eigenvalues with largest real part.
"SR" Thek eigenvalues with smallest real part.
"LI" Thek eigenvalues with largest imaginary part.
"SI" Thek eigenvalues with smallest imaginary part.
"LA" Thek largest (algebraic) eigenvalues, considering any negative sign.
"SA" Thek smallest (algebraic) eigenvalues, considering any negative sign.
"BE" Computek eigenvalues, half from each end of the spectrum. Whenk is odd, compute more from the high and then from the low end.

eigs() with matrix types "matrix", "dgeMatrix", "dgCMatrix"and "dgRMatrix" can use "LM", "SM", "LR", "SR", "LI" and "SI".

eigs_sym() with all supported matrix types,andeigs() with symmetric matrix types("dsyMatrix", "dsCMatrix", and "dsRMatrix") can use "LM", "SM", "LA", "SA" and "BE".

Theopts argument is a list that can supply any of thefollowing parameters:

ncv

Number of Lanzcos basis vectors to use. More vectorswill result in faster convergence, but with greatermemory use. For general matrix,ncv must satisfyk+2\le ncv \le n, andfor symmetric matrix, the constraint isk < ncv \le n.Default ismin(n, max(2*k+1, 20)).

tol

Precision parameter. Default is 1e-10.

maxitr

Maximum number of iterations. Default is 1000.

retvec

Whether to compute eigenvectors. If FALSE,only calculate and return eigenvalues.

initvec

Initial vector of lengthn supplied to theArnoldi/Lanczos iteration. It may speed up the convergenceifinitvec is close to an eigenvector ofA.


Create an undirected erdos renyi object

Description

Create an undirected erdos renyi object

Usage

erdos_renyi(n, ..., p = NULL, poisson_edges = TRUE, allow_self_loops = TRUE)

Arguments

n

Number of nodes in graph.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

p

Probability of an edge between any two nodes. You must specifyeitherp orexpected_degree.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Anundirected_factor_model S3 class based on a listwith the following elements:

See Also

Other erdos renyi:directed_erdos_renyi()

Other undirected graphs:chung_lu(),dcsbm(),mmsbm(),overlapping_sbm(),planted_partition(),sbm()

Examples

set.seed(87)er <- erdos_renyi(n = 10, p = 0.1)erer <- erdos_renyi(n = 10, expected_density = 0.1)erbig_er <- erdos_renyi(n = 1000, expected_degree = 5)big_erA <- sample_sparse(er)A

Calculate the expected adjacency matrix

Description

Calculate the expected adjacency matrix

Usage

expectation(model, ...)## S3 method for class 'undirected_factor_model'expectation(model, ...)## S3 method for class 'directed_factor_model'expectation(model, ...)

Arguments

model

Adirected_factor_model() or anundirected_factor_model()object.

...

Unused.

Value

The expected value of the adjacency matrix, conditional on thelatent factorsX andY (if the model is directed).


Calculate the expected edges in Poisson RDPG graph

Description

These calculations are conditional on the latent factorsX andY.

Usage

expected_edges(factor_model, ...)expected_degrees(factor_model, ...)expected_degree(factor_model, ...)expected_in_degree(factor_model, ...)expected_out_degree(factor_model, ...)expected_density(factor_model, ...)

Arguments

factor_model

Adirected_factor_model() orundirected_factor_model().

...

Ignored. Do not use.

Details

Note that the runtime of thefastRG algorithm is proportional tothe expected number of edges in the graph. Expected edge count will bean underestimate of expected number of edges for Bernoulligraphs. See the Rohe et al for details.

Value

Expected edge counts, or graph densities.

References

Rohe, Karl, Jun Tao, Xintian Han, and Norbert Binkiewicz. 2017."A Note on Quickly Sampling a Sparse Matrix with Low Rank Expectation."Journal of Machine Learning Research; 19(77):1-13, 2018.https://www.jmlr.org/papers/v19/17-128.html

Examples

##### an undirected blockmodel examplen <- 100pop <- n / 2a <- .1b <- .05B <- matrix(c(a, b, b, a), nrow = 2)b_model <- sbm(n = n, B = B, poisson_edges = FALSE)b_modelA <- sample_sparse(b_model)# comparemean(rowSums(triu(A)))pop * a + pop * b # analytical average degree##### more generic examplesn <- 1000k <- 5X <- matrix(rpois(n = n * k, 1), nrow = n)S <- matrix(runif(n = k * k, 0, .1), nrow = k)ufm <- undirected_factor_model(X, S)expected_edges(ufm)expected_degree(ufm)eigs_sym(ufm)n <- 1000d <- 100k1 <- 5k2 <- 3X <- matrix(rpois(n = n * k1, 1), nrow = n)Y <- matrix(rpois(n = d * k2, 1), nrow = d)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1)dfm <- directed_factor_model(X = X, S = S, Y = Y)expected_edges(dfm)expected_in_degree(dfm)expected_out_degree(dfm)svds(dfm)

Create an undirected degree-corrected mixed membership stochastic blockmodel object

Description

To specify a degree-corrected mixed membership stochastic blockmodel, you must specifythe degree-heterogeneity parameters (vian ortheta),the mixing matrix (viak orB), and the relative blockpropensities (optional, viaalpha). We provide defaults for most of theseoptions to enable rapid exploration, or you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_degree orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

mmsbm(  n = NULL,  theta = NULL,  k = NULL,  B = NULL,  ...,  alpha = rep(1, k),  sort_nodes = TRUE,  force_pure = TRUE,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

(degree heterogeneity) The number of nodes in the blockmodel.Use when you don't want to specify the degree-heterogeneityparameterstheta by hand. Whenn is specified,thetais randomly generated from aLogNormal(2, 1) distribution.This is subject to change, and may not be reproducible.n defaults toNULL. You must specify eithernortheta, but not both.

theta

(degree heterogeneity) A numeric vectorexplicitly specifying the degree heterogeneityparameters. This implicitly determines the number of nodesin the resulting graph, i.e. it will havelength(theta) nodes.Must be positive. Setting to a vector of ones recoversa stochastic blockmodel without degree correction.Defaults toNULL. You must specify eithern ortheta,but not both.

k

(mixing matrix) The number of blocks in the blockmodel.Use when you don't want to specify the mixing-matrix by hand.Whenk is specified, the elements ofB are drawnrandomly from aUniform(0, 1) distribution.This is subject to change, and may not be reproducible.k defaults toNULL. You must specify eitherkorB, but not both.

B

(mixing matrix) Ak byk matrix of block connectionprobabilities. The probability that a node in blocki connectsto a node in communityj isPoisson(B[i, j]). Must bea square matrix.matrix andMatrix objects are bothacceptable. IfB is not symmetric, it will besymmetrized via the updateB := B + t(B) / 2. Defaults toNULL.You must specify eitherk orB, but not both.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

alpha

(relative block propensities) Relative blockpropensities, which are parameters of a Dirichlet distribution.All elments ofalpha must thus be positive.Must match the dimensions ofB ork. Defaults torep(1, k), or balanced membership across blocks.

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block and bytheta. Useful for plotting.Defaults toTRUE. WhenTRUE, nodes are first sorted by blockmembership, and then by degree-correction parameters within each block.Additionally,pi is sorted in increasing order, and the columnsof theB matrix are permuted to match the new order ofpi.

force_pure

Logical indicating whether or not to force presence of"pure nodes" (nodes that belong only to a single community) for the sakeof identifiability. To include pure nodes, block membership samplingfirst proceeds as per usual. Then, after it is complete,k nodesare chosen randomly as pure nodes, one for each block. Defaults toTRUE.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Anundirected_mmsbm S3 object, a subclass of theundirected_factor_model() with the following additionalfields:

Generative Model

There are two levels of randomness in a degree-correctedstochastic blockmodel. First, we randomly choose how mucheach node belongs to each block in the blockmodel. Each nodeis one unit of block membership to distribute. This ishandled bymmsbm(). Then, given these block memberships,we randomly sample edges between nodes. This secondoperation is handled bysample_edgelist(),sample_sparse(),sample_igraph() andsample_tidygraph(), depending depending on your desiredgraph representation.

Block memberships

LetZ_i by a vector on thek dimensional simplexrepresenting the block memberships of nodei.To generatez_i we sample from a Dirichletdistribution with parameter vector\alpha.Block memberships for each node are independent.

Degree heterogeneity

In addition to block membership, the MMSBM also allowsnodes to have different propensities for edge formation.We represent this propensity for nodei by a positivenumber\theta_i.

Edge formulation

Once we know the block membership vectorz_i, z_j and the degreeheterogeneity parameters\theta, we need one moreingredient, which is the baseline intensity of connectionsbetween nodes in blocki and blockj. This is given by ak \times k matrixB. Then each edgeA_{i,j} is Poisson distributed with parameter

\lambda_{i, j} = \theta_i \cdot z_i^T B z_j \cdot \theta_j.

See Also

Other stochastic block models:dcsbm(),directed_dcsbm(),overlapping_sbm(),planted_partition(),sbm()

Other undirected graphs:chung_lu(),dcsbm(),erdos_renyi(),overlapping_sbm(),planted_partition(),sbm()

Examples

set.seed(27)lazy_mmsbm <- mmsbm(n = 100, k = 5, expected_density = 0.01)lazy_mmsbm# sometimes you gotta let the world burn and# sample a wildly dense graphdense_lazy_mmsbm <- mmsbm(n = 500, k = 3, expected_density = 0.8)dense_lazy_mmsbm# explicitly setting the degree heterogeneity parameter,# mixing matrix, and relative community sizes rather# than using randomly generated defaultsk <- 5n <- 100B <- matrix(stats::runif(k * k), nrow = k, ncol = k)theta <- round(stats::rlnorm(n, 2))alpha <- c(1, 2, 4, 1, 1)custom_mmsbm <- mmsbm(  theta = theta,  B = B,  alpha = alpha,  expected_degree = 50)custom_mmsbmedgelist <- sample_edgelist(custom_mmsbm)edgelist# efficient eigendecompostion that leverages low-rank structure in# E(A) so that you don't have to form E(A) to find eigenvectors,# as E(A) is typically dense. computation is# handled via RSpectrapopulation_eigs <- eigs_sym(custom_mmsbm)svds(custom_mmsbm)$d

Create an undirected overlapping degree corrected stochastic blockmodel object

Description

To specify a overlapping stochastic blockmodel, you must specifythe number of nodes (vian),the mixing matrix (viak orB), and the blockprobabilities (optional, viapi). We provide defaults for most of theseoptions to enable rapid exploration, or you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_degree orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

overlapping_sbm(  n,  k = NULL,  B = NULL,  ...,  pi = rep(1/k, k),  sort_nodes = TRUE,  force_pure = TRUE,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

The number of nodes in the overlapping SBM.

k

(mixing matrix) The number of blocks in the blockmodel.Use when you don't want to specify the mixing-matrix by hand.Whenk is specified,B is set to a diagonal dominant matrix withvalue0.8 along the diagonal and0.1 / (k - 1) on theoff-diagonal.k defaults toNULL. You must specify eitherkorB, but not both.

B

(mixing matrix) Ak byk matrix of block connectionprobabilities. The probability that a node in blocki connectsto a node in communityj isPoisson(B[i, j]). Must beaninvertible, symmetric square matrix.matrix andMatrix objects are bothacceptable. IfB is not symmetric, it will besymmetrized via the updateB := B + t(B). Defaults toNULL.You must specify eitherk orB, but not both.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

pi

(block probabilities) Probability of membership in eachblock. Membership in each block is independent under theoverlapping SBM. Defaults torep(1 / k, k).

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block. Useful for plotting.Defaults toTRUE.

force_pure

Logical indicating whether or not to force presence of"pure nodes" (nodes that belong only to a single community) for the sakeof identifiability. To include pure nodes, block membership samplingfirst proceeds as per usual. Then, after it is complete,k nodesare chosen randomly as pure nodes, one for each block. Defaults toTRUE.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Anundirected_overlapping_sbm S3 object, a subclass of theundirected_factor_model() with the following additionalfields:

Generative Model

There are two levels of randomness in a degree-correctedoverlapping stochastic blockmodel. First, for each node, weindependently determine if that node is a member of each block. This ishandled byoverlapping_sbm(). Then, given these block memberships,we randomly sample edges between nodes. This secondoperation is handled bysample_edgelist(),sample_sparse(),sample_igraph() andsample_tidygraph(), depending depending on your desiredgraph representation.

Identifiability

In order to be identifiable, an overlapping SBM must satisfy two conditions:

  1. B must be invertible, and

  2. the must be at least one "pure node" in each block that belongs to noother blocks.

Block memberships

Note that some nodes may not belong to any blocks.

TODO

Edge formulation

Once we know the block memberships, we need one moreingredient, which is the baseline intensity of connectionsbetween nodes in blocki and blockj. Then each edgeA_{i,j} is Poisson distributed with parameter

TODO

References

Kaufmann, Emilie, Thomas Bonald, and Marc Lelarge."A Spectral Algorithm with Additive Clustering for the Recovery ofOverlapping Communities in Networks," Vol. 9925.Lecture Notes in Computer Science.Cham: Springer International Publishing, 2016.https://doi.org/10.1007/978-3-319-46379-7.

Latouche, Pierre, Etienne Birmelé, and Christophe Ambroise."Overlapping Stochastic Block Models with Application to theFrench Political Blogosphere." The Annals of Applied Statistics 5,no. 1 (March 2011): 309–36. https://doi.org/10.1214/10-AOAS382.

Zhang, Yuan, Elizaveta Levina, and Ji Zhu. "DetectingOverlapping Communities in Networks Using Spectral Methods."ArXiv:1412.3432, December 10, 2014. http://arxiv.org/abs/1412.3432.

See Also

Other stochastic block models:dcsbm(),directed_dcsbm(),mmsbm(),planted_partition(),sbm()

Other undirected graphs:chung_lu(),dcsbm(),erdos_renyi(),mmsbm(),planted_partition(),sbm()

Examples

set.seed(27)lazy_overlapping_sbm <- overlapping_sbm(n = 1000, k = 5, expected_density = 0.01)lazy_overlapping_sbm# sometimes you gotta let the world burn and# sample a wildly dense graphdense_lazy_overlapping_sbm <- overlapping_sbm(n = 500, k = 3, expected_density = 0.8)dense_lazy_overlapping_sbmk <- 5n <- 1000B <- matrix(stats::runif(k * k), nrow = k, ncol = k)pi <- c(1, 2, 4, 1, 1) / 5custom_overlapping_sbm <- overlapping_sbm(  n = 200,  B = B,  pi = pi,  expected_degree = 5)custom_overlapping_sbmedgelist <- sample_edgelist(custom_overlapping_sbm)edgelist# efficient eigendecompostion that leverages low-rank structure in# E(A) so that you don't have to form E(A) to find eigenvectors,# as E(A) is typically dense. computation is# handled via RSpectrapopulation_eigs <- eigs_sym(custom_overlapping_sbm)

Create an undirected planted partition object

Description

To specify a planted partition model, you must specifythe number of nodes (vian), the mixing matrix (optional, either viawithin_block/between_block ora/b),and the relative block probabilites (optional, viapi).We provide defaults for most of these options to enablerapid exploration, or you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_degree orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

planted_partition(  n,  k,  ...,  within_block = NULL,  between_block = NULL,  a = NULL,  b = NULL,  block_sizes = NULL,  pi = NULL,  sort_nodes = TRUE,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

The number of nodes in the network. Must bea positive integer. This argument is required.

k

Number of planted partitions, as a positive integer.This argument is required.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

within_block

Probability of within block edges. Must bestrictly between zero and one. Must specify eitherwithin_block andbetween_block, ora andb to determineedge probabilities.

between_block

Probability of between block edges. Must bestrictly between zero and one. Must specify eitherwithin_block andbetween_block, ora andb to determineedge probabilities.

a

Integer such thata/n is the probability of edgeswithin a block. Useful for sparse graphs. Must specify eitherwithin_block andbetween_block, ora andb to determineedge probabilities.

b

Integer such thatb/n is the probability of edgesbetween blocks. Useful for sparse graphs. Must specify eitherwithin_block andbetween_block, ora andb to determineedge probabilities.

block_sizes

(block sizes) Number of nodes in each block,as a vector of integers. Must match the dimensions ofB, ork andmust sum ton. Defaults toNULL, in which case blocks are madeto be as balanced as possible. You can specify eitherpi orblock_sizes, but not both.

pi

(block sizes) Relative blockprobabilities. Must be positive, but do not need to sumto one, as they will be normalized internally.Must match the dimensions ofB ork. Defaults toNULL, in whichcase theblock_sizes argument will take precedence. Note thatyou can specify eitherpi orblock_sizes, but should notspecify both.

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block and bytheta. Useful for plotting.Defaults toTRUE. WhenTRUE, nodes are first sorted by blockmembership, and then by degree-correction parameters within each block.Additionally,pi is sorted in increasing order, and the columnsof theB matrix are permuted to match the new order ofpi.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Details

A planted partition model is stochastic blockmodel in whichthe diagonal and the off-diagonal of the mixing matrixBare both constant. This means that edge probabilitiesdepend only on whether two nodes belong to the same block,or to different blocks, but the particular blocks themselvesdon't have any impact apart from this.

Value

Anundirected_planted_partition S3 object, which is a subclassof thesbm() object, with additional fields:

See Also

Other stochastic block models:dcsbm(),directed_dcsbm(),mmsbm(),overlapping_sbm(),sbm()

Other undirected graphs:chung_lu(),dcsbm(),erdos_renyi(),mmsbm(),overlapping_sbm(),sbm()

Examples

set.seed(27)lazy_pp <- planted_partition(  n = 100,  k = 5,  expected_density = 0.01,  within_block = 0.1,  between_block = 0.01)lazy_pp

Plot (expected) adjacency matrices

Description

Plot (expected) adjacency matrices

Usage

plot_expectation(model)plot_dense_matrix(A, ...)plot_sparse_matrix(A)

Arguments

model

Adirected_factor_model() or anundirected_factor_model()object.

A

Amatrix(),Matrix::Matrix() orMatrix::sparseMatrix() object.

...

Unused.

Value

Aggplot2::ggplot2() plot.

Examples

set.seed(27)model <- dcsbm(n = 10, k = 2, expected_density = 0.2)plot_expectation(model)A <- sample_sparse(model)plot_sparse_matrix(A)

Objects exported from other packages

Description

These objects are imported from other packages. Follow the linksbelow to see their documentation.

RSpectra

eigs_sym,svds


Sample a random edgelist from a random dot product graph

Description

There are two steps to using thefastRG package. First,you must parameterize a random dot product graph bysampling the latent factors. Use functions such asdcsbm(),sbm(), etc, to perform this specification.Then, use⁠sample_*()⁠ functions to generate a random graphin your preferred format.

Usage

sample_edgelist(factor_model, ...)## S3 method for class 'undirected_factor_model'sample_edgelist(factor_model, ...)## S3 method for class 'directed_factor_model'sample_edgelist(factor_model, ...)

Arguments

factor_model

Adirected_factor_model() orundirected_factor_model().

...

Ignored. Do not use.

Details

This function implements thefastRG algorithm asdescribed in Rohe et al (2017). Please see the paper(which is short and open access!!) for details.

Value

A single realization of a random Poisson (or Bernoulli)Dot Product Graph, represented as atibble::tibble() with twointeger columns,from andto.

NOTE: Indices for isolated nodes will not appear in the edgelist!This can lead to issues if you construct network objects from theedgelist directly.

In the undirected case,from andto do not encodeinformation about edge direction, but we will always havefrom <= to for convenience of edge identification.

To avoid handling such considerations yourself, we recommend usingsample_sparse(),sample_igraph(), andsample_tidygraph()oversample_edgelist().

References

Rohe, Karl, Jun Tao, Xintian Han, and Norbert Binkiewicz. 2017."A Note on Quickly Sampling a Sparse Matrix with Low Rank Expectation."Journal of Machine Learning Research; 19(77):1-13, 2018.https://www.jmlr.org/papers/v19/17-128.html

See Also

Other samplers:sample_edgelist.matrix(),sample_igraph(),sample_sparse(),sample_tidygraph()

Examples

library(igraph)library(tidygraph)set.seed(27)##### undirected examples ----------------------------n <- 100k <- 5X <- matrix(rpois(n = n * k, 1), nrow = n)S <- matrix(runif(n = k * k, 0, .1), nrow = k)# S will be symmetrized internal here, or left unchanged if# it is already symmetricufm <- undirected_factor_model(  X, S,  expected_density = 0.1)ufm### sampling graphs as edgelists ----------------------edgelist <- sample_edgelist(ufm)edgelist### sampling graphs as sparse matrices ----------------A <- sample_sparse(ufm)inherits(A, "dsCMatrix")isSymmetric(A)dim(A)B <- sample_sparse(ufm)inherits(B, "dsCMatrix")isSymmetric(B)dim(B)### sampling graphs as igraph graphs ------------------sample_igraph(ufm)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(ufm)##### directed examples ----------------------------n2 <- 100k1 <- 5k2 <- 3d <- 50X <- matrix(rpois(n = n2 * k1, 1), nrow = n2)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1, ncol = k2)Y <- matrix(rexp(n = k2 * d, 1), nrow = d)fm <- directed_factor_model(X, S, Y, expected_in_degree = 2)fm### sampling graphs as edgelists ----------------------edgelist2 <- sample_edgelist(fm)edgelist2### sampling graphs as sparse matrices ----------------A2 <- sample_sparse(fm)inherits(A2, "dgCMatrix")isSymmetric(A2)dim(A2)B2 <- sample_sparse(fm)inherits(B2, "dgCMatrix")isSymmetric(B2)dim(B2)### sampling graphs as igraph graphs ------------------# since the number of rows and the number of columns# in `fm` differ, we will get a bipartite igraph here# creating the bipartite igraph is slow relative to other# sampling -- if this is a blocker for# you please open an issue and we can investigate speedupsdig <- sample_igraph(fm)is_bipartite(dig)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(fm)

Low level interface to sample RPDG edgelists

Description

This is a brakes-off, no safety checks interface.We strongly recommend that you do not callsample_edgelist.matrix() unless you know what you are doing,and even then, we still do not recommend it, as you willbypass all typical input validation.extremely loud coughing All those who bypass inputvalidation suffer foolishly at their own hand.extremely loud coughing

Usage

## S3 method for class 'matrix'sample_edgelist(  factor_model,  S,  Y,  directed,  poisson_edges,  allow_self_loops,  ...)## S3 method for class 'Matrix'sample_edgelist(  factor_model,  S,  Y,  directed,  poisson_edges,  allow_self_loops,  ...)

Arguments

factor_model

Ann byk1matrix() orMatrix::Matrix()of latent node positions encoding incoming edge community membership.TheX matrix in Rohe et al (2017). Naming differs only forconsistency with the S3 generic.

S

Ak1 byk2 mixingmatrix() orMatrix::Matrix(). Inthe undirect case this is assumed to be symmetric butwe do notcheck that this is the case.

Y

Ad byk2matrix() orMatrix::Matrix() of latentnode positions encoding outgoing edge community membership.

directed

Logical indicating whether or not the graph should bedirected. Whendirected = FALSE, symmetrizesS internally.Y = X together with a symmetricS implies a symmetricexpectation (although not necessarily an undirected graph).Whendirected = FALSE, samples a directed graph withsymmetric expectation, and then adds edges until symmetryis achieved.

poisson_edges

Whether or not to remove duplicate edgesafter sampling. See Section 2.3 of Rohe et al. (2017)for some additional details. Defaults toTRUE.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

...

Ignored, for generic consistency only.

Details

This function implements thefastRG algorithm asdescribed in Rohe et al (2017). Please see the paper(which is short and open access!!) for details.

Value

A single realization of a random Poisson (or Bernoulli)Dot Product Graph, represented as atibble::tibble() with twointeger columns,from andto.

NOTE: Indices for isolated nodes will not appear in the edgelist!This can lead to issues if you construct network objects from theedgelist directly.

In the undirected case,from andto do not encodeinformation about edge direction, but we will always havefrom <= to for convenience of edge identification.

To avoid handling such considerations yourself, we recommend usingsample_sparse(),sample_igraph(), andsample_tidygraph()oversample_edgelist().

References

Rohe, Karl, Jun Tao, Xintian Han, and Norbert Binkiewicz. 2017."A Note on Quickly Sampling a Sparse Matrix with Low Rank Expectation."Journal of Machine Learning Research; 19(77):1-13, 2018.https://www.jmlr.org/papers/v19/17-128.html

See Also

Other samplers:sample_edgelist(),sample_igraph(),sample_sparse(),sample_tidygraph()

Examples

set.seed(46)n <- 10000d <- 1000k1 <- 5k2 <- 3X <- matrix(rpois(n = n * k1, 1), nrow = n)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1)Y <- matrix(rpois(n = d * k2, 1), nrow = d)sample_edgelist(X, S, Y, TRUE, TRUE, TRUE)

Sample a random dot product graph as an igraph graph

Description

There are two steps to using thefastRG package. First,you must parameterize a random dot product graph bysampling the latent factors. Use functions such asdcsbm(),sbm(), etc, to perform this specification.Then, use⁠sample_*()⁠ functions to generate a random graphin your preferred format.

Usage

sample_igraph(factor_model, ...)## S3 method for class 'undirected_factor_model'sample_igraph(factor_model, ...)## S3 method for class 'directed_factor_model'sample_igraph(factor_model, ...)

Arguments

factor_model

Adirected_factor_model() orundirected_factor_model().

...

Ignored. Do not use.

Details

This function implements thefastRG algorithm asdescribed in Rohe et al (2017). Please see the paper(which is short and open access!!) for details.

Value

Anigraph::igraph() object that is possibly amultigraph (that is, we take there to be multiple edgesrather than weighted edges).

Whenfactor_model isundirected:

- the graph is undirected and one-mode.

Whenfactor_model isdirected andsquare:

- the graph is directed and one-mode.

Whenfactor_model isdirected andrectangular:

- the graph is undirected and bipartite.

Note that working with bipartite graphs inigraph is morecomplex than working with one-mode graphs.

References

Rohe, Karl, Jun Tao, Xintian Han, and Norbert Binkiewicz. 2017."A Note on Quickly Sampling a Sparse Matrix with Low Rank Expectation."Journal of Machine Learning Research; 19(77):1-13, 2018.https://www.jmlr.org/papers/v19/17-128.html

See Also

Other samplers:sample_edgelist(),sample_edgelist.matrix(),sample_sparse(),sample_tidygraph()

Examples

library(igraph)library(tidygraph)set.seed(27)##### undirected examples ----------------------------n <- 100k <- 5X <- matrix(rpois(n = n * k, 1), nrow = n)S <- matrix(runif(n = k * k, 0, .1), nrow = k)# S will be symmetrized internal here, or left unchanged if# it is already symmetricufm <- undirected_factor_model(  X, S,  expected_density = 0.1)ufm### sampling graphs as edgelists ----------------------edgelist <- sample_edgelist(ufm)edgelist### sampling graphs as sparse matrices ----------------A <- sample_sparse(ufm)inherits(A, "dsCMatrix")isSymmetric(A)dim(A)B <- sample_sparse(ufm)inherits(B, "dsCMatrix")isSymmetric(B)dim(B)### sampling graphs as igraph graphs ------------------sample_igraph(ufm)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(ufm)##### directed examples ----------------------------n2 <- 100k1 <- 5k2 <- 3d <- 50X <- matrix(rpois(n = n2 * k1, 1), nrow = n2)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1, ncol = k2)Y <- matrix(rexp(n = k2 * d, 1), nrow = d)fm <- directed_factor_model(X, S, Y, expected_in_degree = 2)fm### sampling graphs as edgelists ----------------------edgelist2 <- sample_edgelist(fm)edgelist2### sampling graphs as sparse matrices ----------------A2 <- sample_sparse(fm)inherits(A2, "dgCMatrix")isSymmetric(A2)dim(A2)B2 <- sample_sparse(fm)inherits(B2, "dgCMatrix")isSymmetric(B2)dim(B2)### sampling graphs as igraph graphs ------------------# since the number of rows and the number of columns# in `fm` differ, we will get a bipartite igraph here# creating the bipartite igraph is slow relative to other# sampling -- if this is a blocker for# you please open an issue and we can investigate speedupsdig <- sample_igraph(fm)is_bipartite(dig)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(fm)

Sample a random dot product graph as a sparse Matrix

Description

There are two steps to using thefastRG package. First,you must parameterize a random dot product graph bysampling the latent factors. Use functions such asdcsbm(),sbm(), etc, to perform this specification.Then, use⁠sample_*()⁠ functions to generate a random graphin your preferred format.

Usage

sample_sparse(factor_model, ...)## S3 method for class 'undirected_factor_model'sample_sparse(factor_model, ...)## S3 method for class 'directed_factor_model'sample_sparse(factor_model, ...)

Arguments

factor_model

Adirected_factor_model() orundirected_factor_model().

...

Ignored. Do not use.

Details

This function implements thefastRG algorithm asdescribed in Rohe et al (2017). Please see the paper(which is short and open access!!) for details.

Value

For undirected factor models, a sparseMatrix::Matrix() of classdsCMatrix. In particular,this means theMatrix object (1) hasdouble data type, (2) is symmetric, and (3) is incolumn compressed storage format.

For directed factor models, a sparseMatrix::Matrix() of classdgCMatrix. This meanstheMatrix object (1) has double data type,(2) innot symmetric, and (3) is in columncompressed storage format.

To reiterate: for undirected graphs, you will geta symmetric matrix. For directed graphs, you willget a general sparse matrix.

References

Rohe, Karl, Jun Tao, Xintian Han, and Norbert Binkiewicz. 2017."A Note on Quickly Sampling a Sparse Matrix with Low Rank Expectation."Journal of Machine Learning Research; 19(77):1-13, 2018.https://www.jmlr.org/papers/v19/17-128.html

See Also

Other samplers:sample_edgelist(),sample_edgelist.matrix(),sample_igraph(),sample_tidygraph()

Examples

library(igraph)library(tidygraph)set.seed(27)##### undirected examples ----------------------------n <- 100k <- 5X <- matrix(rpois(n = n * k, 1), nrow = n)S <- matrix(runif(n = k * k, 0, .1), nrow = k)# S will be symmetrized internal here, or left unchanged if# it is already symmetricufm <- undirected_factor_model(  X, S,  expected_density = 0.1)ufm### sampling graphs as edgelists ----------------------edgelist <- sample_edgelist(ufm)edgelist### sampling graphs as sparse matrices ----------------A <- sample_sparse(ufm)inherits(A, "dsCMatrix")isSymmetric(A)dim(A)B <- sample_sparse(ufm)inherits(B, "dsCMatrix")isSymmetric(B)dim(B)### sampling graphs as igraph graphs ------------------sample_igraph(ufm)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(ufm)##### directed examples ----------------------------n2 <- 100k1 <- 5k2 <- 3d <- 50X <- matrix(rpois(n = n2 * k1, 1), nrow = n2)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1, ncol = k2)Y <- matrix(rexp(n = k2 * d, 1), nrow = d)fm <- directed_factor_model(X, S, Y, expected_in_degree = 2)fm### sampling graphs as edgelists ----------------------edgelist2 <- sample_edgelist(fm)edgelist2### sampling graphs as sparse matrices ----------------A2 <- sample_sparse(fm)inherits(A2, "dgCMatrix")isSymmetric(A2)dim(A2)B2 <- sample_sparse(fm)inherits(B2, "dgCMatrix")isSymmetric(B2)dim(B2)### sampling graphs as igraph graphs ------------------# since the number of rows and the number of columns# in `fm` differ, we will get a bipartite igraph here# creating the bipartite igraph is slow relative to other# sampling -- if this is a blocker for# you please open an issue and we can investigate speedupsdig <- sample_igraph(fm)is_bipartite(dig)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(fm)

Sample a random dot product graph as a tidygraph graph

Description

There are two steps to using thefastRG package. First,you must parameterize a random dot product graph bysampling the latent factors. Use functions such asdcsbm(),sbm(), etc, to perform this specification.Then, use⁠sample_*()⁠ functions to generate a random graphin your preferred format.

Usage

sample_tidygraph(factor_model, ...)## S3 method for class 'undirected_factor_model'sample_tidygraph(factor_model, ...)## S3 method for class 'directed_factor_model'sample_tidygraph(factor_model, ...)

Arguments

factor_model

Adirected_factor_model() orundirected_factor_model().

...

Ignored. Do not use.

Details

This function implements thefastRG algorithm asdescribed in Rohe et al (2017). Please see the paper(which is short and open access!!) for details.

Value

Atidygraph::tbl_graph() object that is possibly amultigraph (that is, we take there to be multiple edgesrather than weighted edges).

Whenfactor_model isundirected:

- the graph is undirected and one-mode.

Whenfactor_model isdirected andsquare:

- the graph is directed and one-mode.

Whenfactor_model isdirected andrectangular:

- the graph is undirected and bipartite.

Note that working with bipartite graphs intidygraph is morecomplex than working with one-mode graphs.

References

Rohe, Karl, Jun Tao, Xintian Han, and Norbert Binkiewicz. 2017."A Note on Quickly Sampling a Sparse Matrix with Low Rank Expectation."Journal of Machine Learning Research; 19(77):1-13, 2018.https://www.jmlr.org/papers/v19/17-128.html

See Also

Other samplers:sample_edgelist(),sample_edgelist.matrix(),sample_igraph(),sample_sparse()

Examples

library(igraph)library(tidygraph)set.seed(27)##### undirected examples ----------------------------n <- 100k <- 5X <- matrix(rpois(n = n * k, 1), nrow = n)S <- matrix(runif(n = k * k, 0, .1), nrow = k)# S will be symmetrized internal here, or left unchanged if# it is already symmetricufm <- undirected_factor_model(  X, S,  expected_density = 0.1)ufm### sampling graphs as edgelists ----------------------edgelist <- sample_edgelist(ufm)edgelist### sampling graphs as sparse matrices ----------------A <- sample_sparse(ufm)inherits(A, "dsCMatrix")isSymmetric(A)dim(A)B <- sample_sparse(ufm)inherits(B, "dsCMatrix")isSymmetric(B)dim(B)### sampling graphs as igraph graphs ------------------sample_igraph(ufm)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(ufm)##### directed examples ----------------------------n2 <- 100k1 <- 5k2 <- 3d <- 50X <- matrix(rpois(n = n2 * k1, 1), nrow = n2)S <- matrix(runif(n = k1 * k2, 0, .1), nrow = k1, ncol = k2)Y <- matrix(rexp(n = k2 * d, 1), nrow = d)fm <- directed_factor_model(X, S, Y, expected_in_degree = 2)fm### sampling graphs as edgelists ----------------------edgelist2 <- sample_edgelist(fm)edgelist2### sampling graphs as sparse matrices ----------------A2 <- sample_sparse(fm)inherits(A2, "dgCMatrix")isSymmetric(A2)dim(A2)B2 <- sample_sparse(fm)inherits(B2, "dgCMatrix")isSymmetric(B2)dim(B2)### sampling graphs as igraph graphs ------------------# since the number of rows and the number of columns# in `fm` differ, we will get a bipartite igraph here# creating the bipartite igraph is slow relative to other# sampling -- if this is a blocker for# you please open an issue and we can investigate speedupsdig <- sample_igraph(fm)is_bipartite(dig)### sampling graphs as tidygraph graphs ---------------sample_tidygraph(fm)

Create an undirected stochastic blockmodel object

Description

To specify a stochastic blockmodel, you must specifythe number of nodes (vian), the mixing matrix (viak orB),and the relative block probabilites (optional, viapi).We provide defaults for most of these options to enablerapid exploration, or you can invest the effortfor more control over the model parameters. Westrongly recommendsetting theexpected_degree orexpected_density argumentto avoid large memory allocations associated withsampling large, dense graphs.

Usage

sbm(  n,  k = NULL,  B = NULL,  ...,  block_sizes = NULL,  pi = NULL,  sort_nodes = TRUE,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

n

The number of nodes in the network. Must bea positive integer. This argument is required.

k

(mixing matrix) The number of blocks in the blockmodel.Use when you don't want to specify the mixing-matrix by hand.Whenk is specified, the elements ofB are drawnrandomly from aUniform(0, 1) distribution.This is subject to change, and may not be reproducible.k defaults toNULL. You must specify eitherkorB, but not both.

B

(mixing matrix) Ak byk matrix of block connectionprobabilities. The probability that a node in blocki connectsto a node in communityj isPoisson(B[i, j]). Must bea square matrix.matrix andMatrix objects are bothacceptable. IfB is not symmetric, it will besymmetrized via the updateB := B + t(B) / 2. Defaults toNULL.You must specify eitherk orB, but not both.

...

Arguments passed on toundirected_factor_model

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

block_sizes

(block sizes) Number of nodes in each block,as a vector of integers. Must match the dimensions ofB, ork andmust sum ton. Defaults toNULL, in which case blocks are madeto be as balanced as possible. You can specify eitherpi orblock_sizes, but not both.

pi

(block sizes) Relative blockprobabilities. Must be positive, but do not need to sumto one, as they will be normalized internally.Must match the dimensions ofB ork. Defaults toNULL, in whichcase theblock_sizes argument will take precedence. Note thatyou can specify eitherpi orblock_sizes, but should notspecify both.

sort_nodes

Logical indicating whether or not to sort the nodesso that they are grouped by block and bytheta. Useful for plotting.Defaults toTRUE. WhenTRUE, nodes are first sorted by blockmembership, and then by degree-correction parameters within each block.Additionally,pi is sorted in increasing order, and the columnsof theB matrix are permuted to match the new order ofpi.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Details

A stochastic block is equivalent to a degree-correctedstochastic blockmodel where the degree heterogeneity parametershave all been set equal to 1.

Value

Anundirected_sbm S3 object, which is a subclass of thedcsbm() object.

See Also

Other stochastic block models:dcsbm(),directed_dcsbm(),mmsbm(),overlapping_sbm(),planted_partition()

Other undirected graphs:chung_lu(),dcsbm(),erdos_renyi(),mmsbm(),overlapping_sbm(),planted_partition()

Examples

set.seed(27)lazy_sbm <- sbm(n = 100, k = 5, expected_density = 0.01)lazy_sbm# by default we get a multigraph (i.e. multiple edges are# allowed between the same two nodes). using bernoulli edges# will with an adjacency matrix with only zeroes and onesbernoulli_sbm <- sbm(  n = 500,  k = 30,  poisson_edges = FALSE,  expected_degree = 8)bernoulli_sbmedgelist <- sample_edgelist(bernoulli_sbm)edgelistA <- sample_sparse(bernoulli_sbm)# only zeroes and ones!sign(A)# sbm with repeated eigenvalues# block sizes equal by default, needed to prevent variation in spectrum# from variation in block sizes. also need B to have a single repeated# eigenvaluerepeated_eigen <- sbm(  n = 100,  B = diag(rep(0.8, 5)),  expected_degree = 10)# exactly repeated eigenvalues in the populatione <- eigs_sym(repeated_eigen)e$values

Compute the singular value decomposition of the expected adjacency matrix of a directed factor model

Description

Compute the singular value decomposition of the expected adjacency matrix of a directed factor model

Usage

## S3 method for class 'directed_factor_model'svds(A, k = min(A$k1, A$k2), nu = k, nv = k, opts = list(), ...)

Arguments

A

Anundirected_factor_model().

k

Desired rank of decomposition.

nu

Number of left singular vectors to be computed. This mustbe between 0 andk.

nv

Number of right singular vectors to be computed. This mustbe between 0 andk.

opts

Control parameters related to the computingalgorithm. SeeDetails below.

...

Unused, included only for consistency with generic signature.

Details

Theopts argument is a list that can supply any of thefollowing parameters:

ncv

Number of Lanzcos basis vectors to use. More vectorswill result in faster convergence, but with greatermemory use.ncv must be satisfyk < ncv \le p wherep = min(m, n).Default ismin(p, max(2*k+1, 20)).

tol

Precision parameter. Default is 1e-10.

maxitr

Maximum number of iterations. Default is 1000.

center

Either a logical value (TRUE/FALSE), or a numericvector of lengthn. If a vectorc is supplied, thenSVD is computed on the matrixA - 1c',in an implicit way without actually forming this matrix.center = TRUE has the same effect ascenter = colMeans(A). Default isFALSE.

scale

Either a logical value (TRUE/FALSE), or a numericvector of lengthn. If a vectors is supplied, thenSVD is computed on the matrix(A - 1c')S,wherec is the centering vector andS = diag(1/s).Ifscale = TRUE, then the vectors is computed asthe column norm ofA - 1c'.Default isFALSE.


Compute the singular value decomposition of the expected adjacency matrix of an undirected factor model

Description

Compute the singular value decomposition of the expected adjacency matrix of an undirected factor model

Usage

## S3 method for class 'undirected_factor_model'svds(A, k = A$k, nu = k, nv = k, opts = list(), ...)

Arguments

A

Anundirected_factor_model().

k

Desired rank of decomposition.

nu

Number of left singular vectors to be computed. This mustbe between 0 andk.

nv

Number of right singular vectors to be computed. This mustbe between 0 andk.

opts

Control parameters related to the computingalgorithm. SeeDetails below.

...

Unused, included only for consistency with generic signature.

Details

Theopts argument is a list that can supply any of thefollowing parameters:

ncv

Number of Lanzcos basis vectors to use. More vectorswill result in faster convergence, but with greatermemory use.ncv must be satisfyk < ncv \le p wherep = min(m, n).Default ismin(p, max(2*k+1, 20)).

tol

Precision parameter. Default is 1e-10.

maxitr

Maximum number of iterations. Default is 1000.

center

Either a logical value (TRUE/FALSE), or a numericvector of lengthn. If a vectorc is supplied, thenSVD is computed on the matrixA - 1c',in an implicit way without actually forming this matrix.center = TRUE has the same effect ascenter = colMeans(A). Default isFALSE.

scale

Either a logical value (TRUE/FALSE), or a numericvector of lengthn. If a vectors is supplied, thenSVD is computed on the matrix(A - 1c')S,wherec is the centering vector andS = diag(1/s).Ifscale = TRUE, then the vectors is computed asthe column norm ofA - 1c'.Default isFALSE.


Create an undirected factor model graph

Description

An undirected factor model graph is an undirectedgeneralized Poisson random dot product graph. The edgesin this graph are assumed to be independent and Poissondistributed. The graph is parameterized by its expectedadjacency matrix, which is⁠E[A|X] = X S X'⁠. We do not recommendthat casual users use this function, see insteaddcsbm()and related functions, which will formulate common variantsof the stochastic blockmodels as undirected factor modelswith lots of helpful input validation.

Usage

undirected_factor_model(  X,  S,  ...,  expected_degree = NULL,  expected_density = NULL,  poisson_edges = TRUE,  allow_self_loops = TRUE)

Arguments

X

Amatrix() orMatrix::Matrix() representing real-valuedlatent node positions. Entries must be positive.

S

Amatrix() orMatrix::Matrix() mixing matrix.S issymmetrized if it is not already, as this is the undirectedcase. Entries must be positive.

...

Ignored. Must be empty.

expected_degree

If specified, the desired expected degreeof the graph. Specifyingexpected_degree simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

expected_density

If specified, the desired expected densityof the graph. Specifyingexpected_density simply rescalesSto achieve this. Defaults toNULL. Do not specify bothexpected_degree andexpected_density at the same time.

poisson_edges

Logical indicating whether or notmultiple edges are allowed to form between a pair ofnodes. Defaults toTRUE. WhenFALSE, sampling proceedsas usual, and duplicate edges are removed afterwards. Further,whenFALSE, we assume thatS specifies a desired between-factorconnection probability, and back-transform thisS to theappropriate Poisson intensity parameter to approximate Bernoullifactor connection probabilities. See Section 2.3 of Rohe et al. (2017)for some additional details.

allow_self_loops

Logical indicating whether or notnodes should be allowed to form edges with themselves.Defaults toTRUE. WhenFALSE, sampling proceeds allowingself-loops, and these are then removed after the fact.

Value

Anundirected_factor_model S3 class based on a listwith the following elements:

Examples

n <- 100k <- 5X <- matrix(rpois(n = n * k, 1), nrow = n)S <- matrix(runif(n = k * k, 0, .1), nrow = k)ufm <- undirected_factor_model(X, S)ufmufm2 <- undirected_factor_model(X, S, expected_degree = 50)ufm2svds(ufm2)

[8]ページ先頭

©2009-2025 Movatter.jp