Using sparta

A probability mass function can be represented by a multi-dimensional array. However, for high-dimensional distributions where each variable may have a large state space, computer memory quickly becomes a problem. For example, an \(80\)-dimensional random vector in which each variable has \(10\) levels leads to a state space with \(10^{80}\) cells. Such a distribution cannot be stored in a computer; in fact, \(10^{80}\) is one of the estimates of the number of atoms in the universe. However, if the array contains only a few non-zero values, we need only store these values along with information about their location, that is, a sparse representation of the table. Sparta was created for efficient multiplication and marginalization of sparse tables.
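The storage argument above can be made concrete with a little arithmetic. The sketch below is only an illustration: the number of non-zero cells `k` is a hypothetical figure, and the byte counts assume one 4-byte integer index per variable and one 8-byte double per stored value, not a measurement of any particular implementation.

```r
# Dense table from the text: 80 variables with 10 levels each.
n_vars   <- 80
n_levels <- 10
dense_cells <- n_levels^n_vars        # 1e80 cells, impossible to store

# Hypothetical sparse case: assume only k cells are non-zero.
k <- 1e6

# Assumed layout: per non-zero cell, one 4-byte integer index for each
# variable plus one 8-byte double for the value.
sparse_bytes     <- k * (n_vars * 4 + 8)
sparse_megabytes <- sparse_bytes / 1e6
sparse_megabytes                      # 328
```

Under these assumptions a million non-zero cells fit in a few hundred megabytes, whereas the dense table has no feasible representation at all.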

How to use sparta

library(sparta)

Consider two arrays f and g:

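# dn() builds a named list of level names, e.g. X = c("x1", "x2")
dn <- function(x) setNames(lapply(x, paste0, 1:2), toupper(x))
d  <- c(2, 2, 2)
f  <- array(c(5, 4, 0, 7, 0, 9, 0, 0), d, dn(c("x", "y", "z")))
g  <- array(c(7, 6, 0, 6, 0, 0, 9, 0), d, dn(c("y", "z", "w")))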

with flat layouts

ftable(f, row.vars = "X")
#>    Y y1    y2
#>    Z z1 z2 z1 z2
#> X
#> x1    5  0  0  0
#> x2    4  9  7  0
ftable(g, row.vars = "W")
#>    Y y1    y2
#>    Z z1 z2 z1 z2
#> W
#> w1    7  0  6  6
#> w2    0  9  0  0

We can convert these to their equivalent sparta versions as

sf <- as_sparta(f); sg <- as_sparta(g)

Printing the object by the default printing method yields

print.default(sf)
#>   [,1] [,2] [,3] [,4]
#> X    1    2    2    2
#> Y    1    1    2    1
#> Z    1    1    1    2
#> attr(,"vals")
#> [1] 5 4 7 9
#> attr(,"dim_names")
#> attr(,"dim_names")$X
#> [1] "x1" "x2"
#>
#> attr(,"dim_names")$Y
#> [1] "y1" "y2"
#>
#> attr(,"dim_names")$Z
#> [1] "z1" "z2"
#>
#> attr(,"class")
#> [1] "sparta" "matrix"
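The index matrix and value vector in this output can be reproduced by hand from the dense array. The following is a minimal base R sketch, not the package's own conversion routine; the names `nz`, `cells` and `vals` are chosen here for illustration only.

```r
f <- array(c(5, 4, 0, 7, 0, 9, 0, 0), c(2, 2, 2),
           list(X = c("x1", "x2"), Y = c("y1", "y2"), Z = c("z1", "z2")))

# Array indices of the non-zero cells: one row per cell.
nz <- which(f != 0, arr.ind = TRUE, useNames = FALSE)

# Transpose so that each column describes one non-zero cell.
cells <- t(nz)
rownames(cells) <- names(dimnames(f))

# The non-zero values, in the same (column-major) order as the cells.
vals <- f[f != 0]

cells
vals   # 5 4 7 9
```

The third column of `cells` is (2, 2, 1), so the third entry of `vals` is the value stored for (x2, y2, z1).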

Each column is a cell in the sparse matrix, and the vals attribute holds the corresponding values, which can be extracted with the vals function. The domain resides in the dim_names attribute, which can be extracted with the dim_names function. From the output, we see that (x2, y2, z1) has a value of \(7\). Using the sparta print method prettifies things:

print(sf)
#>   X Y Z val
#> 1 1 1 1   5
#> 2 2 1 1   4
#> 3 2 2 1   7
#> 4 2 1 2   9

where row \(i\) corresponds to column \(i\) in the sparse matrix. The product of sf and sg is computed with mult:

mfg <- mult(sf, sg); mfg
#>   X Y Z W val
#> 1 2 1 2 2  81
#> 2 2 2 1 1  42
#> 3 1 1 1 1  35
#> 4 2 1 1 1  28
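For comparison, the same product can be written out on the dense arrays. This base R sketch is a cross-check of what the product means, h(x, y, z, w) = f(x, y, z) g(y, z, w), and says nothing about how the package computes it; the name `h` is introduced here for illustration.

```r
dn <- function(x) setNames(lapply(x, paste0, 1:2), toupper(x))
f <- array(c(5, 4, 0, 7, 0, 9, 0, 0), c(2, 2, 2), dn(c("x", "y", "z")))
g <- array(c(7, 6, 0, 6, 0, 0, 9, 0), c(2, 2, 2), dn(c("y", "z", "w")))

# Dense result over (X, Y, Z, W): 16 cells, filled one by one.
h <- array(0, c(2, 2, 2, 2), dn(c("x", "y", "z", "w")))
for (x in 1:2) for (y in 1:2) for (z in 1:2) for (w in 1:2) {
  h[x, y, z, w] <- f[x, y, z] * g[y, z, w]
}

sum(h != 0)                  # 4 non-zero cells out of 16
h["x2", "y1", "z2", "w2"]    # 81
```

The dense version visits all 16 cells to find the four non-zero products that the sparse table lists directly.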

Converting sf into a conditional probability table (CPT) with conditioning variable Z:

sf_cpt <- as_cpt(sf, y = "Z"); sf_cpt
#>   X Y Z   val
#> 1 1 1 1 0.312
#> 2 2 1 1 0.250
#> 3 2 2 1 0.438
#> 4 2 1 2 1.000
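The values in this CPT follow from dividing each cell by the total of its Z level (16 for z1 and 9 for z2). A minimal dense sketch of that normalization in base R, with `z_totals` and `cpt` as illustrative names:

```r
f <- array(c(5, 4, 0, 7, 0, 9, 0, 0), c(2, 2, 2),
           list(X = c("x1", "x2"), Y = c("y1", "y2"), Z = c("z1", "z2")))

# Total mass for each level of the conditioning variable Z (dimension 3).
z_totals <- apply(f, 3, sum)

# Divide every (X, Y) slice by the total of its Z level.
cpt <- sweep(f, 3, z_totals, "/")

cpt["x1", "y1", "z1"]   # 5 / 16 = 0.3125
apply(cpt, 3, sum)      # each Z slice now sums to one
```

The printed sparse table shows the same numbers rounded to three digits.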

Slicing sf on X = x1 and dropping the X dimension

slice(sf, s = c(X = "x1"), drop = TRUE)
#>   Y Z val
#> 1 1 1   5

reduces sf to a single non-zero element, whereas the equivalent dense case would result in a (Y, Z) table with one non-zero element and three zero elements.
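The dense case mentioned here is easy to inspect in base R. The sketch below slices the dense array f on X = x1; `f_x1` is an illustrative name.

```r
f <- array(c(5, 4, 0, 7, 0, 9, 0, 0), c(2, 2, 2),
           list(X = c("x1", "x2"), Y = c("y1", "y2"), Z = c("z1", "z2")))

# Fix X at x1; the X dimension is dropped, leaving a 2 x 2 (Y, Z) table.
f_x1 <- f["x1", , ]

f_x1
sum(f_x1 != 0)   # 1 non-zero element
sum(f_x1 == 0)   # 3 zero elements
```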

Marginalizing (or summing) out Y in sg yields

marg(sg, y = c("Y"))
#>   Z W val
#> 1 2 2   9
#> 2 2 1   6
#> 3 1 1  13
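The same marginal can be checked against the dense array g by summing over its first dimension. This is a base R sketch for comparison only; `g_zw` is an illustrative name.

```r
g <- array(c(7, 6, 0, 6, 0, 0, 9, 0), c(2, 2, 2),
           list(Y = c("y1", "y2"), Z = c("z1", "z2"), W = c("w1", "w2")))

# Keep dimensions 2 (Z) and 3 (W); sum over dimension 1 (Y).
g_zw <- apply(g, c(2, 3), sum)

g_zw
g_zw["z1", "w1"]   # 7 + 6 = 13
```

The dense result also carries the zero cell (z1, w2), which the sparse table omits.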

Finally, a sparse table can also be created directly with the constructor sparta_struct, which is necessary when the corresponding dense table is too large to hold in memory.

Functionalities in sparta

Function name            Description
as_<sparta>              Convert array-like object to a sparta
as_<array/df/cpt>        Convert sparta object to an array/data.frame/CPT
sparta_struct            Constructor for sparta objects
mult, div, marg, slice   Multiply/divide/marginalize/slice
normalize                Normalize (the values of the result sum to one)
get_val                  Extract the value for a specific named cell
get_cell_name            Extract the named cell
get_values               Extract the values
dim_names                Extract the domain
names                    Extract the variable names
max/min                  The maximum/minimum value
which_<max/min>_cell     The column index referring to the max/min value
which_<max/min>_idx      The configuration corresponding to the max/min value
sum                      Sum the values
equiv                    Test if two tables are identical up to permutations of the columns
