Google matrix

From Wikipedia, the free encyclopedia
Stochastic matrix representing links between entities
Fig.1. Google matrix of the Wikipedia article network, written in the basis of the PageRank index; a fragment of the top 200×200 matrix elements is shown, total size N=3282257 (from [1])

A Google matrix is a particular stochastic matrix that is used by Google's PageRank algorithm. The matrix represents a graph with edges representing links between pages. The PageRank of each page can then be generated iteratively from the Google matrix using the power method. However, for the power method to converge, the matrix must be stochastic, irreducible and aperiodic.

Adjacency matrix A and Markov matrix S


In order to generate the Google matrix G, we must first generate an adjacency matrix A which represents the relations between pages or nodes.

Assuming there are N pages, we can fill out A by doing the following:

  1. A matrix element A_{i,j} is filled with 1 if node j has a link to node i, and 0 otherwise; this is the adjacency matrix of links.
  2. A related matrix S, corresponding to the transitions of a Markov chain on the given network, is constructed from A by dividing the elements of column j by k_j = Σ_{i=1}^{N} A_{i,j}, where k_j is the total number of outgoing links from node j to all other nodes. Columns with only zero matrix elements, corresponding to dangling nodes, are replaced by the constant value 1/N. Such a procedure adds a link from every sink (dangling) node to every other node.
  3. By construction the sum of all elements in any column of the matrix S is equal to unity. In this way the matrix S is mathematically well defined: it belongs to the class of Markov chains and the class of Perron–Frobenius operators. That makes S suitable for the PageRank algorithm.
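The three steps above can be sketched in a few lines of NumPy. The 5-page network below is a hypothetical example chosen for illustration (it is not from the article); the last column is an all-zero dangling node.

```python
import numpy as np

# Hypothetical 5-page network: A[i, j] = 1 if page j links to page i.
A = np.array([
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],  # column 4 has no outgoing links: a dangling node
], dtype=float)

def markov_matrix(A):
    """Build S from A: divide column j by its out-degree k_j = sum_i A[i, j];
    replace all-zero (dangling) columns with the uniform value 1/N."""
    N = A.shape[0]
    k = A.sum(axis=0)                          # out-degree of each node
    S = np.where(k > 0, A / np.where(k > 0, k, 1), 1.0 / N)
    return S

S = markov_matrix(A)
print(S.sum(axis=0))   # every column sums to 1, as step 3 requires
```

The `np.where` on the out-degree vector broadcasts column-wise, so dangling columns become 1/N while the rest are normalized by their own out-degree.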

Construction of the Google matrix G

Fig.2. Google matrix of the Cambridge University network (2006); coarse-grained matrix elements are written in the basis of the PageRank index, total size N=212710 (from [1])

Then the final Google matrix G can be expressed via S as:

G_{ij} = α S_{ij} + (1 − α)/N    (1)

By construction, the sum of all non-negative elements inside each matrix column is equal to unity. The numerical coefficient α is known as the damping factor.

Usually S is a sparse matrix, and for modern directed networks it has only about ten nonzero elements per row or column; thus only about 10N multiplications are needed to multiply a vector by the matrix G.[2][3]
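This sparsity can be exploited because G itself is dense (the (1 − α)/N term fills every entry): one never forms G explicitly, but applies Eq. (1) on the fly as G·v = α(S·v) + (1 − α)·(Σ_j v_j)/N. A minimal NumPy sketch, checked against the dense G on a small hypothetical S:

```python
import numpy as np

alpha = 0.85  # damping factor (the value the article attributes to Google)

def google_matvec(S, v, alpha=0.85):
    """Multiply v by G = alpha*S + (1-alpha)/N without forming dense G.
    For a sparse S this costs O(nnz(S)) rather than O(N^2)."""
    N = S.shape[0]
    return alpha * (S @ v) + (1.0 - alpha) * v.sum() / N

# Tiny hypothetical column-stochastic S for a dense cross-check:
S = np.array([[0.0, 0.5, 1/3],
              [0.5, 0.0, 1/3],
              [0.5, 0.5, 1/3]])
N = S.shape[0]
G = alpha * S + (1.0 - alpha) / N      # dense G, for verification only
v = np.array([0.2, 0.3, 0.5])
print(np.allclose(google_matvec(S, v), G @ v))
```

In production code S would be stored in a sparse format (e.g. compressed sparse column), but the identity being used is the same.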

Examples of Google matrix


An example of the matrix S construction via Eq.(1) within a simple network is given in the article CheiRank.

For the actual matrix, Google uses a damping factor α around 0.85.[2][3][4] The term (1 − α) gives the probability for a surfer to jump randomly to any page. The matrix G belongs to the class of Perron–Frobenius operators of Markov chains.[2] Examples of the Google matrix structure are shown in Fig.1 for the Wikipedia article hyperlink network in 2009 at small scale and in Fig.2 for the University of Cambridge network in 2006 at large scale.

Spectrum and eigenstates of the G matrix

Fig.3. The spectrum of eigenvalues of the Google matrix of the University of Cambridge from Fig.2 at α=1; blue points show eigenvalues of isolated subspaces, red points show eigenvalues of the core component (from [5])

For 0 < α < 1 there is only one maximal eigenvalue λ = 1, whose right eigenvector has non-negative elements P_i that can be viewed as a stationary probability distribution.[2] These probabilities, ordered by decreasing value, give the PageRank vector P_i with the PageRank index K_i used by Google search to rank webpages. For the World Wide Web one usually has P ∝ 1/K^β with β ≈ 0.9. The number of nodes with a given PageRank value scales as N_P ∝ 1/P^ν with the exponent ν = 1 + 1/β ≈ 2.1.[6][7] The left eigenvector at λ = 1 has constant matrix elements. For 0 < α < 1 all eigenvalues move as λ_i → α λ_i except the maximal eigenvalue λ = 1, which remains unchanged.[2] The PageRank vector varies with α, but the other eigenvectors with λ_i < 1 remain unchanged due to their orthogonality to the constant left vector at λ = 1. The gap of 1 − α ≈ 0.15 between λ = 1 and the other eigenvalues gives rapid convergence of a random initial vector to the PageRank after approximately 50 multiplications by the G matrix.
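The convergence argument above can be sketched directly as a power iteration: since the subdominant eigenvalues are bounded by α, the error shrinks roughly like α^t, and 0.85^50 ≈ 3×10⁻⁴. The small 3-node S below is a hypothetical example.

```python
import numpy as np

def pagerank_power(S, alpha=0.85, iters=50):
    """Power iteration on G = alpha*S + (1-alpha)/N, starting from the
    uniform vector. Because the subdominant eigenvalues of G are at most
    alpha, ~50 iterations suffice (0.85**50 is about 3e-4)."""
    N = S.shape[0]
    P = np.full(N, 1.0 / N)
    for _ in range(iters):
        # Since sum(P) = 1 and columns of S sum to 1, P stays normalized.
        P = alpha * (S @ P) + (1.0 - alpha) / N
    return P

S = np.array([[0.0, 0.5, 1/3],
              [0.5, 0.0, 1/3],
              [0.5, 0.5, 1/3]])
P = pagerank_power(S)
K = np.argsort(-P)   # PageRank index: nodes ordered by decreasing P
print(P, K)
```

Note that the iteration never renormalizes P: with a column-stochastic S and a probability vector as input, the update preserves Σ_i P_i = 1 exactly.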

Fig.4. Distribution of eigenvalues λ_i of Google matrices in the complex plane at α=1 for dictionary networks: Roget (A, N=1022), ODLIS (B, N=2909) and FOLDOC (C, N=13356); and UK university WWW networks: University of Wales (Cardiff) (D, N=2778), Birmingham City University (E, N=10631), Keele University (Staffordshire) (F, N=11437), Nottingham Trent University (G, N=12660), Liverpool John Moores University (H, N=13578) (data for universities are for 2002) (from [8])

At α = 1 the matrix G generally has many degenerate eigenvalues λ = 1 (see e.g.[6][8]). Examples of the eigenvalue spectrum of the Google matrix of various directed networks are shown in Fig.3 from[5] and Fig.4 from.[8]

The Google matrix can also be constructed for the Ulam networks generated by the Ulam method [8] for dynamical maps. The spectral properties of such matrices are discussed in [9,10,11,12,13,15].[5][9] In a number of cases the spectrum is described by the fractal Weyl law [10,12].

Fig.5. Distribution of eigenvalues λ in the complex plane for the Google matrix G of the Linux Kernel version 2.6.32 with matrix size N=285509 at α=0.85; the unit circle is shown by the solid curve (from [9])
Fig.6. Coarse-grained probability distribution for the eigenstates of the Google matrix for Linux Kernel version 2.6.32. The horizontal lines show the first 64 eigenvectors ordered vertically by |λ_i| (from [9])

The Google matrix can also be constructed for other directed networks, e.g. for the procedure call network of the Linux Kernel software introduced in [15]. In this case the spectrum of λ is described by the fractal Weyl law with the fractal dimension d ≈ 1.3 (see Fig.5 from[9]). Numerical analysis shows that the eigenstates of the matrix G are localized (see Fig.6 from[9]). The Arnoldi iteration method makes it possible to compute many eigenvalues and eigenvectors for matrices of rather large size [13].[5][9]

Other examples of the G matrix include the Google matrix of the brain [17] and of business process management [18]; see also.[1] Applications of Google matrix analysis to DNA sequences are described in [20]. Such a Google matrix approach also allows one to analyze the entanglement of cultures via the ranking of multilingual Wikipedia articles about persons [21].

Historical notes


The Google matrix with damping factor was described by Sergey Brin and Larry Page in 1998 [22]; see also the articles on PageRank history [23], [24].


References

  1. Ermann, L.; Chepelianskii, A. D.; Shepelyansky, D. L. (2011). "Towards two-dimensional search engines". Journal of Physics A. 45 (27) 275101. arXiv:1106.6215. Bibcode:2012JPhA...45A5101E. doi:10.1088/1751-8113/45/27/275101. S2CID 14827486.
  2. Langville, Amy N.; Meyer, Carl (2006). Google's PageRank and Beyond. Princeton University Press. ISBN 978-0-691-12202-1.
  3. Austin, David (2008). "How Google Finds Your Needle in the Web's Haystack". AMS Feature Columns. Archived from the original on 2018-01-11. Retrieved 2011-01-08.
  4. Law, Edith (2008-10-09). "PageRank Lecture 12" (PDF).
  5. Frahm, K. M.; Georgeot, B.; Shepelyansky, D. L. (2011-11-01). "Universal emergence of PageRank". Journal of Physics A. 44 (46) 465101. arXiv:1105.1062. Bibcode:2011JPhA...44T5101F. doi:10.1088/1751-8113/44/46/465101. S2CID 16292743.
  6. Donato, Debora; Laura, Luigi; Leonardi, Stefano; Millozzi, Stefano (2004-03-30). "Large scale properties of the Webgraph" (PDF). European Physical Journal B. 38 (2): 239–243. Bibcode:2004EPJB...38..239D. CiteSeerX 10.1.1.104.2136. doi:10.1140/epjb/e2004-00056-6. S2CID 10640375.
  7. Pandurangan, Gopal; Ranghavan, Prabhakar; Upfal, Eli (2005). "Using PageRank to Characterize Web Structure" (PDF). Internet Mathematics. 3 (1): 1–20. doi:10.1080/15427951.2006.10129114. S2CID 101281.
  8. Georgeot, Bertrand; Giraud, Olivier; Shepelyansky, Dima L. (2010-05-25). "Spectral properties of the Google matrix of the World Wide Web and other directed networks". Physical Review E. 81 (5) 056109. arXiv:1002.3342. Bibcode:2010PhRvE..81e6109G. doi:10.1103/PhysRevE.81.056109. PMID 20866299. S2CID 14490804.
  9. Ermann, L.; Chepelianskii, A. D.; Shepelyansky, D. L. (2011). "Fractal Weyl law for Linux Kernel Architecture". European Physical Journal B. 79 (1): 115–120. arXiv:1005.1395. Bibcode:2011EPJB...79..115E. doi:10.1140/epjb/e2010-10774-7. S2CID 445348.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Google_matrix&oldid=1339151507"