Introduction

WebGraph is a framework for graph compression aimed at studying web graphs. It provides simple ways to manage very large graphs, exploiting modern compression techniques. More precisely, it is currently made of:

  1. A set of flat codes, called ζ codes, which are particularly suitable for storing web graphs (or, in general, integers with power-law distribution in a certain exponent range). The fact that these codes work well can be easily tested empirically, but we also try to provide a detailed mathematical analysis.
  2. Algorithms for compressing web graphs that exploit gap compression and referentiation (à la LINK), intervalisation and ζ codes to provide a high compression ratio (see our datasets). The algorithms are controlled by several parameters, which provide different tradeoffs between access speed and compression ratio.
  3. Algorithms for accessing a compressed graph without actually decompressing it, using lazy techniques that delay the decompression until it is actually necessary.
  4. Algorithms for analysing very large graphs, such as HyperBall, which has been used to show that Facebook has just four degrees of separation.
  5. A complete, documented implementation of the algorithms above in Java and Rust, distributed under either the GNU Lesser General Public License 2.1+ or the Apache Software License 2.0. Besides a clearly defined API, we also provide several classes that modify (e.g., transpose) or recompress a graph, so that you can experiment with various settings.
  6. Datasets for large graphs. These are either gathered from public sources (such as WebBase), or produced by UbiCrawler and BUbiNG.
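
To make the ideas of gap compression and ζ codes in items 1 and 2 more concrete, here is a minimal sketch in Java. It is not the WebGraph implementation or the BVGraph format: the class name GapZetaSketch and all of its methods are hypothetical, it ignores referentiation and intervalisation, and it only computes code lengths instead of writing bits. It assumes the usual conventions: the first successor is stored as a signed difference from the node, mapped to a natural number; each later successor is stored as its difference from the previous one minus one; and a positive integer x in [2^(hk), 2^((h+1)k) - 1] is coded by ζ_k as h + 1 in unary followed by a minimal binary code of x - 2^(hk).

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the WebGraph API.
final class GapZetaSketch {

    /** Maps 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... so a signed gap can be coded. */
    public static int toNatural(int v) {
        return v >= 0 ? 2 * v : 2 * -v - 1;
    }

    /** Turns a strictly increasing successor list of a node into a list of small naturals. */
    public static int[] gaps(int node, int[] successors) {
        int[] g = new int[successors.length];
        for (int i = 0; i < successors.length; i++) {
            g[i] = i == 0
                ? toNatural(successors[0] - node)         // first gap is relative to the node
                : successors[i] - successors[i - 1] - 1;  // later gaps are at least 0
        }
        return g;
    }

    /**
     * Length in bits of the ζ_k code of a positive integer x.
     * Meant for small values only: shifts are not guarded against overflow.
     */
    public static int zetaLength(long x, int k) {
        if (x < 1 || k < 1) throw new IllegalArgumentException("x and k must be positive");
        int h = (63 - Long.numberOfLeadingZeros(x)) / k;   // x lies in [2^(hk), 2^((h+1)k) - 1]
        long left = 1L << (h * k);
        long z = (1L << ((h + 1) * k)) - left;             // size of that interval
        int s = 64 - Long.numberOfLeadingZeros(z - 1);     // ceil(log2(z))
        long threshold = (1L << s) - z;                    // offsets below this use s - 1 bits
        int binaryBits = (x - left) < threshold ? s - 1 : s;
        return (h + 1) + binaryBits;                       // unary part plus minimal binary part
    }

    /** Total ζ_k length of a gap list; each natural is shifted by one to make it positive. */
    public static int totalBits(int[] gaps, int k) {
        int bits = 0;
        for (int g : gaps) bits += zetaLength(g + 1L, k);
        return bits;
    }

    public static void main(String[] args) {
        int[] g = gaps(15, new int[] {13, 15, 16, 17, 18, 19, 23, 24, 203});
        System.out.println("gaps: " + Arrays.toString(g));
        for (int k = 1; k <= 4; k++) {
            System.out.println("zeta_" + k + " bits: " + totalBits(g, k));
        }
    }
}
```

With k = 1 the length formula reduces to that of the γ code. Trying several values of k on the same gap list gives a feel for how the parameter trades short codes for small gaps against short codes for large ones.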

In the end, with WebGraph you can access and analyse very large web graphs. Using WebGraph is as easy as installing a few jar files and downloading a dataset. This makes it very easy to study phenomena such as PageRank, the distribution of graph properties of the web graph, and so on.

You are welcome to use and improve WebGraph! If you find our software useful for your research, please cite this paper.

Hadoop

Helge Holzmann has developed an input format for Hadoop for graphs in BVGraph format.

WebGraph++

Jacob Ratkievicz has developed a C++ version of WebGraph that you might want to try. The library exposes a BVGraph as an object of the Boost Graph Library, so it is easily integrable with other code.

pyWebgraph

Massimo Santini has developed a front-end that interfaces Jython with WebGraph. It makes exploring small portions of very large graphs very easy and interactive.

WebGraph for MATLAB®

David Gleich has developed a MATLAB® package to access WebGraph-encoded data easily.

Download (Rust)

Documentation (Rust)

Download (standard) (Java ≥9)

Documentation (standard)

Download (big) (Java ≥9)

Documentation (big)

Papers
