Graph (network) embeddings evaluation framework via classification, Gram matrix construction for link prediction


eXascaleInfolab/GraphEmbEval


Graph (network) embeddings evaluation script via classification, with Gram matrix construction for link prediction. It has been designed for the comprehensive evaluation of the DAOR graph embedding framework.
This is a significantly modified and extended version of the Python scoring script from DeepWalk. The extensions include classification using not only logistic regression but also various SVM/SVC kernels, some preprocessing, and optimizations implemented using Cython.

Authors (in addition to the authors of the original DeepWalk): (c) Artem Lutov <artem@exascale.info>, Dingqi Yang

The paper:

@inproceedings{Daor19,
  author={Artem Lutov and Dingqi Yang and Philippe Cudr{\'e}-Mauroux},
  title={Bridging the Gap between Community and Node Representations: Graph Embedding via Community Detection},
  year={2019},
  keywords={parameter-free graph embedding, unsupervised learning of network representation, automatic feature extraction, interpretable compact embeddings, scalable graph embedding},
}

Deployment

$ ./install_reqs.sh
$ ./build.sh

Usage Examples:

$ time python3 scoring_classif.py -m cosine -o res/blog.reseval eval --embedding embeds/blog.nvc -s liblinear --num-shuffles 3 --network graphs/blog.mat
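
The help output under Options lists the global options before the processing mode (eval, gram or test) and the mode-specific options after it. A minimal argparse sketch of that layout is given below. It is a hypothetical stand-in that mirrors only a few flags from the help text, not the actual parser of scoring_classif.py, and `build_parser` is an illustrative name.

```python
import argparse


def build_parser():
    """Hypothetical mock-up of the CLI layout: global options, then a mode."""
    parser = argparse.ArgumentParser(prog="scoring_classif.py")
    # Global options: they must appear before the mode name.
    parser.add_argument("-m", "--metric", default="cosine",
                        choices=["cosine", "jaccard", "hamming"])
    parser.add_argument("-o", "--output", default=None)
    # Processing modes are modelled as argparse subcommands.
    modes = parser.add_subparsers(dest="mode", required=True)
    evaluate = modes.add_parser("eval")
    # Mode-specific options: they must appear after the mode name.
    evaluate.add_argument("-e", "--embedding", required=True)
    evaluate.add_argument("-n", "--network", required=True)
    evaluate.add_argument("-s", "--solver")
    evaluate.add_argument("--num-shuffles", type=int)
    return parser


args = build_parser().parse_args([
    "-m", "cosine", "-o", "res/blog.reseval",
    "eval", "--embedding", "embeds/blog.nvc", "-s", "liblinear",
    "--num-shuffles", "3", "--network", "graphs/blog.mat",
])
print(args.mode, args.metric, args.embedding, args.num_shuffles)
```

With this layout, argparse rejects a mode-specific option such as `--network` if it is given before the mode name.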

Options

General Options:

$ python3 scoring_classif.py -h
usage: scoring_classif.py [-h] [-w] [--no-dissim] [--root-dims]
                          [--dim-vmin DIM_VMIN] [-m METRIC] [-b] [-o OUTPUT]
                          [--num-shuffles NUM_SHUFFLES] [-p] [--no-cython]
                          {eval,gram,test} ...

Network embedding evaluation using multi-lable classification.

optional arguments:
  -h, --help            show this help message and exit
  -w, --weighted-dims   Apply dimension weights if specified (applicable only
                        for the NVC format). (default: False)
  --no-dissim           Omit dissimilarity weighting (if weights are specified
                        at all). (default: False)
  --root-dims           Use only root (top) level dimensions (clusers), actual
                        only for the NVC format. (default: False)
  --dim-vmin DIM_VMIN   Minimal dimension value to be processed before the
                        weighting, [0, 1). (default: 0)
  -m METRIC, --metric METRIC
                        Applied metric for the similarity matrics
                        construction: cosine, jaccard, hamming. (default:
                        cosine)
  -b, --binarize        Binarize the embedding minimizing the Mean Square
                        Error. NOTE: the median binarizaion is performed if
                        the hamming metric is specified with this flag.
                        (default: False)
  -o OUTPUT, --output OUTPUT
                        A file name for the results. Default: ./<embeds>.res
                        or ./gtam_<embeds>.mat. (default: None)
  --num-shuffles NUM_SHUFFLES
                        Number of shuffles of the embedding matrix, >= 1.
                        (default: 5)
  -p, --profile         Profile the application execution. (default: False)
  --no-cython           Disable optimized routines from the Cython libs.
                        (default: False)

Embedding processing modes:
  {eval,gram,test}
    eval                Evaluate embedding.
    gram                Produce Gram (network nodes similarity) matrix.
    test                Run doc tests for all modules including
                        "similarities".
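
To illustrate what the metric and binarization options refer to, the sketch below builds a Gram (node similarity) matrix from a toy embedding with textbook definitions of the three metrics. The exact formulas used by the script are not stated here, so everything in the block is an assumption: the weighted (min/max) form of Jaccard, Hamming expressed as a similarity, per-dimension median thresholding, and all function names. The default MSE-minimizing binarization is not shown.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity; defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def jaccard(u, v):
    """Weighted Jaccard (assumed form): sum of minima over sum of maxima.

    Assumes non-negative values.
    """
    den = sum(max(a, b) for a, b in zip(u, v))
    return sum(min(a, b) for a, b in zip(u, v)) / den if den else 0.0


def hamming(u, v):
    """Hamming similarity (assumed form): fraction of equal positions."""
    return sum(a == b for a, b in zip(u, v)) / len(u)


def binarize_by_median(embeds):
    """Set a value to 1 if it exceeds the median of its dimension, else 0."""
    thresholds = [median(column) for column in zip(*embeds)]
    return [[1 if x > t else 0 for x, t in zip(row, thresholds)]
            for row in embeds]


def gram_matrix(embeds, metric):
    """Pairwise similarity of all rows (nodes) under the given metric."""
    return [[metric(u, v) for v in embeds] for u in embeds]


# Toy embedding: 3 nodes, 3 dimensions.
embeds = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.7, 0.6]]
gram_cos = gram_matrix(embeds, cosine)
gram_jac = gram_matrix(embeds, jaccard)
gram_ham = gram_matrix(binarize_by_median(embeds), hamming)
```

Each resulting matrix is symmetric with ones on the diagonal, which is the shape a precomputed SVM kernel expects.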

Evaluation Options:

$ python3 scoring_classif.py eval -h
usage: scoring_classif.py eval [-h] -e EMBEDDING -n NETWORK
                               [--adj-matrix-name ADJ_MATRIX_NAME]
                               [--label-matrix-name LABEL_MATRIX_NAME]
                               [-s SOLVER] [-k KERNEL] [--balance-classes]
                               [--all] [--num-shuffles NUM_SHUFFLES]
                               [--accuracy-detailed ACCURACY_DETAILED]

optional arguments:
  -h, --help            show this help message and exit
  -e EMBEDDING, --embedding EMBEDDING
                        File name of the embedding in .mat, .nvc or .csv/.ssv
                        (text) format.
  -n NETWORK, --network NETWORK
                        An input network (graph): a .mat file containing the
                        adjacency matrix and node labels.
  --adj-matrix-name ADJ_MATRIX_NAME
                        Variable name of the adjacency matrix inside the
                        network .mat file.
  --label-matrix-name LABEL_MATRIX_NAME
                        Variable name of the labels matrix inside the network
                        .mat file.
  -s SOLVER, --solver SOLVER
                        Linear Regression solver: liblinear (fastest), lbfgs
                        (less accurate, slower, parallel). ATTENTION: has
                        priority over the SVM kernel.
  -k KERNEL, --kernel KERNEL
                        SVM kernel: precomputed (fast but requires
                        gram/similarity matrix), rbf (accurate, slow), linear
                        (slow).
  --balance-classes     Balance (weight) the grouund-truth classes by their
                        size.
  --all                 The embedding is evaluated on all training percents
                        from 10 to 90 when this flag is set to true. By
                        default, only training percents of 0.3, 0.5, 0.7 are
                        used.
  --num-shuffles NUM_SHUFFLES
                        Number of shuffles of the embedding matrix, >= 1.
  --accuracy-detailed ACCURACY_DETAILED
                        Output also detailed accuracy evalaution results to
                        ./acr_<evalres>.mat.
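
The interplay of --num-shuffles, --all and the precomputed kernel can be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: the 10% step for --all, splitting by a rounded node count, and the names `training_fractions`, `shuffled_splits` and `slice_precomputed` are all hypothetical. The slicing step follows the usual convention of SVM implementations that accept a precomputed kernel (for example scikit-learn's SVC): fitting takes the train x train block, prediction takes the test x train block.

```python
import random


def training_fractions(evaluate_all=False):
    """Default fractions from the help text, or 0.1..0.9 (assumed 0.1 step)."""
    if evaluate_all:
        return [i / 10 for i in range(1, 10)]
    return [0.3, 0.5, 0.7]


def shuffled_splits(num_nodes, num_shuffles, fractions, seed=0):
    """Yield (shuffle id, fraction, train ids, test ids) for every split."""
    rng = random.Random(seed)
    for shuffle_id in range(num_shuffles):
        order = list(range(num_nodes))
        rng.shuffle(order)  # one random node order per shuffle
        for fraction in fractions:
            cut = round(fraction * num_nodes)
            yield shuffle_id, fraction, order[:cut], order[cut:]


def slice_precomputed(gram, train, test):
    """Cut a full Gram matrix into the blocks a precomputed kernel needs."""
    k_train = [[gram[i][j] for j in train] for i in train]  # train x train
    k_test = [[gram[i][j] for j in train] for i in test]    # test x train
    return k_train, k_test


splits = list(shuffled_splits(10, 3, training_fractions()))
for shuffle_id, fraction, train, test in splits[:3]:
    print(shuffle_id, fraction, len(train), len(test))
```

With 3 shuffles and the 3 default fractions this yields 9 train/test splits; a classifier would be fitted and scored once per split and the scores averaged per fraction.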

Related Projects

  • DAOR - Parameter-free Embedding Framework for Large Graphs (Networks) based on DAOC unsupervised and parameter-free community detection.
  • NodeSketch - Highly-Efficient Graph Embeddings via Recursive Sketching
  • HARP - Hierarchical Representation Learning for Networks
  • NetHash - Efficient Attributed Network Embedding via Recursive Randomized Hashing
  • DeepWalk - Online Deep Learning of Social Representations on Graphs
  • Clubmark - A parallel isolation framework for benchmarking and profiling clustering (community detection) algorithms considering overlaps (covers); includes a dozen clustering algorithms for large networks.
  • PyExPool - multiprocess execution pool and load balancer, which provides [external] application scheduling for in-RAM execution on NUMA architectures with affinity control, CPU cache vs parallelization maximization, and memory consumption and execution time constraints specified for the whole execution pool and for each executor process (called a worker, which executes a job).

Note: Please star this project if you use it.
