- Notifications
You must be signed in to change notification settings - Fork2
Graph (network) embeddings evaluation framework via classification, gram martix construction for links prediction
License
eXascaleInfolab/GraphEmbEval
Folders and files
| Name | Name | Last commit message | Last commit date | |
|---|---|---|---|---|
Repository files navigation
Graph (network) embeddings evaluation sript via the classification, gram martix construction for links prediction. It has been designed for the comprehensive evaluation of theDAOR graph embedding framework.
This is a significantly modified and extended version of the Python scoring script from theDeepWalk. The extensions include classification using not only the linear regression but also various SVM/SVC kernels, some preprocessing and optimizations implemented using Cython.
Authors (in addition to the authors of the originalDeepWalk): (c) Artem Lutovartem@exascale.info, Dingqi Yang
The paper:
@inproceedings{Daor19,author={Artem Lutov and Dingqi Yang and Philippe Cudr{\'e}-Mauroux},title={Bridging the Gap between Community and Node Representations: Graph Embedding via Community Detection},year={2019},keywords={parameter-free graph embedding, unsupervisedlearning of network representation, automatic feature extraction,interpretable compact embeddings, scalable graph embedding},}$ ./install_reqs.sh$ ./build.sh$time python3 scoring_classif.py --embeddings embeds/blog.nvc -m cosine -o res/blog.reseval -s liblinear --num-shuffles 3 --network graphs/blog.mat
General Options:
$ python3 scoring_classif.py -husage: scoring_classif.py [-h] [-w] [--no-dissim] [--root-dims] [--dim-vmin DIM_VMIN] [-m METRIC] [-b] [-o OUTPUT] [--num-shuffles NUM_SHUFFLES] [-p] [--no-cython] {eval,gram,test} ...Network embedding evaluation using multi-lable classification.optional arguments: -h, --help show this help message and exit -w, --weighted-dims Apply dimension weights if specified (applicable only for the NVC format). (default: False) --no-dissim Omit dissimilarity weighting (if weights are specified at all). (default: False) --root-dims Use only root (top) level dimensions (clusers), actual only for the NVC format. (default: False) --dim-vmin DIM_VMIN Minimal dimension value to be processed before the weighting, [0, 1). (default: 0) -m METRIC, --metric METRIC Applied metric for the similarity matrics construction: cosine, jaccard, hamming. (default: cosine) -b, --binarize Binarize the embedding minimizing the Mean Square Error. NOTE: the median binarizaion is performed if the hamming metric is specified with this flag. (default: False) -o OUTPUT, --output OUTPUT A file name for the results. Default: ./<embeds>.res or ./gtam_<embeds>.mat. (default: None) --num-shuffles NUM_SHUFFLES Number of shuffles of the embedding matrix, >= 1. (default: 5) -p, --profile Profile the application execution. (default: False) --no-cython Disable optimized routines from the Cython libs. (default: False)Embedding processing modes: {eval,gram,test} eval Evaluate embedding. gram Produce Gram (network nodes similarity) matrix. test Run doc tests for all modules including "similarities".Evaluation Options:
$ python3 scoring_classif.py eval -husage: scoring_classif.py eval [-h] -e EMBEDDING -n NETWORK [--adj-matrix-name ADJ_MATRIX_NAME] [--label-matrix-name LABEL_MATRIX_NAME] [-s SOLVER] [-k KERNEL] [--balance-classes] [--all] [--num-shuffles NUM_SHUFFLES] [--accuracy-detailed ACCURACY_DETAILED]optional arguments: -h, --help show this help message and exit -e EMBEDDING, --embedding EMBEDDING File name of the embedding in .mat, .nvc or .csv/.ssv (text) format. -n NETWORK, --network NETWORK An input network (graph): a .mat file containing the adjacency matrix and node labels. --adj-matrix-name ADJ_MATRIX_NAME Variable name of the adjacency matrix inside the network .mat file. --label-matrix-name LABEL_MATRIX_NAME Variable name of the labels matrix inside the network .mat file. -s SOLVER, --solver SOLVER Linear Regression solver: liblinear (fastest), lbfgs (less accurate, slower, parallel). ATTENTION: has priority over the SVM kernel. -k KERNEL, --kernel KERNEL SVM kernel: precomputed (fast but requires gram/similarity matrix), rbf (accurate, slow), linear (slow). --balance-classes Balance (weight) the grouund-truth classes by their size. --all The embedding is evaluated on all training percents from 10 to 90 when this flag is set to true. By default, only training percents of 0.3, 0.5, 0.7 are used. --num-shuffles NUM_SHUFFLES Number of shuffles of the embedding matrix, >= 1. --accuracy-detailed ACCURACY_DETAILED Output also detailed accuracy evalaution results to ./acr_<evalres>.mat.- DAOR - Parameter-free Embedding Framework for Large Graphs (Networks) based onDAOC unsupervised and parameter-free community detection.
- NodeSketch - Highly-Efficient Graph Embeddings via Recursive Sketching
- HARP - Hierarchical Representation Learning for Networks
- NetHash - Efficient Attributed Network Embedding via Recursive Randomized Hashing
- Deepwalk - Online Deep Learning of Social Representations on Graphs
- Clubmark - A parallel isolation framework for benchmarking and profiling clustering (community detection) algorithms considering overlaps (covers), includes a dozen of clustering algorithms for large networks.
- PyExPool - multiprocess execution pool and load balancer, which provides [external] applications scheduling for the in-RAM execution on NUMA architecture with capabilities of the affinity control, CPU cache vs parallelization maximization, memory consumption and execution time constrains specification for the whole execution pool and per each executor process (called worker, executes a job).
Note: Please,star this project if you use it.
About
Graph (network) embeddings evaluation framework via classification, gram martix construction for links prediction
Topics
Resources
License
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Packages0
Uh oh!
There was an error while loading.Please reload this page.