
MAGB: A Comprehensive Benchmark for Multimodal Attributed Graphs

In many real-world scenarios, graph nodes are associated with multimodal attributes, such as texts and images, resulting in Multimodal Attributed Graphs (MAGs).

MAGB provides five datasets from e-commerce and social networks, and evaluates two major paradigms: GNN-as-Predictor and VLM-as-Predictor. The datasets are publicly available:

🤗 Hugging Face   |   📑 Paper

📖 Table of Contents

  • 📖 Introduction
  • 💻 Installation
  • 🚀 Usage
📖 Introduction

Multimodal attributed graphs (MAGs) incorporate multiple data types (e.g., text, images, numerical features) into graph structures, enabling more powerful learning and inference capabilities.

This benchmark provides:

  • Standardized datasets with multimodal attributes.
  • Feature extraction pipelines for different modalities.
  • Evaluation metrics to compare different models.
  • Baselines and benchmarks to accelerate research.


💻 Installation

Ensure you have the required dependencies installed before running the benchmark.

```bash
# Clone the repository
git clone https://github.com/sktsherlock/MAGB.git
cd MAGB
# Install dependencies
pip install -r requirements.txt
```

🚀 Usage

1. Download the datasets from MAGB. 👐

```bash
cd Data/
sudo apt-get update && sudo apt-get install git-lfs && git clone https://huggingface.co/datasets/Sherirto/MAGB
ls
```

Now you can see the Movies, Toys, Grocery, Reddit-S, and Reddit-M datasets under the `Data` folder.

Each dataset consists of several parts, including:

  • Graph Data (*.pt): Stores the graph structure, including adjacency information and node labels. It can be loaded using DGL (see the loading sketch after this list).
  • Node Textual Metadata (*.csv): Contains node textual descriptions, neighborhood relationships, and category labels.
  • Text, Image, and Multimodal Features (TextFeature/, ImageFeature/, MMFeature/): Pre-extracted embeddings from the MAGB paper for different modalities.
  • Raw Images (*.tar.gz): A compressed folder containing images named by node IDs. It needs to be extracted before use.
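
As a quick sanity check after downloading, the snippet below sketches how these files might be loaded. This is a minimal sketch, not the repo's own loader: the concrete file names (`MoviesGraph.pt`, `Movies.csv`, `Movies_text.npy`) are assumptions for illustration, and it presumes the `.pt` graph file is in DGL's serialized graph format readable by `dgl.load_graphs`.

```python
# Minimal loading sketch (file names below are assumed; adjust to the
# actual layout of the dataset you downloaded).
import dgl
import numpy as np
import pandas as pd

# Graph structure + node labels. dgl.load_graphs returns a
# (list of graphs, label dict) tuple; we take the first graph.
graphs, _ = dgl.load_graphs("Data/Movies/MoviesGraph.pt")
g = graphs[0]
print(g)  # prints node/edge counts and feature schemes

# Node labels are assumed to live in the node data dictionary.
labels = g.ndata["label"]

# Node textual metadata: one row per node (assumed CSV name).
metadata = pd.read_csv("Data/Movies/Movies.csv")

# Pre-extracted embeddings for one modality (assumed feature file name).
text_feat = np.load("Data/Movies/TextFeature/Movies_text.npy")
assert text_feat.shape[0] == g.num_nodes()
```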

Because the Reddit-M dataset is too large to host as a single archive, its image tarball is split into parts; use the following commands to reassemble and extract it.

```bash
cd MAGB/Data/
cat RedditMImages_parta RedditMImages_partb RedditMImages_partc > RedditMImages.tar.gz
tar -xvzf RedditMImages.tar.gz
```

2. GNN-as-Predictor
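
In this paradigm, a GNN consumes the pre-extracted multimodal features as node inputs and predicts node labels. Below is a minimal, hypothetical sketch of that setup using DGL's GraphSAGE; the layer sizes, optimizer settings, and train mask are illustrative assumptions, not the benchmark's actual configuration, so refer to the repo's own training scripts for the real pipeline.

```python
# Hypothetical GNN-as-Predictor sketch: a two-layer GraphSAGE classifier
# over pre-extracted node features. Hyperparameters are illustrative only.
import torch
import torch.nn.functional as F
from dgl.nn import SAGEConv


class SAGE(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, n_classes):
        super().__init__()
        self.conv1 = SAGEConv(in_dim, hidden_dim, aggregator_type="mean")
        self.conv2 = SAGEConv(hidden_dim, n_classes, aggregator_type="mean")

    def forward(self, g, x):
        h = F.relu(self.conv1(g, x))
        return self.conv2(g, h)


def train(g, feats, labels, train_mask, n_classes, epochs=100):
    """Full-graph training loop; g is a DGLGraph, feats are the
    pre-extracted modality features (e.g., text embeddings)."""
    model = SAGE(feats.shape[1], 256, n_classes)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)
    for _ in range(epochs):
        model.train()
        logits = model(g, feats)
        loss = F.cross_entropy(logits[train_mask], labels[train_mask])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```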
