MAGB: A Comprehensive Benchmark for Multimodal Attributed Graphs
In many real-world scenarios, graph nodes are associated with multimodal attributes, such as texts and images, resulting in Multimodal Attributed Graphs (MAGs).
MAGB provides five datasets from e-commerce and social networks. We evaluate two major paradigms: GNN-as-Predictor and VLM-as-Predictor. The datasets are publicly available:
🤗Hugging Face | 📑Paper
Multimodal attributed graphs (MAGs) incorporate multiple data types (e.g., text, images, numerical features) into graph structures, enabling more powerful learning and inference capabilities.
This benchmark provides:
✅Standardized datasets with multimodal attributes.
✅Feature extraction pipelines for different modalities.
✅Evaluation metrics to compare different models.
✅Baselines and benchmarks to accelerate research.
Ensure you have the required dependencies installed before running the benchmark.

    # Clone the repository
    git clone https://github.com/sktsherlock/MAGB.git
    cd MAGB
    # Install dependencies
    pip install -r requirements.txt

1. Download the datasets from MAGB. 👐

    cd Data/
    sudo apt-get update && sudo apt-get install git-lfs && git clone https://huggingface.co/datasets/Sherirto/MAGB .
    ls

Now you can see the Movies, Toys, Grocery, Reddit-S, and Reddit-M datasets under the `Data` folder.
Each dataset consists of several parts shown in the image below, including:
- Graph Data (*.pt): Stores the graph structure, including adjacency information and node labels. It can be loaded using DGL.
- Node Textual Metadata (*.csv): Contains node textual descriptions, neighborhood relationships, and category labels.
- Text, Image, and Multimodal Features (TextFeature/, ImageFeature/, MMFeature/): Pre-extracted embeddings from the MAGB paper for different modalities.
- Raw Images (*.tar.gz): A compressed folder containing images named by node IDs. It needs to be extracted before use.
Because the Reddit-M dataset is large, its image archive is split into several parts. Use the commands below to merge the parts and extract the images.

    cd MAGB/Data/
    cat RedditMImages_parta RedditMImages_partb RedditMImages_partc > RedditMImages.tar.gz
    tar -xvzf RedditMImages.tar.gz

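On systems without `cat` and `tar`, the same merge-and-extract step can be sketched in Python using only the standard library. The function name `merge_and_extract` and its arguments are illustrative and not part of MAGB; the part names passed to it should be the ones listed above.

```python
import shutil
import tarfile
from pathlib import Path


def merge_and_extract(parts, archive_path, out_dir):
    """Concatenate split archive parts in order, then extract the result.

    Returns the sorted names of the extracted files.
    """
    archive_path = Path(archive_path)
    out_dir = Path(out_dir)

    # Equivalent of `cat part_a part_b part_c > archive.tar.gz`.
    with open(archive_path, "wb") as merged:
        for part in parts:
            with open(part, "rb") as chunk:
                shutil.copyfileobj(chunk, merged)

    # Equivalent of `tar -xzf archive.tar.gz -C out_dir`.
    out_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(out_dir)

    return sorted(p.name for p in out_dir.rglob("*") if p.is_file())
```

For Reddit-M, the call would look like `merge_and_extract(["RedditMImages_parta", "RedditMImages_partb", "RedditMImages_partc"], "RedditMImages.tar.gz", ".")`, run from the `Data` folder. The parts must be passed in alphabetical order, because the archive is only valid when the bytes are joined in the original sequence.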
In this section, we demonstrate the execution code for both GNN-as-Predictor and VLM-as-Predictor.
In the `GNN/Library` directory, we provide the code for the models evaluated in the paper, including `GCN`, `GraphSAGE`, `GAT`, `RevGAT`, and `MLP`. Additionally, we have added graph learning models such as `APPNP`, `SGC`, `Node2Vec`, and `DeepWalk` for your use. Below, we show the code for node classification using `GCN` on the Movies dataset in two scenarios: 3-shot learning and supervised learning.

    python GNN/Library/GCN.py --graph_path Data/Movies/MoviesGraph.pt --feature Data/Movies/TextFeature/Movies_roberta_base_512_mean.npy --fewshots 3

    python GNN/Library/GCN.py --graph_path Data/Movies/MoviesGraph.pt --feature Data/Movies/TextFeature/Movies_roberta_base_512_mean.npy --train_ratio 0.6 --val_ratio 0.2

Note: The file `Movies_roberta_base_512_mean.npy` contains the textual features of the Movies dataset extracted using the RoBERTa-Base model. `512` indicates the maximum text length used, and `mean` indicates that mean pooling was applied to extract the features. You can use the features we provide or extract your own.
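The naming convention described in the note can be decoded programmatically, which is handy when looping over many feature files. The sketch below assumes every feature file follows the pattern `<dataset>_<encoder>_<max length>_<pooling>.npy` and that the dataset name contains no underscore; `FeatureSpec` and `parse_feature_name` are hypothetical helpers, not part of the repository.

```python
from pathlib import Path
from typing import NamedTuple


class FeatureSpec(NamedTuple):
    dataset: str     # e.g. "Movies"
    encoder: str     # e.g. "roberta_base"
    max_length: int  # maximum text length used at extraction time
    pooling: str     # e.g. "mean"


def parse_feature_name(path: str) -> FeatureSpec:
    """Split a feature file name into its four naming components."""
    stem = Path(path).stem  # drop the directory and the ".npy" suffix
    # The dataset name is the first field; the encoder name may itself
    # contain underscores, so peel the last two fields off from the right.
    dataset, rest = stem.split("_", 1)
    encoder, max_length, pooling = rest.rsplit("_", 2)
    return FeatureSpec(dataset, encoder, int(max_length), pooling)
```

Splitting from the right keeps encoder names such as `Llama_3.2_1B_Instruct` intact, even though they contain underscores and dots.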
Similarly, you can replace `GCN.py` with the corresponding code for other models, such as `GraphSAGE.py`, `GAT.py`, etc. All node classification training scripts require the graph data path and the corresponding feature file. Other basic parameters can be found in the `GNN/Utils/model_config.py` file.
Below are the key parameters related to model training, along with their default values and descriptions:
Parameter | Type | Default Value | Description |
---|---|---|---|
--n-runs | int | 3 | Number of runs for averaging results. |
--lr | float | 0.005 | Learning rate for model optimization. |
--n-epochs | int | 1000 | Total number of training epochs. |
--n-layers | int | 3 | Number of layers in the model. |
--n-hidden | int | 256 | Number of hidden units per layer. |
--dropout | float | 0.5 | Dropout rate to prevent overfitting. |
--label-smoothing | float | 0.1 | Smoothing factor for label smoothing to reduce overfitting. |
--train_ratio | float | 0.6 | Proportion of the dataset used for training. |
--val_ratio | float | 0.2 | Proportion of the dataset used for validation. |
--fewshots | int | None | Number of samples for few-shot learning. |
--metric | str | 'accuracy' | Evaluation metric (e.g., accuracy, precision, recall, f1). |
--average | str | 'macro' | Averaging method (e.g., weighted, micro, macro). |
--graph_path | str | None | Path to the graph dataset file (e.g., a `.pt` file). |
--feature | str | None | Specifies the unimodal feature embedding to use as input. |
--undirected | bool | True | Whether to treat the graph as undirected. |
--selfloop | bool | True | Whether to add self-loops to the graph. |
Note: Some models may have their own unique parameters, such as `edge-drop` for `RevGAT` and `GAT`. For these parameters, please refer to the respective code for details.
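To make the interaction between `--train_ratio`, `--val_ratio`, and `--fewshots` concrete, here is a minimal sketch of one way such splits can be built. It assumes that few-shot mode picks `fewshots` training nodes per class and that the remaining nodes are divided into validation and test sets; the actual split logic in the repository may differ, and `split_nodes` is a hypothetical name.

```python
import random
from collections import defaultdict


def split_nodes(labels, train_ratio=0.6, val_ratio=0.2, fewshots=None, seed=0):
    """Return (train, val, test) lists of node indices.

    labels[i] is the class of node i. When `fewshots` is set, the training
    set holds that many nodes per class instead of a fixed ratio.
    """
    rng = random.Random(seed)
    nodes = list(range(len(labels)))
    rng.shuffle(nodes)
    n_val = int(len(nodes) * val_ratio)

    if fewshots is None:
        # Supervised setting: contiguous slices of the shuffled node list.
        n_train = int(len(nodes) * train_ratio)
        return (nodes[:n_train],
                nodes[n_train:n_train + n_val],
                nodes[n_train + n_val:])

    # Few-shot setting: take the first `fewshots` shuffled nodes of each class.
    per_class = defaultdict(list)
    for node in nodes:
        per_class[labels[node]].append(node)
    train = [n for members in per_class.values() for n in members[:fewshots]]
    chosen = set(train)
    rest = [n for n in nodes if n not in chosen]
    return train, rest[:n_val], rest[n_val:]
```

With the default ratios, whatever is left after the training and validation slices (here 20% of the nodes) becomes the test set.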
In the `GNN/LinkPrediction` directory, we provide the code for link prediction experiments using three backbone models: `GCN`, `GraphSAGE`, and `MLP`. Below, we demonstrate the code for running link prediction using `GCN` on the `Movies` dataset. The parameters for `GraphSAGE` and `MLP` are similar, and you can replace `GCN.py` with `SAGE.py` or `MLP.py` to run experiments with those models.

    python GNN/LinkPrediction/GCN.py \
        --n-hidden 256 \
        --n-layers 3 \
        --n-runs 5 \
        --lr 0.001 \
        --neg_len 5000 \
        --dropout 0.2 \
        --batch_size 2048 \
        --graph_path Data/Movies/MoviesGraph.pt \
        --feature Data/Movies/TextFeature/Movies_Llama_3.2_1B_Instruct_512_mean.npy \
        --link_path Data/LinkPrediction/Movies/

Below are the unique parameters specifically used for link prediction tasks:
Parameter | Type | Default Value | Description |
---|---|---|---|
--neg_len | int | 5000 | Number of negative samples used for training. |
--batch_size | int | 2048 | Batch size for training. |
--link_path | str | None | Path to the directory containing link prediction data (e.g., positive and negative edges). |
These parameters are critical for handling the unique requirements of link prediction tasks, such as generating and managing negative samples, processing large datasets efficiently, and specifying the location of link prediction data.
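For readers unfamiliar with negative sampling, the sketch below shows one common way to draw `neg_len` node pairs that are not connected in the graph. It is an illustration under stated assumptions (uniform sampling, no self-loops, undirected edges by default) and is not taken from the MAGB code, which reads its link prediction data from `--link_path`. The function name `sample_negative_edges` is hypothetical.

```python
import random


def sample_negative_edges(num_nodes, positive_edges, neg_len, seed=0, undirected=True):
    """Uniformly sample `neg_len` distinct node pairs that are not positive edges."""
    rng = random.Random(seed)

    def key(u, v):
        # For undirected graphs, (u, v) and (v, u) denote the same edge.
        return (min(u, v), max(u, v)) if undirected else (u, v)

    positives = {key(u, v) for u, v in positive_edges}
    max_pairs = num_nodes * (num_nodes - 1) // (2 if undirected else 1)
    if neg_len > max_pairs - len(positives):
        raise ValueError("not enough non-edges to sample from")

    negatives = set()
    while len(negatives) < neg_len:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u == v:
            continue  # skip self-loops
        pair = key(u, v)
        if pair not in positives:
            negatives.add(pair)
    return sorted(negatives)
```

Rejection sampling like this is efficient when the graph is sparse, which is the usual case: almost every random pair is a valid negative.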
The `MLLM/Zero-shot.py` script is designed for zero-shot node classification tasks using multimodal large language models (MLLMs). Below are the key command-line arguments for this script:
Parameter | Type | Default Value | Description |
---|---|---|---|
--model_name | str | 'meta-llama/Llama-3.2-11B-Vision-Instruct' | HuggingFace model name or path. |
--dataset_name | str | 'Movies' | Name of the dataset (corresponds to a subdirectory in the `Data` folder). |
--base_dir | str | Project root directory | Path to the root directory of the project. |
--max_new_tokens | int | 15 | Maximum number of tokens to generate. |
--neighbor_mode | str | 'both' | Mode for using neighbor information (`text`, `image`, or `both`). |
--use_center_text | str | 'True' | Whether to use the center node's text. |
--use_center_image | str | 'True' | Whether to use the center node's image. |
--add_CoT | str | 'False' | Whether to add Chain of Thought (CoT) reasoning. |
--num_samples | int | 5 | Number of test samples to evaluate. |
--num_neighbours | int | 0 | Number of neighbors to consider for each node. |
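Note that `--use_center_text`, `--use_center_image`, and `--add_CoT` are typed as strings in the table above, so their values arrive as the text `'True'` or `'False'`. The sketch below shows one way string flags like these can be converted to booleans. It is an assumption about usage, not the script's actual code; `str2bool` and `build_parser` are hypothetical names, and only a subset of the flags is declared.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert a textual flag such as 'True' or 'False' into a real boolean."""
    lowered = value.strip().lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean string, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Declare an illustrative subset of the flags listed in the table."""
    parser = argparse.ArgumentParser(description="Illustrative subset of Zero-shot.py flags")
    parser.add_argument("--use_center_text", type=str, default="True")
    parser.add_argument("--use_center_image", type=str, default="True")
    parser.add_argument("--add_CoT", type=str, default="False")
    parser.add_argument("--neighbor_mode", type=str, default="both",
                        choices=["text", "image", "both"])
    parser.add_argument("--num_neighbours", type=int, default=0)
    return parser
```

The explicit conversion matters because any non-empty string is truthy in Python: `bool("False")` evaluates to `True`, so the raw flag value should not be used directly in a condition.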
Below, we present the code for performing zero-shot node classification on the `Movies` dataset using the `LLaMA-3.2-11B Vision Instruct` model with different strategies. This is provided to help researchers reproduce the experimental results presented in our paper.
$\text{Center-only}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_samples 300 --max_new_tokens 30 --dataset_name Movies

$\text{GRE-T}_{k=1}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_neighbours 1 --neighbor_mode text --num_samples 300 --max_new_tokens 30 --dataset_name Movies

$\text{GRE-V}_{k=1}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_neighbours 1 --neighbor_mode image --num_samples 300 --max_new_tokens 30 --dataset_name Movies

$\text{GRE-M}_{k=1}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_neighbours 1 --neighbor_mode both --num_samples 300 --max_new_tokens 30 --dataset_name Movies

Please note that the VLMs and GNNs use the same original test set for the node classification task. However, for efficiency during VLM testing, we randomly selected 300 samples from this test set. We observed that the results obtained on this subset did not deviate significantly from those obtained on the complete test set.
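The four strategies differ only in whether neighbors are used and in the value of `--neighbor_mode`, so the commands can be generated rather than typed by hand. The sketch below builds the argument list for each strategy using only flags from the table above. `STRATEGIES` and `build_command` are hypothetical helpers, and the dataset name defaults to `Movies`, the default listed for `--dataset_name`.

```python
import shlex

# Strategy name -> neighbor mode (None means the center node only).
STRATEGIES = {
    "Center-only": None,
    "GRE-T": "text",
    "GRE-V": "image",
    "GRE-M": "both",
}


def build_command(strategy, dataset_name="Movies", k=1,
                  model_name="meta-llama/Llama-3.2-11B-Vision-Instruct",
                  num_samples=300, max_new_tokens=30):
    """Return the argv list for one zero-shot run of the given strategy."""
    neighbor_mode = STRATEGIES[strategy]
    args = ["python", "MLLM/Zero-shot.py", "--model_name", model_name]
    if neighbor_mode is not None:
        args += ["--num_neighbours", str(k), "--neighbor_mode", neighbor_mode]
    args += ["--num_samples", str(num_samples),
             "--max_new_tokens", str(max_new_tokens),
             "--dataset_name", dataset_name]
    return args


def all_commands(dataset_name="Movies", k=1):
    """Return one shell-ready command string per strategy."""
    return {name: shlex.join(build_command(name, dataset_name, k))
            for name in STRATEGIES}
```

The returned list can be passed directly to `subprocess.run`, while `all_commands` gives printable strings for a job script.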
The `load_model_and_processor` function in `MLLM/Library.py` is designed to load specific models and their corresponding processors from the Hugging Face library. If you want to use a model that is not currently supported, you can modify this function to include your custom model. Below is an example to guide you through the process.
Suppose you want to add support for a new model, `custom-org/custom-model-7B`, which uses the `AutoModelForCausalLM` class and `AutoProcessor`. Here's how you can modify the `load_model_and_processor` function:
- Open the `MLLM/Library.py` file.
- Locate the `model_mapping` dictionary inside the `load_model_and_processor` function.
- Add a new entry for your custom model.
Here is the modified code:

    from transformers import AutoModelForCausalLM, AutoProcessor, MllamaForConditionalGeneration


    def load_model_and_processor(model_name: str):
        """Load the model and processor based on the Hugging Face model name."""
        model_mapping = {
            "meta-llama/Llama-3.2-11B-Vision-Instruct": {
                "model_cls": MllamaForConditionalGeneration,
                "processor_cls": AutoProcessor,
            },
            "custom-org/custom-model-7B": {  # Add your custom model here
                "model_cls": AutoModelForCausalLM,  # Replace with the correct model class
                "processor_cls": AutoProcessor,  # Replace with the correct processor class
            },
            # Other existing models...
        }
        # Other existing code...
        return model, processor

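Because the loading logic itself is elided above, the following self-contained sketch illustrates the registry pattern with stand-in classes, so it runs without downloading any model. The stub classes, `MODEL_MAPPING`, and `load_from_registry` are all hypothetical; in the real function, the entries would be `transformers` classes, and the only assumption made here is that each class is instantiated through `from_pretrained`.

```python
class _StubLoader:
    """Stand-in for a Hugging Face class that exposes `from_pretrained`."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def from_pretrained(cls, name):
        return cls(name)


class StubVisionModel(_StubLoader):
    pass


class StubCausalLM(_StubLoader):
    pass


class StubProcessor(_StubLoader):
    pass


MODEL_MAPPING = {
    "meta-llama/Llama-3.2-11B-Vision-Instruct": {
        "model_cls": StubVisionModel,
        "processor_cls": StubProcessor,
    },
    "custom-org/custom-model-7B": {  # the newly registered entry
        "model_cls": StubCausalLM,
        "processor_cls": StubProcessor,
    },
}


def load_from_registry(model_name):
    """Look up the classes registered for `model_name` and instantiate them."""
    try:
        entry = MODEL_MAPPING[model_name]
    except KeyError:
        supported = ", ".join(sorted(MODEL_MAPPING))
        raise ValueError(
            f"Unsupported model {model_name!r}; supported: {supported}"
        ) from None
    model = entry["model_cls"].from_pretrained(model_name)
    processor = entry["processor_cls"].from_pretrained(model_name)
    return model, processor
```

Keeping the mapping as plain data means that supporting another model is a one-entry change, and an unknown name fails early with a message listing the supported options.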
We welcome contributions to MAGB. To contribute:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a detailed description of your changes.
For major changes, please open an issue first to discuss what you would like to change.
If you use MAGB in your research, please cite our paper:

    @misc{yan2025graphmeetsmultimodalbenchmarking,
      title={When Graph meets Multimodal: Benchmarking and Meditating on Multimodal Attributed Graphs Learning},
      author={Hao Yan and Chaozhuo Li and Jun Yin and Zhigang Yu and Weihao Han and Mingzheng Li and Zhengxin Zeng and Hao Sun and Senzhang Wang},
      year={2025},
      eprint={2410.09132},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2410.09132},
    }
