Benchmarking for the attributed graphs


sktsherlock/MAGB


MAGB: A Comprehensive Benchmark for Multimodal Attributed Graphs

In many real-world scenarios, graph nodes are associated with multimodal attributes, such as texts and images, resulting in Multimodal Attributed Graphs (MAGs).

MAGB provides 5 datasets from e-commerce and social networks and evaluates two major paradigms: GNN-as-Predictor and VLM-as-Predictor. The datasets are publicly available:

🤗Hugging Face   |   📑Paper  



📖 Introduction

Multimodal attributed graphs (MAGs) incorporate multiple data types (e.g., text, images, numerical features) into graph structures, enabling more powerful learning and inference capabilities.
This benchmark provides:

  • Standardized datasets with multimodal attributes.
  • Feature extraction pipelines for different modalities.
  • Evaluation metrics to compare different models.
  • Baselines and benchmarks to accelerate research.


💻 Installation

Ensure you have the required dependencies installed before running the benchmark.

    # Clone the repository
    git clone https://github.com/sktsherlock/MAGB.git
    cd MAGB
    # Install dependencies
    pip install -r requirements.txt

🚀 Usage

1. Download the datasets from MAGB. 👐

    cd Data/
    sudo apt-get update && sudo apt-get install git-lfs && git clone https://huggingface.co/datasets/Sherirto/MAGB .
    ls

Now, you can see the Movies, Toys, Grocery, Reddit-S and Reddit-M datasets under the Data folder.

Each dataset consists of several parts, including:

  • Graph Data (*.pt): Stores the graph structure, including adjacency information and node labels. It can be loaded using DGL.
  • Node Textual Metadata (*.csv): Contains node textual descriptions, neighborhood relationships, and category labels.
  • Text, Image, and Multimodal Features (TextFeature/, ImageFeature/, MMFeature/): Pre-extracted embeddings from the MAGB paper for different modalities.
  • Raw Images (*.tar.gz): A compressed folder containing images named by node IDs. It needs to be extracted before use.
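
As a rough way to check a download against the layout above, the sketch below groups the files of one dataset folder by role. The function name `inventory_dataset` and the role labels are illustrative assumptions and not part of MAGB; only the file extensions and the three feature folder names come from the list above.

```python
from pathlib import Path

# Feature folder names as listed in the dataset description above.
FEATURE_DIRS = {"TextFeature", "ImageFeature", "MMFeature"}


def inventory_dataset(dataset_dir):
    """Group the files of one dataset folder by the roles described above."""
    root = Path(dataset_dir)
    groups = {"graph": [], "metadata": [], "features": [], "raw_images": [], "other": []}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        name = rel.as_posix()
        if FEATURE_DIRS & set(rel.parts[:-1]):
            groups["features"].append(name)      # pre-extracted embeddings
        elif path.suffix == ".pt":
            groups["graph"].append(name)         # graph structure and labels
        elif path.suffix == ".csv":
            groups["metadata"].append(name)      # node textual metadata
        elif path.name.endswith(".tar.gz"):
            groups["raw_images"].append(name)    # compressed raw images
        else:
            groups["other"].append(name)
    return groups
```

For example, `inventory_dataset("Data/Movies")["graph"]` should contain the `.pt` graph file if the download completed.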

Because the Reddit-M dataset is very large, you may need to use the following commands to reassemble and extract its images.

    cd MAGB/Data/
    cat RedditMImages_parta RedditMImages_partb RedditMImages_partc > RedditMImages.tar.gz
    tar -xvzf RedditMImages.tar.gz
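
The same reassembly can be done from Python, for instance on systems without `cat`. This is a minimal sketch using only the standard library; the helper name `reassemble_and_extract` is hypothetical, and the part names in the usage note are the ones from the command above.

```python
import shutil
import tarfile
from pathlib import Path


def reassemble_and_extract(parts, archive_path, extract_dir):
    """Concatenate split archive parts in order, then extract the tar.gz."""
    archive_path = Path(archive_path)
    with archive_path.open("wb") as merged:
        for part in parts:  # order matters: parta, partb, partc
            with Path(part).open("rb") as chunk:
                shutil.copyfileobj(chunk, merged)

    extract_dir = Path(extract_dir)
    extract_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(extract_dir)
    return archive_path
```

Called from inside the Data folder, this would look like `reassemble_and_extract(["RedditMImages_parta", "RedditMImages_partb", "RedditMImages_partc"], "RedditMImages.tar.gz", ".")`.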

2. Experiments

In this section, we demonstrate the execution code for both GNN-as-Predictor and VLM-as-Predictor.

GNN-as-Predictor

🧩 Node Classification

In the GNN/Library directory, we provide the code for the models evaluated in the paper, including GCN, GraphSAGE, GAT, RevGAT, and MLP. Additionally, we have added graph learning models such as APPNP, SGC, Node2Vec, and DeepWalk for your use. Below, we show the commands for node classification using GCN on the Movies dataset in two scenarios: 3-shot learning and supervised learning.

    # 3-shot learning
    python GNN/Library/GCN.py --graph_path Data/Movies/MoviesGraph.pt --feature Data/Movies/TextFeature/Movies_roberta_base_512_mean.npy --fewshots 3
    # Supervised learning (60% training, 20% validation)
    python GNN/Library/GCN.py --graph_path Data/Movies/MoviesGraph.pt --feature Data/Movies/TextFeature/Movies_roberta_base_512_mean.npy --train_ratio 0.6 --val_ratio 0.2
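
The two commands differ only in how training nodes are chosen. The sketch below illustrates one common way to implement both schemes: a ratio-based split and a per-class few-shot selection. It shows the general technique under stated assumptions and is not the code in GNN/Library; the function names and the `seed` argument are hypothetical.

```python
import random
from collections import defaultdict


def ratio_split(num_nodes, train_ratio=0.6, val_ratio=0.2, seed=0):
    """Shuffle node IDs and cut them into train/val/test by ratio."""
    ids = list(range(num_nodes))
    random.Random(seed).shuffle(ids)
    n_train = int(num_nodes * train_ratio)
    n_val = int(num_nodes * val_ratio)
    # Whatever remains after train and validation becomes the test set.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def few_shot_train_ids(labels, candidate_ids, shots=3, seed=0):
    """Pick up to `shots` training nodes per class from `candidate_ids`."""
    by_class = defaultdict(list)
    for node_id in candidate_ids:
        by_class[labels[node_id]].append(node_id)
    rng = random.Random(seed)
    picked = []
    for cls in sorted(by_class):
        nodes = by_class[cls]
        picked.extend(rng.sample(nodes, min(shots, len(nodes))))
    return sorted(picked)
```

With 3 shots and C classes, the few-shot training set holds at most 3 × C nodes, which is why this setting is much harder than the 60% supervised split.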

Note: The file Movies_roberta_base_512_mean.npy contains the textual features of the Movies dataset extracted using the RoBERTa-Base model. 512 indicates the maximum text length used, and mean indicates that mean pooling was applied to extract the features. You can use the features we provide or extract your own.
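
If you manage many feature files, the naming pattern described in this note can be parsed programmatically. The sketch below assumes every file follows `<dataset>_<encoder>_<max_length>_<pooling>.npy`, which is inferred only from the file names shown in this README; files that use a different pattern will raise an error. `FeatureFileInfo` and `parse_feature_filename` are hypothetical names.

```python
from pathlib import Path
from typing import NamedTuple


class FeatureFileInfo(NamedTuple):
    dataset: str
    encoder: str
    max_length: int
    pooling: str


def parse_feature_filename(path):
    """Split '<dataset>_<encoder>_<max_length>_<pooling>.npy' into its parts."""
    parts = Path(path).stem.split("_")
    if len(parts) < 4 or not parts[-2].isdigit():
        raise ValueError(f"unexpected feature file name: {path}")
    # The encoder name may itself contain underscores, so join the middle parts.
    return FeatureFileInfo(parts[0], "_".join(parts[1:-2]), int(parts[-2]), parts[-1])


print(parse_feature_filename("Data/Movies/TextFeature/Movies_roberta_base_512_mean.npy"))
# FeatureFileInfo(dataset='Movies', encoder='roberta_base', max_length=512, pooling='mean')
```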

Similarly, you can replace GCN.py with the corresponding script for other models, such as GraphSAGE.py, GAT.py, etc. All node classification training scripts require the graph data path and the corresponding feature file. Other basic parameters can be found in the GNN/Utils/model_config.py file.

Below are the key parameters related to model training, along with their default values and descriptions:

| Parameter | Type | Default Value | Description |
|---|---|---|---|
| --n-runs | int | 3 | Number of runs for averaging results. |
| --lr | float | 0.005 | Learning rate for model optimization. |
| --n-epochs | int | 1000 | Total number of training epochs. |
| --n-layers | int | 3 | Number of layers in the model. |
| --n-hidden | int | 256 | Number of hidden units per layer. |
| --dropout | float | 0.5 | Dropout rate to prevent overfitting. |
| --label-smoothing | float | 0.1 | Smoothing factor for label smoothing to reduce overfitting. |
| --train_ratio | float | 0.6 | Proportion of the dataset used for training. |
| --val_ratio | float | 0.2 | Proportion of the dataset used for validation. |
| --fewshots | int | None | Number of samples for few-shot learning. |
| --metric | str | 'accuracy' | Evaluation metric (e.g., accuracy, precision, recall, f1). |
| --average | str | 'macro' | Averaging method (e.g., weighted, micro, macro). |
| --graph_path | str | None | Path to the graph dataset file (e.g., .pt file). |
| --feature | str | None | Specifies the unimodal feature embedding to use as input. |
| --undirected | bool | True | Whether to treat the graph as undirected. |
| --selfloop | bool | True | Whether to add self-loops to the graph. |
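
The `--average` option names follow the usual multi-class conventions. As a self-contained illustration (not the benchmark's own metric code, which this README does not show), the sketch below computes F1 under each averaging method for single-label predictions. The function name `f1_scores` is hypothetical.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return macro, weighted, and micro F1 for single-label predictions."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    per_class = {c: f1(tp[c], fp[c], fn[c]) for c in classes}
    support = Counter(y_true)
    macro = sum(per_class.values()) / len(classes)            # every class counts equally
    weighted = sum(per_class[c] * support[c] for c in classes) / len(y_true)
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))  # pooled counts
    return {"macro": macro, "weighted": weighted, "micro": micro}


print(f1_scores([0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 0, 1]))
```

On this imbalanced toy input, macro F1 is 7/9 (about 0.778), weighted F1 is 22/27 (about 0.815), and micro F1 is 5/6 (about 0.833). Macro averaging penalizes the poorly predicted minority class the most, which is one reason it is a common default on imbalanced label sets.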

Note: Some models may have their own unique parameters, such as 'edge-drop' for RevGAT and GAT. For these parameters, please refer to the respective code for details.

🔗 Link Prediction

In the GNN/LinkPrediction directory, we provide the code for link prediction experiments using three backbone models: GCN, GraphSAGE, and MLP. Below, we demonstrate the command for running link prediction using GCN on the Movies dataset. The parameters for GraphSAGE and MLP are similar, and you can replace GCN.py with SAGE.py or MLP.py to run experiments with those models.

    python GNN/LinkPrediction/GCN.py \
        --n-hidden 256 \
        --n-layers 3 \
        --n-runs 5 \
        --lr 0.001 \
        --neg_len 5000 \
        --dropout 0.2 \
        --batch_size 2048 \
        --graph_path Data/Movies/MoviesGraph.pt \
        --feature Data/Movies/TextFeature/Movies_Llama_3.2_1B_Instruct_512_mean.npy \
        --link_path Data/LinkPrediction/Movies/

Below are the unique parameters specifically used for link prediction tasks:

| Parameter | Type | Default Value | Description |
|---|---|---|---|
| --neg_len | int | 5000 | Number of negative samples used for training. |
| --batch_size | int | 2048 | Batch size for training. |
| --link_path | str | None | Path to the directory containing link prediction data (e.g., positive and negative edges). |

These parameters are critical for handling the unique requirements of link prediction tasks, such as generating and managing negative samples, processing large datasets efficiently, and specifying the location of link prediction data.
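
To make the idea of negative samples concrete, the sketch below draws node pairs uniformly at random and keeps those that are not existing edges. This is a generic illustration under assumptions: the scripts in GNN/LinkPrediction read their edge data from the `--link_path` directory, and their actual sampling procedure is not described here. The function name and the `seed` argument are hypothetical.

```python
import random


def sample_negative_edges(num_nodes, positive_edges, neg_len=5000, seed=0):
    """Uniformly sample `neg_len` distinct node pairs that are not positive edges."""
    # Treat edges as undirected and ignore self-loops.
    positives = {frozenset(edge) for edge in positive_edges if edge[0] != edge[1]}
    max_pairs = num_nodes * (num_nodes - 1) // 2 - len(positives)
    if neg_len > max_pairs:
        raise ValueError("not enough non-edges to sample from")

    rng = random.Random(seed)
    negatives = set()
    while len(negatives) < neg_len:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        pair = frozenset((u, v))
        if u != v and pair not in positives:
            negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)
```

Rejection sampling like this is efficient on sparse graphs, where almost every random pair is a non-edge; on dense graphs it would be better to enumerate the non-edges directly.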

VLM-as-Predictor

The MLLM/Zero-shot.py script is designed for zero-shot node classification tasks using multimodal large language models (MLLMs). Below are the key command-line arguments for this script:

| Parameter | Type | Default Value | Description |
|---|---|---|---|
| --model_name | str | 'meta-llama/Llama-3.2-11B-Vision-Instruct' | HuggingFace model name or path. |
| --dataset_name | str | 'Movies' | Name of the dataset (corresponds to a subdirectory in the Data folder). |
| --base_dir | str | Project root directory | Path to the root directory of the project. |
| --max_new_tokens | int | 15 | Maximum number of tokens to generate. |
| --neighbor_mode | str | 'both' | Mode for using neighbor information (text, image, or both). |
| --use_center_text | str | 'True' | Whether to use the center node's text. |
| --use_center_image | str | 'True' | Whether to use the center node's image. |
| --add_CoT | str | 'False' | Whether to add Chain of Thought (CoT) reasoning. |
| --num_samples | int | 5 | Number of test samples to evaluate. |
| --num_neighbours | int | 0 | Number of neighbors to consider for each node. |
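
Note that the three on/off options in the table are typed as str. If you reuse that pattern in your own scripts, remember that any non-empty string, including 'False', is truthy in Python. The sketch below shows one conventional way to parse such flags with argparse. `str2bool` is a hypothetical helper, and how MLLM/Zero-shot.py itself converts these values is not shown in this README.

```python
import argparse


def str2bool(value):
    """Convert 'True'/'False'-style strings to a real bool."""
    text = str(value).strip().lower()
    if text in {"true", "1", "yes"}:
        return True
    if text in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean string, got {value!r}")


parser = argparse.ArgumentParser()
parser.add_argument("--use_center_text", type=str2bool, default=True)
parser.add_argument("--use_center_image", type=str2bool, default=True)
parser.add_argument("--add_CoT", type=str2bool, default=False)

# Parse an explicit argument list so the example does not depend on sys.argv.
args = parser.parse_args(["--use_center_image", "False"])
print(args.use_center_text, args.use_center_image, args.add_CoT)
# True False False
```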

Below, we present the commands for performing zero-shot node classification on the Movies dataset using the LLaMA-3.2-11B Vision Instruct model with different strategies. This is provided to help researchers reproduce the experimental results presented in our paper.

  1. $\text{Center-only}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_samples 300 --max_new_tokens 30 --dataset_name Movies

  2. $\text{GRE-T}_{k=1}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_neighbours 1 --neighbor_mode text --num_samples 300 --max_new_tokens 30 --dataset_name Movies

  3. $\text{GRE-V}_{k=1}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_neighbours 1 --neighbor_mode image --num_samples 300 --max_new_tokens 30 --dataset_name Movies

  4. $\text{GRE-M}_{k=1}$

    python MLLM/Zero-shot.py --model_name meta-llama/Llama-3.2-11B-Vision-Instruct --num_neighbours 1 --neighbor_mode both --num_samples 300 --max_new_tokens 30 --dataset_name Movies
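
To run all four strategies in one go, the commands can be generated from a small table of per-strategy arguments. The sketch below only builds the argument lists; the flag names and values come from the commands and parameter table in this section (with the dataset folder name `Movies`), while `BASE_ARGS`, `STRATEGIES`, and `build_command` are hypothetical names.

```python
import shlex

BASE_ARGS = {
    "model_name": "meta-llama/Llama-3.2-11B-Vision-Instruct",
    "dataset_name": "Movies",
    "num_samples": 300,
    "max_new_tokens": 30,
}

# Extra arguments per strategy, taken from the commands listed above (k = 1).
STRATEGIES = {
    "Center-only": {},
    "GRE-T": {"num_neighbours": 1, "neighbor_mode": "text"},
    "GRE-V": {"num_neighbours": 1, "neighbor_mode": "image"},
    "GRE-M": {"num_neighbours": 1, "neighbor_mode": "both"},
}


def build_command(strategy, k=1):
    """Return the argv list for one zero-shot strategy with k neighbors."""
    extra = dict(STRATEGIES[strategy])
    if extra:  # Center-only uses no neighbors, so k does not apply to it
        extra["num_neighbours"] = k
    argv = ["python", "MLLM/Zero-shot.py"]
    for name, value in {**BASE_ARGS, **extra}.items():
        argv += [f"--{name}", str(value)]
    return argv


for name in STRATEGIES:
    print(shlex.join(build_command(name)))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` from the repository root to launch the corresponding experiment.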

Please note that the VLMs and the GNNs use the same original test set for the node classification task. However, for efficiency during VLM testing, we randomly selected 300 samples from this test set. We observed that the results obtained on this subset did not deviate significantly from those obtained on the complete test set.
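
A reproducible subset of this kind can be drawn with a fixed random seed. The sketch below is an illustration only: the helper names and the seed value are assumptions, and the exact sampling code used for the paper is not shown in this README. It also includes the standard binomial estimate of how much an accuracy measured on n samples is expected to fluctuate.

```python
import math
import random


def sample_test_subset(test_ids, num_samples=300, seed=42):
    """Draw a reproducible random subset of the test node IDs."""
    test_ids = list(test_ids)
    if num_samples >= len(test_ids):
        return test_ids
    return sorted(random.Random(seed).sample(test_ids, num_samples))


def accuracy_standard_error(accuracy, num_samples):
    """Binomial standard error of an accuracy estimated from `num_samples` nodes."""
    return math.sqrt(accuracy * (1.0 - accuracy) / num_samples)


subset = sample_test_subset(range(10_000))
print(len(subset), round(accuracy_standard_error(0.5, 300), 4))
```

With 300 samples the standard error is at most about 0.029 (the worst case, at an accuracy of 0.5), so differences of a few points between the subset and the full test set are within the expected sampling noise.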

🔧 Customizing load_model_and_processor for Unsupported VLMs

The load_model_and_processor function in MLLM/Library.py is designed to load specific models and their corresponding processors from the Hugging Face library. If you want to use a model that is not currently supported, you can modify this function to include your custom model. Below is an example to guide you through the process.

Example: Adding Support for a Custom Model

Suppose you want to add support for a new model, custom-org/custom-model-7B, which uses the AutoModelForCausalLM class and AutoProcessor. Here's how you can modify the load_model_and_processor function:

  1. Open the MLLM/Library.py file.
  2. Locate the model_mapping dictionary inside the load_model_and_processor function.
  3. Add a new entry for your custom model.

Here is the modified code:

    def load_model_and_processor(model_name: str):
        """
        Load the model and processor based on the Hugging Face model name.
        """
        model_mapping = {
            "meta-llama/Llama-3.2-11B-Vision-Instruct": {
                "model_cls": MllamaForConditionalGeneration,
                "processor_cls": AutoProcessor,
            },
            "custom-org/custom-model-7B": {  # Add your custom model here
                "model_cls": AutoModelForCausalLM,  # Replace with the correct model class
                "processor_cls": AutoProcessor,  # Replace with the correct processor class
            },
            # Other existing models...
        }
        # Other existing codes...
        return model, processor
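
The elided part of the function is not reproduced in this README. As a hedged sketch of how a mapping like this is typically consumed, the example below looks up the two classes for a model name and calls `from_pretrained` on each. The stub classes stand in for the real Hugging Face classes so the example runs without downloading anything; `StubModel`, `StubProcessor`, `MODEL_MAPPING`, and `load_from_mapping` are all hypothetical names.

```python
class StubModel:
    """Stand-in for a Hugging Face model class (hypothetical, for illustration)."""

    @classmethod
    def from_pretrained(cls, name):
        return f"model:{name}"


class StubProcessor:
    """Stand-in for a Hugging Face processor class (hypothetical)."""

    @classmethod
    def from_pretrained(cls, name):
        return f"processor:{name}"


MODEL_MAPPING = {
    "custom-org/custom-model-7B": {
        "model_cls": StubModel,
        "processor_cls": StubProcessor,
    },
}


def load_from_mapping(model_name, mapping=MODEL_MAPPING):
    """Look up the classes registered for `model_name` and instantiate both."""
    try:
        entry = mapping[model_name]
    except KeyError:
        supported = ", ".join(sorted(mapping))
        raise ValueError(
            f"Unsupported model {model_name!r}; supported: {supported}"
        ) from None
    model = entry["model_cls"].from_pretrained(model_name)
    processor = entry["processor_cls"].from_pretrained(model_name)
    return model, processor


print(load_from_mapping("custom-org/custom-model-7B"))
```

Raising an error that lists the supported names makes it clear to users that a new model needs its own mapping entry rather than silently falling back to a default class.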

🤝 Contributing

We welcome contributions to MAGB. To contribute:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Submit a pull request with a detailed description of your changes.

For major changes, please open an issue first to discuss what you would like to change.

📚 Citation

If you use MAGB in your research, please cite our paper:

    @misc{yan2025graphmeetsmultimodalbenchmarking,
      title={When Graph meets Multimodal: Benchmarking and Meditating on Multimodal Attributed Graphs Learning},
      author={Hao Yan and Chaozhuo Li and Jun Yin and Zhigang Yu and Weihao Han and Mingzheng Li and Zhengxin Zeng and Hao Sun and Senzhang Wang},
      year={2025},
      eprint={2410.09132},
      archivePrefix={arXiv},
      url={https://arxiv.org/abs/2410.09132},
    }
