Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up

Official repository of the paper "Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models"

License

NotificationsYou must be signed in to change notification settings

CoderChen01/LVLMSarcasmAnalysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


arXiv

With the advent of large vision-language models (LVLMs) demonstrating increasingly human-like abilities, a pivotal question emerges: do different LVLMs interpret multimodal sarcasm differently, and can a single model grasp sarcasm from multiple perspectives like humans? To explore this, we introduce an analytical framework using systematically designed prompts on existing multimodal sarcasm datasets. Evaluating 12 state-of-the-art LVLMs over 2,409 samples, we examine interpretive variations within and across models, focusing on confidence levels, alignment with dataset labels, and recognition of ambiguous ``neutral'' cases. Our findings reveal notable discrepancies---across LVLMs and within the same model under varied prompts. While classification-oriented prompts yield higher internal consistency, models diverge markedly when tasked with interpretive reasoning. These results challenge binary labeling paradigms by highlighting sarcasm’s subjectivity. We advocate moving beyond rigid annotation schemes toward multi-perspective, uncertainty-aware modeling, offering deeper insights into multimodal sarcasm comprehension.

framework-overview


ℹ️ Installation

poetry install

If you don't installpipx andpoetry yet, recommend to install them first.

python3 -m pip install --user pipxpython3 -m pipx ensurepath

Then, installpoetry viapipx.

pipx install poetry

You also can follow the official installation guide:https://python-poetry.org/docs/#installation.

🕹 Evaluation

For each model evaluation, perform the following operations.

🤖 Start the OpenAI Compatable Server for the specific model

vllm serve<hf-model-id>  --task generate --trust-remote-code  --limit-mm-per-prompt image=1

🪄 Run the evaluation script

lvlm-sarc-evaluator --dataset-path<path> --dataset-name<name [optional]> --dataset-split<split-name [optional]>  --output-path<output-path>  --config-file-path<config-path>  vllm --model<hf-model-id>   --num-proc<num-proc>

Config File

We introduced configuration files to control the behavior during the operation oflvlm-sarc-evaluator to improve fault tolerance, such as dynamically configuring the api_url and api_key of each model, and dynamically starting and pausing evaluation requests.examples/evaluator_config.json is an example of this configuration file, you can use it directly by specifying it directly through--config-path.

📉 Analysis

After performing evaluation for each model, we can get the final dataset and execute the following instructions to reproduce the results in our paper.

Inter-Prompt Consistency Analysis

lvlm-sarc-analyzer --data-path<final-dataset-path> --output-path<result-output-path> --config-path<config-path> -A inter_prompt

Agreement with Ground Truth Analysis

lvlm-sarc-analyzer --data-path<final-dataset-path> --output-path<result-output-path> --config-path<config-path> -A agreement_gt

Model Confidence Analysis

lvlm-sarc-analyzer --data-path<final-dataset-path> --output-path<result-output-path> --config-path<config-path> -A model_nll

Neutral Label Aalysis

lvlm-sarc-analyzer --data-path<final-dataset-path> --output-path<result-output-path> --config-path<config-path> -A neutral_label

Config File

lvlm-sarc-analyzer will automatically draw the data graph. However, due to the long model name, the chart layout is not good, so we introduced a configuration file to configure short name for the model name to facilitate better layout of the data graph.examples/analyzer_config.json is an example of this configuration file, you can use it directly by specifying it directly through--config-path.

About

Official repository of the paper "Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models"

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages


[8]ページ先頭

©2009-2025 Movatter.jp