Official repository of the paper "Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models"
CoderChen01/LVLMSarcasmAnalysis
# 🎉 Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models
With the advent of large vision-language models (LVLMs) demonstrating increasingly human-like abilities, a pivotal question emerges: do different LVLMs interpret multimodal sarcasm differently, and can a single model grasp sarcasm from multiple perspectives like humans? To explore this, we introduce an analytical framework using systematically designed prompts on existing multimodal sarcasm datasets. Evaluating 12 state-of-the-art LVLMs over 2,409 samples, we examine interpretive variations within and across models, focusing on confidence levels, alignment with dataset labels, and recognition of ambiguous "neutral" cases. Our findings reveal notable discrepancies, both across LVLMs and within the same model under varied prompts. While classification-oriented prompts yield higher internal consistency, models diverge markedly when tasked with interpretive reasoning. These results challenge binary labeling paradigms by highlighting sarcasm's subjectivity. We advocate moving beyond rigid annotation schemes toward multi-perspective, uncertainty-aware modeling, offering deeper insights into multimodal sarcasm comprehension.
## Installation

If you have not installed `pipx` and `poetry` yet, we recommend installing them first:

```shell
python3 -m pip install --user pipx
python3 -m pipx ensurepath
```

Then, install `poetry` via `pipx`:

```shell
pipx install poetry
```

You can also follow the official installation guide: https://python-poetry.org/docs/#installation.

Finally, install the project dependencies:

```shell
poetry install
```
## Evaluation

For each model, perform the following steps.

First, serve the model with vLLM:

```shell
vllm serve <hf-model-id> --task generate --trust-remote-code --limit-mm-per-prompt image=1
```

Then, run the evaluator against the served model:

```shell
lvlm-sarc-evaluator --dataset-path <path> --dataset-name <name [optional]> --dataset-split <split-name [optional]> --output-path <output-path> --config-file-path <config-path> vllm --model <hf-model-id> --num-proc <num-proc>
```
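When evaluating several models in a row, it can help to generate the evaluator commands in one place. This is only a sketch: the helper name, model IDs, paths, and `--num-proc` value are hypothetical placeholders, and the function prints each command instead of executing it (pipe the output to `sh` to run them, assuming a vLLM server is already serving each model).

```shell
# Hypothetical helper (not part of the repository): prints one
# lvlm-sarc-evaluator invocation per Hugging Face model ID.
# Paths and --num-proc below are placeholders; adjust to your setup.
print_eval_cmds() {
  for model in "$@"; do
    printf 'lvlm-sarc-evaluator --dataset-path ./dataset --output-path ./results --config-file-path examples/evaluator_config.json vllm --model %s --num-proc 8\n' "$model"
  done
}

# Dry run: print the commands for two placeholder model IDs.
print_eval_cmds "org/model-a" "org/model-b"
```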
### Config File

We introduce a configuration file to control the runtime behavior of `lvlm-sarc-evaluator` and improve fault tolerance, e.g., by dynamically configuring the `api_url` and `api_key` of each model and by dynamically starting and pausing evaluation requests. `examples/evaluator_config.json` is an example of such a configuration file; you can use it directly by passing it via `--config-file-path`.
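The authoritative schema is the bundled `examples/evaluator_config.json`; as a purely hypothetical illustration of the kind of settings mentioned above (per-model `api_url`/`api_key` and a pause switch), such a file might look like the following. The field names here are illustrative guesses, not the tool's real schema.

```json
{
  "models": {
    "org/model-a": {
      "api_url": "http://localhost:8000/v1",
      "api_key": "EMPTY"
    }
  },
  "paused": false
}
```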
## Analysis

After evaluating each model, we obtain the final dataset and can run the following commands to reproduce the results in our paper:

```shell
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A inter_prompt
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A agreement_gt
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A model_nll
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A neutral_label
```
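The four analyses above differ only in the `-A` flag, so they can be generated from a small loop. A minimal sketch (the helper name is our own, and it only prints the commands; pipe the output to `sh` to execute them):

```shell
# Hypothetical helper (not part of the repository): prints one
# lvlm-sarc-analyzer invocation per analysis, given the data path,
# output path, and config path as arguments.
print_analysis_cmds() {
  for analysis in inter_prompt agreement_gt model_nll neutral_label; do
    printf 'lvlm-sarc-analyzer --data-path %s --output-path %s --config-path %s -A %s\n' \
      "$1" "$2" "$3" "$analysis"
  done
}

# Dry run with placeholder paths.
print_analysis_cmds ./final-dataset ./analysis-results examples/analyzer_config.json
```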
### Config File

`lvlm-sarc-analyzer` automatically plots the resulting data. However, because the full model names are long, the chart layout suffers; we therefore introduce a configuration file that maps each model name to a short name for a better chart layout. `examples/analyzer_config.json` is an example of such a configuration file; you can use it directly by passing it via `--config-path`.
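The authoritative schema is the bundled `examples/analyzer_config.json`; as a purely hypothetical sketch of a name-shortening map of the kind described above (the key name and model ID are illustrative guesses, not the tool's real schema):

```json
{
  "short_names": {
    "org/very-long-model-name-7b-instruct": "ModelA-7B"
  }
}
```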