Official repository of the paper "Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models"
CoderChen01/LVLMSarcasmAnalysis
# 🎉 Seeing Sarcasm Through Different Eyes: Analyzing Multimodal Sarcasm Perception in Large Vision-Language Models
With the advent of large vision-language models (LVLMs) demonstrating increasingly human-like abilities, a pivotal question emerges: do different LVLMs interpret multimodal sarcasm differently, and can a single model grasp sarcasm from multiple perspectives like humans? To explore this, we introduce an analytical framework using systematically designed prompts on existing multimodal sarcasm datasets. Evaluating 12 state-of-the-art LVLMs over 2,409 samples, we examine interpretive variations within and across models, focusing on confidence levels, alignment with dataset labels, and recognition of ambiguous "neutral" cases. Our findings reveal notable discrepancies, both across LVLMs and within the same model under varied prompts. While classification-oriented prompts yield higher internal consistency, models diverge markedly when tasked with interpretive reasoning. These results challenge binary labeling paradigms by highlighting sarcasm's subjectivity. We advocate moving beyond rigid annotation schemes toward multi-perspective, uncertainty-aware modeling, offering deeper insights into multimodal sarcasm comprehension.
## Installation

If you have not installed `pipx` and `poetry` yet, we recommend installing them first:

```shell
python3 -m pip install --user pipx
python3 -m pipx ensurepath
```

Then, install `poetry` via `pipx`:

```shell
pipx install poetry
```

You can also follow the official installation guide: https://python-poetry.org/docs/#installation.

Finally, install the project dependencies:

```shell
poetry install
```
## Evaluation

For each model, perform the following steps.

First, serve the model with vLLM:

```shell
vllm serve <hf-model-id> --task generate --trust-remote-code --limit-mm-per-prompt image=1
```

Then, run the evaluator against the served model:

```shell
lvlm-sarc-evaluator --dataset-path <path> --dataset-name <name [optional]> --dataset-split <split-name [optional]> --output-path <output-path> --config-file-path <config-path> vllm --model <hf-model-id> --num-proc <num-proc>
```
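When evaluating several models in a row, it can help to generate the evaluator commands in one place. This is only a sketch: the helper name, model IDs, paths, and `--num-proc` value are hypothetical placeholders, and the function prints each command instead of executing it (pipe the output to `sh` to run them, assuming a vLLM server is already serving each model).

```shell
# Hypothetical helper (not part of the repository): prints one
# lvlm-sarc-evaluator invocation per Hugging Face model ID.
# Paths and --num-proc below are placeholders; adjust to your setup.
print_eval_cmds() {
  for model in "$@"; do
    printf 'lvlm-sarc-evaluator --dataset-path ./dataset --output-path ./results --config-file-path examples/evaluator_config.json vllm --model %s --num-proc 8\n' "$model"
  done
}

# Dry run: print the commands for two placeholder model IDs.
print_eval_cmds "org/model-a" "org/model-b"
```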
### Config File

We introduce a configuration file to control the runtime behavior of `lvlm-sarc-evaluator` and improve fault tolerance, e.g., by dynamically configuring the `api_url` and `api_key` of each model and by dynamically starting and pausing evaluation requests. `examples/evaluator_config.json` is an example of such a configuration file; you can use it directly by passing it via `--config-file-path`.
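The authoritative schema is the bundled `examples/evaluator_config.json`; as a purely hypothetical illustration of the kind of settings mentioned above (per-model `api_url`/`api_key` and a pause switch), such a file might look like the following. The field names here are illustrative guesses, not the tool's real schema.

```json
{
  "models": {
    "org/model-a": {
      "api_url": "http://localhost:8000/v1",
      "api_key": "EMPTY"
    }
  },
  "paused": false
}
```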
## Analysis

After evaluating each model, we obtain the final dataset and can run the following commands to reproduce the results in our paper:

```shell
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A inter_prompt
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A agreement_gt
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A model_nll
lvlm-sarc-analyzer --data-path <final-dataset-path> --output-path <result-output-path> --config-path <config-path> -A neutral_label
```
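The four analyses above differ only in the `-A` flag, so they can be generated from a small loop. A minimal sketch (the helper name is our own, and it only prints the commands; pipe the output to `sh` to execute them):

```shell
# Hypothetical helper (not part of the repository): prints one
# lvlm-sarc-analyzer invocation per analysis, given the data path,
# output path, and config path as arguments.
print_analysis_cmds() {
  for analysis in inter_prompt agreement_gt model_nll neutral_label; do
    printf 'lvlm-sarc-analyzer --data-path %s --output-path %s --config-path %s -A %s\n' \
      "$1" "$2" "$3" "$analysis"
  done
}

# Dry run with placeholder paths.
print_analysis_cmds ./final-dataset ./analysis-results examples/analyzer_config.json
```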
### Config File

`lvlm-sarc-analyzer` automatically plots the resulting data. However, because the full model names are long, the chart layout suffers; we therefore introduce a configuration file that maps each model name to a short name for a better chart layout. `examples/analyzer_config.json` is an example of such a configuration file; you can use it directly by passing it via `--config-path`.
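The authoritative schema is the bundled `examples/analyzer_config.json`; as a purely hypothetical sketch of a name-shortening map of the kind described above (the key name and model ID are illustrative guesses, not the tool's real schema):

```json
{
  "short_names": {
    "org/very-long-model-name-7b-instruct": "ModelA-7B"
  }
}
```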