Measuring Agreeableness Bias in Multimodal Models

Jaehyuk Lim; Bruce W. Lee

TL;DR

Multimodal models shift their responses significantly towards a pre-marked option in a question image, even when that option contradicts their answers in the neutral setting. This raises important questions about their application in critical decision-making contexts where such visual cues might be present.

Abstract

This paper examines a phenomenon in multimodal language models where pre-marked options in question images can significantly influence model responses. Our study employs a systematic methodology to investigate this effect: we present models with images of multiple-choice questions, which they initially answer correctly, then expose the same model to versions with pre-marked options. Our findings reveal a significant shift in the models' responses towards the pre-marked option, even when it contradicts their answers in the neutral settings. Comprehensive evaluations demonstrate that this agreeableness bias is a consistent and quantifiable behavior across various model architectures. These results show potential limitations in the reliability of these models when processing images with pre-marked options, raising important questions about their application in critical decision-making contexts where such visual cues might be present.
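The shift described above can be illustrated with a minimal sketch. It assumes that each question yields per-option log-probabilities for both the neutral and the pre-marked image, and it averages the change in linear probability of the marked option. The function names, the input layout, and the numbers are hypothetical and chosen for illustration; they are not the authors' implementation or results.

```python
import math
from statistics import mean


def to_linear(logprobs):
    """Convert per-option log-probabilities to linear probabilities."""
    return {option: math.exp(lp) for option, lp in logprobs.items()}


def mean_probability_shift(pairs):
    """Average change in the pre-marked option's linear probability.

    Each item in `pairs` is (neutral_logprobs, biased_logprobs, marked_option),
    where the two dicts map option letters to log-probabilities.
    """
    shifts = []
    for neutral, biased, marked in pairs:
        p_neutral = to_linear(neutral)[marked]
        p_biased = to_linear(biased)[marked]
        shifts.append(p_biased - p_neutral)
    return mean(shifts)


# Hypothetical values for illustration only, not results from the paper.
example = [
    (
        {"A": math.log(0.7), "B": math.log(0.1), "C": math.log(0.1), "D": math.log(0.1)},
        {"A": math.log(0.4), "B": math.log(0.1), "C": math.log(0.4), "D": math.log(0.1)},
        "C",  # option that was pre-marked in the biased image
    ),
]
shift = mean_probability_shift(example)
```

A positive value means that, on average, the model assigns more probability to the marked option once the mark is present.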

Paper Structure (7 sections, 6 figures, 2 tables)

Introduction
Method
Result
Analysis
Related Work
Limitations and Future Work
Conclusion

Figures (6)

Figure 1: A sample of HTML-rendered vMMLU prompt, neutral
Figure 2: A sample of HTML-rendered vMMLU prompt, option C bias
Figure 3: A sample of HTML-rendered vSocialIQa prompt, neutral
Figure 4: A sample of HTML-rendered vSocialIQa prompt, option B bias
Figure 5: Average change in linear probability between neutral and biased prompts for vMMLU (top row) and vSocialIQa (bottom row). The left column represents highlight bias. The top right plot displays size bias, and the bottom right plot shows highlight bias in a typical webpage format, where black text is highlighted in light blue. The type of bias strongly correlates with increased token probability for the corresponding answer choice.
...and 1 more figures
