As an AI Language Model, "Yes I Would Recommend Calling the Police": Norm Inconsistency in LLM Decision-Making
Shomik Jain, D Calacci, Ashia Wilson
TL;DR
This work investigates norm inconsistency in large language models when making normative decisions about police intervention in Ring Neighbors surveillance videos. Using three state-of-the-art models (GPT-4, Gemini 1.0, Claude 3 Sonnet) on 928 real videos, the study demonstrates misalignment between factual crime content and the decision to call the police, and reveals biases linked to neighborhood demographics. Through regression analyses and analysis of response types and framing, the authors show substantial cross-model disagreement and model-specific patterns in how activity types and context influence normative judgments. The findings highlight the instability and opacity of normative decisions in high-stakes surveillance contexts, raise questions about bias mitigation, and argue for transparent, measurement-driven evaluation of normative behavior in foundation models. The work has practical implications for safety, policy, and the design of responsible AI systems in surveillance domains.
Abstract
We investigate the phenomenon of norm inconsistency: where LLMs apply different norms in similar situations. Specifically, we focus on the high-risk application of deciding whether to call the police in Amazon Ring home surveillance videos. We evaluate the decisions of three state-of-the-art LLMs -- GPT-4, Gemini 1.0, and Claude 3 Sonnet -- in relation to the activities portrayed in the videos, the subjects' skin-tone and gender, and the characteristics of the neighborhoods where the videos were recorded. Our analysis reveals significant norm inconsistencies: (1) a discordance between the recommendation to call the police and the actual presence of criminal activity, and (2) biases influenced by the racial demographics of the neighborhoods. These results highlight the arbitrariness of model decisions in the surveillance context and the limitations of current bias detection and mitigation strategies in normative decision-making.
