How Inclusively do LMs Perceive Social and Moral Norms?

Michael Galarnyk, Agam Shah, Dipanwita Guhathakurta, Poojitha Nandigam, Sudheer Chava

TL;DR

The paper examines how inclusively language models (LMs) perceive social and moral norms across demographic groups. It introduces the Absolute Distance Alignment Metric (ADA-Met) to quantify ordinal alignment with human judgments on rules-of-thumb (RoTs) from Social Chemistry 101. It benchmarks 11 LMs against 100 human annotators and finds demographic biases: models align more closely with younger and wealthier individuals and underrepresent marginalized perspectives. Using zero-shot and descriptive prompts, including table-based descriptions, the study shows that prompt design can improve alignment for some models. Krippendorff's alpha indicates that LMs disagree with one another less than human annotators do, which points to potential gaps in reflecting minority perspectives and motivates more inclusive evaluation and training. Overall, the work emphasizes responsible AI development that accounts for diverse values in normative judgments.

Abstract

This paper discusses and contains offensive content. Language models (LMs) are used in decision-making systems and as interactive assistants. However, how well do the judgements these models make align with the diversity of human values, particularly regarding social and moral norms? In this work, we investigate how inclusively LMs perceive norms across demographic groups (e.g., gender, age, and income). We prompt 11 LMs on rules-of-thumb (RoTs) and compare their outputs with the existing responses of 100 human annotators. We introduce the Absolute Distance Alignment Metric (ADA-Met) to quantify alignment on ordinal questions. We find notable disparities in LM responses: the models align more closely with younger, higher-income groups, raising concerns about the representation of marginalized perspectives. Our findings highlight the importance of further efforts to make LMs more inclusive of diverse human values. The code and prompts are available on GitHub under the CC BY-NC 4.0 license.

Paper Structure

This paper contains 36 sections, 5 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Rule of thumb definition, anticipated agreement question, and human and LM annotations.
  • Figure 2: Experimental pipeline of creating prompts, prompting LMs, extracting answers, and comparing LM-generated vs human responses.
  • Figure 3: LM alignment with demographic groups based on age, income, and gender. The circle positions correspond to demographic bins, rather than specific values.
  • Figure 4: Absolute distance alignment matrices allow for the comparison between demographic groups and LMs.
  • Figure 5: Zero-Shot Table histograms of ADA-Met values for different LMs across all RoTs. Arctic and Llama-3.1-405B align best with humans.