The Generation Gap: Exploring Age Bias in the Value Systems of Large Language Models

Siyang Liu; Trish Maturi; Bowen Yi; Siqi Shen; Rada Mihalcea

TL;DR

LLM values generally lean toward those of younger demographics, especially when compared with the US population, though the strength of this inclination varies across value categories.

Abstract

We explore the alignment of values in Large Language Models (LLMs) with specific age groups, leveraging data from the World Values Survey across thirteen categories. Using a diverse set of prompts tailored to ensure response robustness, we find a general inclination of LLM values towards younger demographics, especially when compared to the US population. Although this general inclination can be observed, we also find that its strength differs across value categories. Additionally, we explore the impact of incorporating age identity information in prompts and observe challenges in mitigating value discrepancies with different age cohorts. Our findings highlight the age bias in LLMs and provide insights for future work. Materials for our analysis are available at https://github.com/MichiganNLP/Age-Bias-In-LLMs.

Paper Structure (30 sections, 6 equations, 17 figures, 5 tables)

Introduction
Related Work
Analytic Method
Human Data Acquisition
Dataset.
Prompting
Models.
Prompts.
Measures
Aligning with Which Age on Which Values?
Trend Observation.
Case Study.
The Effect of Adding Identity in Prompts
Prompt Adjustment.
Observation on Gap Change.
...and 15 more sections

Figures (17)

Figure 1: Age-related bias in LLMs on thirteen human value categories. Human values in this figure refer specifically to US groups. Trend coefficients (see calculation in Sec. \ref{subsec:measures}) were derived from the slope of the changing gap between LLM and human values as age increases. A positive trend coefficient signifies a gap that widens from younger to older groups, indicating that the model leans towards younger age groups. The significance test is detailed in Appx. \ref{appx:significant}.
Figure 2: Alignment rank of LLM values over different age groups in specific countries. See results on more models and countries in Appendix \ref{sec:appendix_otherllms} and \ref{sec:appendix_othercountries}. Rank 1 for a specific age group means that this age group has the narrowest gap with the LLM in values. A rank that increases monotonically with age indicates closer alignment with younger groups.
Figure 3: Two WVS prompts and their responses from LLMs and humans (in purple).
Figure 4: Change in Euclidean distance after adding identity information. The comparison is between the values of ChatGPT and those of humans from different age groups in the US.
Figure 5: Value pyramid of the U.S. population (left) and ChatGPT (right) for an inquiry on the frequency of using radio news.
...and 12 more figures
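
The captions for Figures 1, 2, and 4 describe three related quantities: a Euclidean gap between LLM and human values, a trend coefficient taken from the slope of that gap as age increases, and an alignment rank per age group. The sketch below illustrates one plausible way to compute them. It is not the authors' implementation: the exact calculation is given in the paper's Measures section, which is not reproduced here. The function names, the use of the age-group index as the x-axis, the ordinary least-squares fit, and the toy data are all assumptions made for illustration.

```python
from math import sqrt


def euclidean_gap(llm_values, human_values):
    """Euclidean distance between two equal-length value vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(llm_values, human_values)))


def trend_coefficient(gaps):
    """Least-squares slope of the gap against the age-group index 0, 1, 2, ...

    Assumption: age groups are ordered from youngest to oldest and spaced
    evenly. A positive slope means the gap widens with age.
    """
    n = len(gaps)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(gaps) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, gaps))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def alignment_ranks(gaps):
    """Rank 1 goes to the age group with the narrowest gap."""
    order = sorted(range(len(gaps)), key=lambda i: gaps[i])
    ranks = [0] * len(gaps)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


# Hypothetical toy data, not taken from the paper: one LLM value vector and
# three human age groups ordered from youngest to oldest.
llm = [0.0, 0.0]
humans_by_age = [[3.0, 4.0], [6.0, 8.0], [9.0, 12.0]]

gaps = [euclidean_gap(llm, h) for h in humans_by_age]
slope = trend_coefficient(gaps)
ranks = alignment_ranks(gaps)
```

In this toy case the gap grows with each older group, so the slope is positive and the youngest group receives rank 1, which is the pattern the captions describe as a model leaning towards younger age groups.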
