
Gender and Race Bias in Consumer Product Recommendations by Large Language Models

Ke Xu, Shera Potka, Alex Thomo

TL;DR

The study investigates gender and race biases in LLM-generated consumer product recommendations by applying prompt engineering to elicit demographic-specific outputs and analyzing them with three methods: Marked Words, SVM, and Jensen-Shannon Divergence. The integrated approach reveals consistent disparities in linguistic cues and product-category emphasis across race and gender groups, indicating implicit biases embedded in LLM-driven recommendations. Key contributions include a systematic bias-detection framework, empirical evidence of biases, and guidance for designing fairer, more inclusive recommendation systems. These findings have practical implications for developers and policymakers aiming to reduce bias in AI-powered shopping assistants and to improve fairness in personalized recommendations.

Abstract

Large Language Models (LLMs) are increasingly employed in generating consumer product recommendations, yet their potential for embedding and amplifying gender and race biases remains underexplored. This paper is one of the first attempts to examine these biases within LLM-generated recommendations. We leverage prompt engineering to elicit product suggestions from LLMs for various race and gender groups and employ three analytical methods (Marked Words, Support Vector Machines, and Jensen-Shannon Divergence) to identify and quantify biases. Our findings reveal significant disparities in the recommendations for demographic groups, underscoring the need for more equitable LLM recommendation systems.
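
As an illustration of the third method named in the abstract, the following is a minimal sketch of how a Jensen-Shannon divergence between the recommendation distributions of two groups could be computed. It is not the authors' implementation: the function name `js_divergence`, the base-2 logarithm, and the product-category counts are assumptions made up for this example, and the paper may apply the measure to words rather than categories.

```python
import math
from collections import Counter


def js_divergence(counts_a, counts_b):
    """Jensen-Shannon divergence (log base 2) between two count tables.

    Returns a value in [0, 1]: 0 for identical distributions,
    1 for distributions with no shared items.
    """
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())

    # Normalise raw counts into probability distributions over a shared vocabulary.
    p = {w: counts_a.get(w, 0) / total_a for w in vocab}
    q = {w: counts_b.get(w, 0) / total_b for w in vocab}
    # Mixture distribution M = (P + Q) / 2.
    m = {w: 0.5 * (p[w] + q[w]) for w in vocab}

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x[w] == 0 contribute nothing.
        return sum(x[w] * math.log2(x[w] / y[w]) for w in vocab if x[w] > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical category counts for two demographic groups (invented for illustration).
group_a = Counter({"skincare": 6, "electronics": 2, "books": 2})
group_b = Counter({"skincare": 2, "electronics": 6, "books": 2})

print(f"JSD between groups: {js_divergence(group_a, group_b):.4f}")
```

A value near 0 would suggest that the two groups receive similar recommendations, while larger values indicate that the distributions diverge. Because the measure is symmetric and bounded, scores for different group pairs can be compared directly.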

Paper Structure

This paper contains 12 sections, 11 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Comparison of recommendations for demographic groups (I).
  • Figure 2: Comparison of recommendations for demographic groups (II).
  • Figure 3: Comparison of recommendations for demographic groups (III).