Prompting Techniques for Reducing Social Bias in LLMs through System 1 and System 2 Cognitive Processes

Mahammed Kamruzzaman, Gene Louis Kim

TL;DR

The paper investigates prompting techniques inspired by dual-process theory (System 1 vs System 2) to reduce social biases in LLMs. By combining CoT reasoning, fast/slow prompts, and human or machine personas across nine bias categories and five LLMs, it demonstrates that human personas and System 2 cues—especially when paired with explicit debiasing prompts—consistently lower stereotypical judgments, with model- and bias-specific variation. The study provides practical guidance for bias mitigation via prefix-based prompts suitable for closed and open models, and highlights the strongest gains when System 2 prompts are augmented with a human persona and debiasing instructions. Limitations include the mapping of cognitive theories to LLMs, English-only datasets, and unassessed trade-offs with task performance, suggesting avenues for broader multilingual evaluations and integrated bias-quality assessments.

Abstract

Dual process theory posits that human cognition arises via two systems: System 1, a quick, emotional, and intuitive process that is subject to cognitive biases, and System 2, a slow, onerous, and deliberate process. Prior research found that chain-of-thought (CoT) prompting in LLMs, which has often been compared to System 2 reasoning, can reduce gender bias. Along these lines, we investigate the relationship between bias, CoT prompting, direct debiasing, and dual process theory modeling in LLMs. We compare zero-shot CoT, debiasing, and dual process theory-based prompting strategies on two bias datasets spanning nine different social bias categories. We incorporate human and machine personas to determine whether LLM modeling of the effects of dual process theory exists independent of explicit persona models or is tied to the LLM's modeling of human-like generation. We find that a human persona, debiasing, System 2, and CoT prompting all tend to reduce social biases in LLMs, though the best combination of features depends on the exact model and bias category, with reductions of up to 33 percent in stereotypical judgments by an LLM.
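
The prefix-based strategies described above can be pictured as a small grid of prompt components (persona, reasoning mode, optional debiasing instruction) prepended to a bias-benchmark question. The following is a minimal sketch under stated assumptions, not the authors' implementation: the prefix wording, the names `PERSONA`, `MODE`, `DEBIAS`, `build_prompt`, and `all_variants`, and the choice to treat CoT and System 1/System 2 as one mutually exclusive axis are all hypothetical. The paper's exact prompts and combinations are not reproduced here.

```python
from itertools import product

# Hypothetical prefix wording; placeholders, not the paper's actual prompts.
PERSONA = {
    None: "",
    "human": "You are a person.",
    "machine": "You are a machine.",
}

# Assumption: reasoning mode is modeled as a single axis for simplicity.
MODE = {
    None: "",
    "system1": "Answer quickly, relying on your first intuition.",
    "system2": "Answer slowly and deliberately, thinking the question through.",
    "cot": "Let's think step by step.",  # a common zero-shot CoT cue
}

# Hypothetical explicit debiasing instruction.
DEBIAS = "Make sure your answer does not rely on stereotypes."


def build_prompt(question, persona=None, mode=None, debias=False):
    """Join the selected prefixes and the question, skipping empty parts."""
    parts = [PERSONA[persona], MODE[mode], DEBIAS if debias else "", question]
    return "\n".join(p for p in parts if p)


def all_variants(question):
    """Return every persona x mode x debias combination for one question."""
    return {
        (persona, mode, debias): build_prompt(question, persona, mode, debias)
        for persona, mode, debias in product(PERSONA, MODE, [False, True])
    }
```

Calling `build_prompt(question, persona="human", mode="system2", debias=True)` yields the kind of configuration the summary reports as strongest (human persona, System 2 cue, debiasing instruction). With no options set, the function returns the bare question, which corresponds to standard prompting. Because everything is a text prefix, the same sketch applies to closed and open models alike.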

Paper Structure (54 sections, 4 figures, 10 tables)

Figures (4)

  • Figure 1: Example of Standard Prompting and Human Persona with System 2 Prompting for the Llama3.3 model in the race bias category.
  • Figure 2: Stereotypical responses for each prompt, averaged across all models and bias types. Here, MP stands for Machine Persona and HP stands for Human Persona.
  • Figure 3: Stereotypical responses for the debiasing prompt follow-up experiment (orange bars). The blue bars are anchors from Figure 2 for easy comparison.
  • Figure 4: Results with Standard Prompts and the best-performing prompts (those with the least stereotypical engagement) for each bias category and all LLMs. Here, MP stands for Machine Persona and HP stands for Human Persona.