DR.GAP: Mitigating Bias in Large Language Models using Gender-Aware Prompting with Demonstration and Reasoning

Hongye Qiu, Yue Xu, Meikang Qiu, Wenjie Wang

TL;DR

DR.GAP addresses the challenge of mitigating gender bias in LLMs without sacrificing task performance by auto-selecting bias-revealing demonstrations and generating structured, gender-neutral reasoning via a reference-model guided pipeline. The method is model-agnostic and extends to vision-language models, with experiments across multiple LLMs and VLMs showing substantial bias reduction on coreference and QA tasks while preserving utility. Key contributions include automatic demonstration selection, a four-module reasoning workflow (Initial Reasoning, Verification, Gender-Independent Filtering, Iterative Refinement), and formalization of prompts. The work advances fairer AI by offering a scalable debiasing approach applicable to open-source and black-box systems, with robust generalization across datasets and tasks.

Abstract

Large Language Models (LLMs) exhibit strong natural language processing capabilities but also inherit and amplify societal biases, including gender bias, raising fairness concerns. Existing debiasing methods face significant limitations: parameter tuning requires access to model weights, prompt-based approaches often degrade model utility, and optimization-based techniques lack generalizability. To address these challenges, we propose DR.GAP (Demonstration and Reasoning for Gender-Aware Prompting), an automated and model-agnostic approach that mitigates gender bias while preserving model performance. DR.GAP selects bias-revealing examples and generates structured reasoning to guide models toward more impartial responses. Extensive experiments on coreference resolution and QA tasks across multiple LLMs (GPT-3.5, Llama3, and Llama2-Alpaca) demonstrate its effectiveness, generalization ability, and robustness. DR.GAP can generalize to vision-language models (VLMs), achieving significant bias reduction.

Paper Structure

This paper contains 29 sections, 5 figures, 10 tables.

Figures (5)

  • Figure 1: The pipeline of DR.GAP. Step 1: Build a representative dataset that reveals gender bias in the target LLM, made up of examples that the target LLM answers incorrectly but the reference LLM answers correctly. Step 2: Generate reasoning and demonstrations that focus on semantic information rather than gender-specific details, using Initial Reasoning, Reasoning Verification, Gender-Independent Filtering, and Iterative Refinement. Step 3: Among the reasoning produced at each step, select the one that most effectively mitigates gender bias on the development set and use it as the system prompt.
  • Figure 2: Performance of different methods on GPT-3.5, Llama3, and Llama2-Alpaca, with bias mitigation ($\Delta Bias$) on the x-axis and accuracy change ($\Delta Acc$) on the y-axis. Colors distinguish the methods, and marker shapes distinguish the datasets. The symbol $\star$ marks the center of each ellipse, which reflects the overall performance of the method across the datasets.
  • Figure 3: Generalization of DR.GAP's debiasing effect across datasets, with the best results highlighted with blue edges. The x-axis shows the source dataset used to generate the reasoning, and the y-axis shows the target dataset used for evaluation.
  • Figure 4: Resolution accuracy and bias on VisoGender for the Qwen2-VL, InstructBlip, and Llava-1.5 models under different system prompts.
  • Figure 5: Detailed results of VLMs on VisoGender dataset by category. Single, Two, Same, and Diff denote scenes with one person, two people, same-gender pairs, and different-gender pairs, respectively.
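
The three steps described in the Figure 1 caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Example` type, the `Model` and `Stage` callable interfaces, and the `dev_bias` scoring function are all hypothetical names introduced here, and the sketch assumes that "most effectively mitigates gender bias" can be read as "lowest bias score on the development set". The four reasoning modules would be supplied as `stages`; their prompts are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Example:
    question: str
    answer: str  # gold label


# Hypothetical interface: (system_prompt, question) -> predicted answer.
Model = Callable[[str, str], str]
# Hypothetical interface: (example, previous_reasoning) -> new reasoning.
Stage = Callable[[Example, str], str]


def select_bias_revealing(
    examples: Sequence[Example], target: Model, reference: Model
) -> List[Example]:
    """Step 1: keep examples the target answers incorrectly
    and the reference answers correctly."""
    return [
        ex
        for ex in examples
        if target("", ex.question) != ex.answer
        and reference("", ex.question) == ex.answer
    ]


def generate_candidates(example: Example, stages: Sequence[Stage]) -> List[str]:
    """Step 2: run the stages in order, feeding each one the previous
    reasoning, and keep every intermediate result as a candidate."""
    candidates: List[str] = []
    reasoning = ""
    for stage in stages:
        reasoning = stage(example, reasoning)
        candidates.append(reasoning)
    return candidates


def select_system_prompt(
    candidates: Sequence[str], dev_bias: Callable[[str], float]
) -> str:
    """Step 3: pick the candidate with the lowest bias score
    (assumed to be measured on the development set)."""
    return min(candidates, key=dev_bias)
```

In use, `target` and `reference` would wrap calls to the two LLMs, `stages` would hold one callable per reasoning module, and `dev_bias` would run the target model on the development set with a candidate as the system prompt and return its measured bias.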