Understanding and Mitigating the Bias Inheritance in LLM-based Data Augmentation on Downstream Tasks

Miaomiao Li, Hao Chen, Yang Wang, Tingyuan Zhu, Weijia Zhang, Kaijie Zhu, Kam-Fai Wong, Jindong Wang

TL;DR

The paper investigates bias inheritance when synthetic data generated by LLMs is used to fine-tune downstream models. It introduces a six-type, multi-dimensional bias generation framework, studies gender and cultural biases across ten tasks at varying bias ratios, and identifies misalignment in values, groups, and data distributions as key drivers. Three mitigation strategies—token-based, mask-based, and loss-based—are proposed and evaluated, with results showing nuanced, task-dependent effectiveness and ongoing challenges. The findings underscore the importance of careful bias-aware data augmentation design and pave the way for more robust, fair downstream systems. The work provides practical insights for researchers and practitioners employing LLM-based augmentation in real-world settings.

Abstract

Generating synthetic datasets with large language models (LLMs) has emerged as a promising approach to improving LLM performance. However, LLMs inherently reflect biases present in their training data, which leads to a critical challenge: when these models generate synthetic data for training, they may propagate and amplify their inherent biases, which can significantly impact model fairness and robustness on downstream tasks, a phenomenon we term bias inheritance. This work presents the first systematic investigation into understanding, analyzing, and mitigating bias inheritance. We study this problem by fine-tuning LLMs on a combined dataset consisting of original and LLM-augmented data, where the bias ratio is the proportion of augmented data. Through systematic experiments across 10 classification and generation tasks, we analyze how 6 different types of bias manifest at varying bias ratios. Our results reveal that bias inheritance has nuanced effects on downstream tasks, affecting classification tasks and generation tasks differently. Our analysis then identifies three key misalignment factors: misalignment of values, group data, and data distributions. Based on these insights, we propose three mitigation strategies: token-based, mask-based, and loss-based approaches. Experiments demonstrate that these strategies also behave differently across tasks and bias types, indicating that fully mitigating bias inheritance remains a substantial challenge. We hope this work can provide valuable insights for research on LLM data augmentation.
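The abstract defines the bias ratio as the proportion of LLM-augmented data in the combined fine-tuning set. The following is a minimal sketch of how such a mix could be assembled; it is not the authors' code. The function name `mix_by_bias_ratio`, the fixed `total` size, and uniform sampling without replacement are all assumptions made for illustration.

```python
import random


def mix_by_bias_ratio(original, augmented, bias_ratio, total, seed=0):
    """Build a training set of `total` examples in which a `bias_ratio`
    fraction comes from `augmented` and the rest from `original`.

    Hypothetical helper: the sampling scheme is an assumption, not the
    procedure described in the paper.
    """
    if not 0.0 <= bias_ratio <= 1.0:
        raise ValueError("bias_ratio must be in [0, 1]")

    # Number of augmented examples implied by the requested ratio.
    n_aug = round(bias_ratio * total)
    n_orig = total - n_aug
    if n_aug > len(augmented) or n_orig > len(original):
        raise ValueError("not enough examples to reach the requested mix")

    rng = random.Random(seed)
    # Sample without replacement from each pool, then shuffle the union.
    mixed = rng.sample(original, n_orig) + rng.sample(augmented, n_aug)
    rng.shuffle(mixed)
    return mixed


# Toy data: each example is tagged with its source.
original = [("orig", i) for i in range(10)]
augmented = [("aug", i) for i in range(10)]
mixed = mix_by_bias_ratio(original, augmented, bias_ratio=0.25, total=8)
```

Holding `total` fixed while sweeping `bias_ratio` keeps the amount of training data constant, so differences between runs can be attributed to the share of augmented data rather than to dataset size. Whether the paper holds the total fixed is not stated in the abstract.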

Paper Structure

This paper contains 37 sections, 1 equation, 19 figures, 6 tables.

Figures (19)

  • Figure 1: The overview of our research pipeline. (a) Six key types of bias for data generation with the key properties underlined. (b) Two popular categories of bias that may affect downstream tasks. (c) Our framework to augment LLMs and mitigate bias at downstream.
  • Figure 2: Results on downstream tasks related to gender with different types of bias in augmentation data. Bias in augmented data improves the performance of majority groups, yet deteriorates the performance for minority groups, resulting in a wider gap.
  • Figure 3: Results for tasks indirectly and directly related to bias (x-axis: 0-Unbiased, 1-Contextual Single Explicit, 2-Contextual Intersectional Explicit, 3-Contextual Implicit, 4-Contrastive Single Explicit, 5-Contrastive Intersectional Explicit, and 6-Contrastive Implicit). Performance improves with a lower bias proportion on indirectly bias-related tasks, yet generally decreases on directly bias-related tasks.
  • Figure 4: The average hiring recommendation results. The increase for male candidates from minority racial groups is more pronounced than for female candidates.
  • Figure 5: The average story generation results and the multi-round hiring recommendation results. Bias inheritance is amplified over multiple rounds and eventually extends to majority groups.
  • ...and 14 more figures