Exploring Accuracy-Fairness Trade-off in Large Language Models

Qingquan Zhang, Qiqi Duan, Bo Yuan, Yuhui Shi, Jialin Liu

TL;DR

The investigation reveals that multi-objective evolutionary learning (MOEL) methodologies offer promising avenues for tackling the intricate challenge of harmonising accuracy and fairness in the enhancement of LLMs.

Abstract

Large Language Models (LLMs) have made significant strides in artificial intelligence, showcasing their ability to interact with humans and to influence human cognition through information dissemination. However, recent studies have revealed biases inherent in these LLMs, a critical issue that demands attention. In this work, we examine the challenge of harmonising accuracy and fairness when enhancing LLMs. While improving accuracy can enhance overall LLM performance, it often comes at the expense of fairness. Overemphasising the optimisation of one metric invariably leads to a significant degradation of the other. This underscores the need to take multiple objectives into account during the design and optimisation of LLMs. We therefore advocate reformulating the LLM training process as a multi-objective learning task. Our investigation reveals that multi-objective evolutionary learning (MOEL) methodologies offer a promising avenue for tackling this challenge. Our MOEL framework optimises accuracy and fairness metrics simultaneously, resulting in a Pareto-optimal set of LLMs. In summary, our study sheds light on the delicate balance between accuracy and fairness in LLMs, which is increasingly significant for their real-world applications. By harnessing MOEL, we present a promising pathway towards fairer and more effective AI technologies.
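
To make the notion of a Pareto-optimal set concrete, the following is a minimal sketch of non-dominated filtering over two objectives that are both minimised (error and unfairness). It is an illustration only, not the authors' MOEL framework: the candidate names, the objective values, and the helper functions `dominates` and `pareto_set` are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical candidate: (name, error, unfairness); both objectives are minimised.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_set(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up objective values, for illustration only.
candidates: List[Candidate] = [
    ("model_a", 0.10, 0.30),
    ("model_b", 0.15, 0.20),
    ("model_c", 0.20, 0.25),  # worse than model_b on both objectives
    ("model_d", 0.25, 0.10),
]

front = pareto_set(candidates)
print([name for name, _, _ in front])
```

Each surviving candidate represents a different compromise: none of them can be improved on one objective without giving up something on the other, which is the trade-off the abstract describes.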

Paper Structure

This paper contains 13 sections, 3 equations, 5 figures, and 1 algorithm.

Figures (5)

  • Figure 1: Overview of our framework.
  • Figure 2: Averaged hypervolume (HV) values of our method on the BiasBios task.
  • Figure 3: Non-dominated LLMs obtained by our framework, indicating the trade-off between error and fairness.
  • Figure 4: Comparison with six algorithms over 10 trials.
  • Figure 5: Pareto fronts obtained by our framework in three randomly selected trials.
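
The HV values mentioned in the Figure 2 caption refer to hypervolume, a standard quality indicator in multi-objective optimisation: it measures the region of objective space dominated by a set of solutions, bounded by a reference point. The sketch below computes it for two minimised objectives. The function name `hypervolume_2d`, the reference point, and the sample points are assumptions made for illustration; this excerpt does not state the reference point or normalisation the paper uses.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (error, unfairness), both minimised


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref`, for two minimised objectives."""
    # Ignore points that are not strictly better than the reference point.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    hv = 0.0
    best_f2 = ref[1]  # lowest second-objective value seen so far
    for f1, f2 in pts:  # sweep in order of increasing first objective
        if f2 < best_f2:
            # Add the horizontal strip that this point newly covers.
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv


# Made-up front and reference point, for illustration only.
front = [(0.10, 0.30), (0.15, 0.20), (0.25, 0.10)]
hv = hypervolume_2d(front, ref=(1.0, 1.0))
print(round(hv, 4))
```

With minimised objectives, a larger hypervolume indicates a front that lies closer to the ideal point and covers the trade-off more widely. Dominated points add nothing to the total, so the same function can be applied to an unfiltered population.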