BIASINSPECTOR: Detecting Bias in Structured Data through LLM Agents

Haoxuan Li, Mingyu Derek Ma, Jen-tse Huang, Zhaotian Weng, Wei Wang, Jieyu Zhao

TL;DR

This work tackles the challenge of detecting biases in structured data, where prior automated methods struggle to generalize across data types and bias types. It introduces BiasInspector, the first end-to-end, multi-agent framework that plans a detection task, executes it with a diverse toolbox of bias-detection methods, and reports the results with detailed visualizations and explanations. A new benchmark, BiasBenchmark, evaluates both end results and intermediate processes; on it, BiasInspector achieves high accuracy (up to around 78% in bias-degree tasks) and robust performance across planning and tool use, especially when powered by GPT-4o. The framework’s extensible toolset and method library, coupled with a standardized evaluation protocol, offer a practical path toward fairer data workflows and a reference point for future research on LLM-agent bias detection.

Abstract

Detecting biases in structured data is a complex and time-consuming task. Existing automated techniques cover a limited range of data types and rely heavily on case-by-case human handling, which limits their generalizability. Large language model (LLM)-based agents have recently made significant progress in data science, but their ability to detect data biases remains insufficiently explored. To address this gap, we introduce BIASINSPECTOR, the first end-to-end, multi-agent synergy framework designed for automatic bias detection in structured data based on specific user requirements. It first develops a multi-stage plan to analyze the user-specified bias detection task and then carries out that plan with a diverse and well-suited set of tools. It delivers detailed results that include explanations and visualizations. To address the lack of a standardized framework for evaluating the capability of LLM agents to detect biases in data, we further propose a comprehensive benchmark that includes multiple evaluation metrics and a large set of test cases. Extensive experiments demonstrate that our framework achieves exceptional overall performance in structured data bias detection, setting a new milestone for fairer data applications.

Paper Structure

This paper contains 49 sections, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Overview of the multi-agent architecture with a Primary and an Advisor Agent collaborating and invoking tools from the Toolset and Bias Detection Method Library.
  • Figure 2: Workflow overview: User Input, Data Preprocessing, Bias Detection and Analysis, Visualization and Summarization, and User Feedback. It is iterative rather than strictly sequential, allowing returns to previous stages based on user input or updated plans.
  • Figure 3: Intermediate process performance of four agent frameworks.
  • Figure 4: PrimaryAgentPrompt
  • Figure 5: AdvisorAgentPrompt
  • ...and 1 more figure