AI Fairness Beyond Complete Demographics: Current Achievements and Future Directions

Zichong Wang, Zhipeng Yin, Roland H. C. Yap, Wenbin Zhang

TL;DR

This survey analyzes AI fairness under incomplete demographics: it introduces a taxonomy of fairness notions and reviews six methodological families for achieving fairness without full demographic data. The paper formalizes Rawlsian, group, counterfactual, proxy, individual, and unawareness notions and explains how each can be operationalized via methods such as distributionally robust optimization (DRO), adversarial learning, proxy demographics, third-party auditing, and graph-specific techniques. It compiles benchmark datasets for both IID and non-IID settings and discusses open challenges, including dataset realism, causal explanations of unfairness, and the trade-off between utility and fairness, and it suggests leveraging LLMs for demographic inference. The work offers a structured blueprint for fairness research in settings where demographic information is incomplete or legally restricted, with practical implications for policy-compliant AI systems.

Abstract

Fairness in artificial intelligence (AI) has become a growing concern due to discriminatory outcomes in AI-based decision-making systems. While various methods have been proposed to mitigate bias, most rely on complete demographic information, an assumption often impractical due to legal constraints and the risk of reinforcing discrimination. This survey examines fairness in AI when demographics are incomplete, addressing the gap between traditional approaches and real-world challenges. We introduce a novel taxonomy of fairness notions in this setting, clarifying their relationships and distinctions. Additionally, we summarize existing techniques that promote fairness beyond complete demographics and highlight open research questions to encourage further progress in the field.

Paper Structure

This paper contains 18 sections, 4 equations, 1 figure, and 3 tables.

Figures (1)

  • Figure 1: A taxonomy of the commonly used techniques to improve fairness with incomplete demographic information.