AI Fairness Beyond Complete Demographics: Current Achievements and Future Directions
Zichong Wang, Zhipeng Yin, Roland H. C. Yap, Wenbin Zhang
TL;DR
This survey analyzes fairness in AI under incomplete demographics, introducing a taxonomy of fairness notions and reviewing six methodological families for achieving fairness without full demographic data. It formalizes Rawlsian, group, counterfactual, proxy, individual, and unawareness notions and explains how each can be operationalized through methods such as distributionally robust optimization (DRO), adversarial learning, proxy demographics, third-party auditing, and graph-specific techniques. It also compiles benchmark datasets for both IID and non-IID settings, discusses open challenges including dataset realism, causal explanations of unfairness, and utility-fairness trade-offs, and suggests leveraging LLMs for demographic inference. The result is a structured blueprint for fairness research in settings where demographic information is incomplete or legally restricted, with practical implications for policy-compliant AI systems.
Abstract
Fairness in artificial intelligence (AI) has become a growing concern due to discriminatory outcomes in AI-based decision-making systems. While various methods have been proposed to mitigate bias, most rely on complete demographic information, an assumption often impractical due to legal constraints and the risk of reinforcing discrimination. This survey examines fairness in AI when demographics are incomplete, addressing the gap between traditional approaches and real-world challenges. We introduce a novel taxonomy of fairness notions in this setting, clarifying their relationships and distinctions. Additionally, we summarize existing techniques that promote fairness beyond complete demographics and highlight open research questions to encourage further progress in the field.
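To make the link between Rawlsian fairness and DRO mentioned in the TL;DR more concrete, the following is a minimal illustrative sketch, not the survey's own formulation or any specific surveyed method. It assumes per-example losses are already available; the function names `worst_group_loss` and `cvar_loss` and the parameter `alpha` are hypothetical. The first function computes a Rawlsian-style worst-group loss, which needs group labels. The second computes a CVaR-style surrogate (the mean of the worst `alpha` fraction of losses), which needs no labels and upper-bounds the mean loss of any group covering at least an `alpha` fraction of the data.

```python
import math


def worst_group_loss(losses, groups):
    """Rawlsian-style objective: the largest mean loss over observed groups.

    Requires a group label for every example, which is exactly what is
    missing in the incomplete-demographics setting.
    """
    totals = {}
    for loss, group in zip(losses, groups):
        running_sum, count = totals.get(group, (0.0, 0))
        totals[group] = (running_sum + loss, count + 1)
    return max(s / c for s, c in totals.values())


def cvar_loss(losses, alpha):
    """Label-free surrogate: mean of the worst ceil(alpha * n) losses.

    `alpha` is an assumed lower bound on the size of the smallest group,
    expressed as a fraction of the data.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    k = math.ceil(alpha * len(losses))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


# Toy data: group "b" holds one third of the examples and has high loss.
losses = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1]
groups = ["a", "a", "b", "b", "a", "a"]

with_labels = worst_group_loss(losses, groups)
without_labels = cvar_loss(losses, alpha=0.25)
```

In this toy example the label-free value is at least as large as the worst-group loss because group "b" makes up more than a quarter of the data. Minimizing the surrogate therefore controls the worst-group loss without observing group membership, at the cost of a possibly loose bound.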
