Auditing for Bias in Ad Delivery Using Inferred Demographic Attributes

Basileal Imana, Aleksandra Korolova, John Heidemann

TL;DR

This paper addresses the challenge of auditing bias in black-box ad delivery when demographic attributes are unavailable and must be inferred. It shows that demographic-inference errors can understate true algorithmic skew in paired-ad audits and presents an inference-aware skew correction that propagates inference error through to the delivery audience using a model with base delivery rate $R$ and skew $S$. The authors derive closed-form solutions for $R$ and $S$ from aggregate data and ground-truth-like error rates, and they validate the correction via simulations and real-data-derived error rates (e.g., BISG thresholds on North Carolina voter data). The results demonstrate that ignoring inference error can hide existing bias, while the proposed correction increases sensitivity and reliability of skew detection, enabling broader, more accurate external audits of ad-delivery bias in consequential domains. This work enhances auditing tools for platforms and regulators, suggesting careful consideration of inference error when evaluating demographic disparities in ad delivery across protected attributes.

Abstract

Auditing social-media algorithms has become a focus of public-interest research and policymaking to ensure their fairness across demographic groups such as race, age, and gender in consequential domains such as the presentation of employment opportunities. However, such demographic attributes are often unavailable to auditors and platforms. When demographic data is unavailable, auditors commonly infer it from other available information. In this work, we study the effects of inference error on auditing for bias in one prominent application: black-box audit of ad delivery using paired ads. We show that inference error, if not accounted for, causes audits to miss skew that exists. We then propose a way to mitigate the inference error when evaluating skew in ad delivery algorithms. Our method works by adjusting for expected error due to demographic inference, and it makes skew detection more sensitive when attributes must be inferred. Because inference is increasingly used for auditing, our results provide an important addition to the auditing toolbox to promote correct audits of ad delivery algorithms for bias. While the impact of attribute inference on accuracy has been studied in other domains, our work is the first to consider it for black-box evaluation of ad delivery bias, when only aggregate data is available to the auditor.
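
To make the claim that inference error can hide skew concrete, the following is a minimal sketch under a toy mixture assumption; it is not the paper's model or its closed-form correction. It assumes that correctly inferred members of an audience receive the ad at the base delivery rate times the skew, while misclassified members receive it at the base rate. The function names and the mixture form are hypothetical; only the symbols for base rate and skew and the example values $R=0.065$ and an error rate near $0.47$ echo the text.

```python
def observed_rate(base_rate, skew, fdr):
    """Delivery rate seen in an audience built from inferred attributes.

    Toy assumption (not from the paper): a (1 - fdr) share of the audience
    is correctly inferred and receives the ad at base_rate * skew; the
    remaining fdr share is misclassified and receives it at base_rate.
    """
    if not 0.0 <= fdr <= 1.0:
        raise ValueError("fdr must lie in [0, 1]")
    return (1.0 - fdr) * base_rate * skew + fdr * base_rate


def observed_skew(base_rate, skew, fdr):
    """Ratio of the observed delivery rate to the base rate in the toy model."""
    return observed_rate(base_rate, skew, fdr) / base_rate


if __name__ == "__main__":
    base_rate = 0.065  # same order of magnitude as R in Figure 3
    for fdr in (0.0, 0.144, 0.4727):
        measured = observed_skew(base_rate, skew=2.0, fdr=fdr)
        print(f"fdr={fdr:.4f}  true skew=2.0  measured skew={measured:.3f}")
```

In this toy model the measured skew moves toward 1 as the error rate grows, so a fixed detection threshold is more likely to miss skew that exists. When the true skew is 1 the measured value is unchanged, which agrees with the direction of Theorem 1.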

Paper Structure

This paper contains 32 sections, 3 theorems, 7 equations, 3 figures, 4 tables.

Key Result

Theorem 1

If an ad delivery algorithm is not skewed by a protected demographic attribute ($S = 1$), inference error does not affect the measurement of skew in ad delivery. Specifically, the skew an auditor measures is $0$ in both cases where the auditor targets using true attributes ($D_t = 0$) and inferred a

Figures (3)

  • Figure 1: Decoupling between attribute inference step and evaluation of skew in black-box auditing of ad delivery algorithms. Only aggregate size of demographic groups (no individual-level data) is available at the time of skew evaluation.
  • Figure 2: An illustration of how False Discovery Rates are calculated for an audience constructed with inferred race. All values shown are fractions of individuals.
  • Figure 3: The left column shows the effect of sample size on our inference-aware skew evaluation. Parameters: $R=0.065$, $\textit{Th}_{\text{BISG}} = 0.5$, $\textit{FDR}_{b,a}=0.4727$, $\textit{FDR}_{o,a}=0.030$, $\textit{FDR}_{a,b}=0.144$, $\textit{FDR}_{o,b}=0.032$. The right column shows the effect of BISG inference error rates on our inference-aware evaluation. Parameters: $R=0.065$, $|U|=30{,}000$. In both columns, as the sample size and BISG threshold increase (inference error decreases), the red shaded region, where inference error hides skew that exists, shrinks.
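
The False Discovery Rates referred to in Figures 2 and 3 can be illustrated with a short sketch. It assumes a labeled validation set in which both the true and the inferred attribute are known for each individual, and it computes, for each audience built from an inferred group, the fraction of members who truly belong to another group. The function name, the toy labels, and the `(true, inferred)` key ordering are assumptions for illustration; the ordering may not match the subscript convention used in the figure.

```python
from collections import Counter


def false_discovery_rates(true_labels, inferred_labels):
    """Return {(true_group, inferred_group): fraction} for mismatched pairs.

    Each value is the share of the audience inferred as `inferred_group`
    whose true attribute is `true_group`.
    """
    if len(true_labels) != len(inferred_labels):
        raise ValueError("label lists must have equal length")
    audience_sizes = Counter(inferred_labels)             # size of each inferred audience
    pair_counts = Counter(zip(true_labels, inferred_labels))
    return {
        (true_group, inferred_group): count / audience_sizes[inferred_group]
        for (true_group, inferred_group), count in pair_counts.items()
        if true_group != inferred_group
    }


if __name__ == "__main__":
    # Hypothetical toy data: groups "a", "b", and "o" (other).
    true_labels = ["a", "a", "b", "o", "b", "b", "a", "b"]
    inferred_labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
    for pair, rate in sorted(false_discovery_rates(true_labels, inferred_labels).items()):
        print(pair, rate)
```

Because the rates are fractions of each inferred audience, they can be estimated once on data with known attributes and then applied to aggregate audience sizes, which is the only information available at skew-evaluation time according to Figure 1.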

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3