Estimating Fair Graphs from Graph-Stationary Data

Madeline Navarro, Andrei Buciulea, Samuel Rey, Antonio G. Marques, Santiago Segarra

TL;DR

This work tackles the problem of inferring graphs from graph-stationary nodal data while enforcing unbiased edge connections with respect to sensitive attributes. It introduces two fairness notions for dyadic connectivity, group fairness and individual fairness, and defines spectral-domain bias metrics $R_G(S)$ and $R_N(S)$ to quantify topological bias. The authors propose Fair Spectral Templates (FairSpecTemp) with two variants: a convex-relaxation approach that enforces commutativity with the sample covariance together with a bias constraint, and a shared-eigenbasis variant that implicitly promotes fairness via spectral alignment. Both come with high-probability performance guarantees that reveal a conditional fairness-accuracy tradeoff. Empirical results on synthetic and real datasets, including financial investing scenarios, demonstrate that imposing fairness can reduce bias without sacrificing accuracy when the target graph is fair, and that the two FairSpecTemp variants offer complementary strengths depending on sample size and bias level.

Abstract

We estimate fair graphs from graph-stationary nodal observations such that connections are not biased with respect to sensitive attributes. Edges in real-world graphs often exhibit preferences for connecting certain pairs of groups. Biased connections can not only exacerbate but even induce unfair treatment for downstream graph-based tasks. We therefore consider group and individual fairness for graphs corresponding to group- and node-level definitions, respectively. To evaluate the fairness of a given graph, we provide multiple bias metrics, including novel measurements in the spectral domain. Furthermore, we propose Fair Spectral Templates (FairSpecTemp), an optimization-based method with two variants for estimating fair graphs from stationary graph signals, a general model for graph data subsuming many existing ones. One variant of FairSpecTemp exploits commutativity properties of graph stationarity while directly constraining bias, whereas the other implicitly encourages fair estimates by restricting bias in the graph spectrum and is thus more flexible. Our methods enjoy high-probability performance bounds, yielding a conditional tradeoff between fairness and accuracy. In particular, our analysis reveals that accuracy need not be sacrificed to recover fair graphs. We evaluate FairSpecTemp on synthetic and real-world data sets to illustrate its effectiveness and highlight the advantages of both variants of FairSpecTemp.
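The commutativity property mentioned in the abstract can be illustrated numerically. The sketch below is not the authors' code: it assumes a symmetric adjacency matrix as the graph-shift operator `S`, a hypothetical three-tap polynomial filter `H`, and white Gaussian inputs. Under those assumptions the population covariance `C = H Hᵀ` is a polynomial in `S` and commutes with it exactly, while the sample covariance `C_hat` commutes only approximately.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 20, 5000  # number of nodes and samples (illustrative values)

# Random symmetric adjacency matrix without self-loops, used as the shift S.
A = np.triu((rng.random((N, N)) < 0.2).astype(float), k=1)
S = A + A.T

# Hypothetical polynomial graph filter H = h0*I + h1*S + h2*S^2.
h = np.array([1.0, 0.5, 0.25])
H = sum(c * np.linalg.matrix_power(S, k) for k, c in enumerate(h))

# Stationary signals x = H w with white w, so the covariance is C = H H^T.
C = H @ H.T
X = H @ rng.standard_normal((N, M))
C_hat = X @ X.T / M  # sample covariance (zero-mean signals)


def commutator_norm(C, S):
    """Frobenius norm of C S - S C; zero when C and S commute."""
    return np.linalg.norm(C @ S - S @ C, ord="fro")


pop_gap = commutator_norm(C, S)         # numerically zero
sample_gap = commutator_norm(C_hat, S)  # small but nonzero: finite-sample error

print(f"population commutator norm: {pop_gap:.2e}")
print(f"sample commutator norm:     {sample_gap:.2e}")
```

The nonzero sample gap is why a method that works from observed data would relax exact commutativity to an approximate constraint with some tolerance. How FairSpecTemp sets that tolerance is not stated in this summary.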

Paper Structure

This paper contains 37 sections, 64 equations, 4 figures.

Figures (4)

  • Figure 1: Target ${\mathbf S}^*$ as ${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ vary. Red denotes node pairs in group 1, blue node pairs in group 2, and gray node pairs in different groups. Both ${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ vary in (a)-(c), while ${ R_{ \rm G} }({\mathbf S}^*)$ is low but ${ R_{ \rm N} }({\mathbf S}^*)$ varies in (d)-(f). (a) Majority of edges connect the same group (${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ high). (b) Balanced number of edges within and across groups (${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ low). (c) Majority of edges connect different groups (${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ high). (d) Subgroups of nodes show strong within- or across-group preferences (${ R_{ \rm G} }({\mathbf S}^*)$ low, ${ R_{ \rm N} }({\mathbf S}^*)$ high). (e) Every node has relatively balanced edges across groups (${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ low). (f) Alternate subgroups of nodes show strong within- or across-group preferences (${ R_{ \rm G} }({\mathbf S}^*)$ low, ${ R_{ \rm N} }({\mathbf S}^*)$ high).
  • Figure 2: Top row: Performance of estimates ${\hat{\mathbf S} }$ as ${ R_{ \rm G} }({\mathbf S}^*)$ and ${ R_{ \rm N} }({\mathbf S}^*)$ vary corresponding to Figure 1(a)-(c). Bottom row: Performance of estimates ${\hat{\mathbf S} }$ for low ${ R_{ \rm G} }({\mathbf S}^*)$ as ${ R_{ \rm N} }({\mathbf S}^*)$ varies corresponding to Figure 1(d)-(f). Left column: (a) and (d) depict estimation error $d({\hat{\mathbf S} },{\mathbf S}^*)$. Middle column: (b) and (e) show group-wise bias $b_{\rm G}({\hat{\mathbf S} })$. Right column: (c) and (f) show node-wise bias $b_{\rm N}({\hat{\mathbf S} })$.
  • Figure 3: Performance of estimates ${\hat{\mathbf S} }$ for different graph estimation methods under varying conditions. Bias $b_{\rm G}({\hat{\mathbf S} })$ and error $d({\hat{\mathbf S} },{\mathbf S}^*)$ (a) as the number of nodes $N$ increases, (b) as the data becomes more biased toward within-group connections, (c) as the number of samples $M$ increases.
  • Figure 4: Investment value and group-wise bias $b_{\rm G}({\hat{\mathbf S} })$ of estimated graphs ${\hat{\mathbf S} }$ over time: (a) three months estimating graphs every other day and (b) two years and five months with weekly graph estimation.
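
The qualitative behavior described in the Figure 1 caption, panels (a)-(c), can be reproduced with a toy group-level bias proxy. The function `group_gap` below is a hypothetical stand-in and not the paper's $R_{\rm G}$ or $b_{\rm G}$, whose definitions are not given in this summary. It assumes two groups encoded in a label vector `z` and measures the squared gap between the mean within-group and mean across-group edge weight.

```python
import numpy as np


def group_gap(S, z):
    """Hypothetical bias proxy: squared gap between the mean within-group
    and mean across-group edge weight (not the paper's metric)."""
    z = np.asarray(z)
    same = z[:, None] == z[None, :]          # True where both nodes share a group
    off_diag = ~np.eye(len(z), dtype=bool)   # exclude self-loops
    within = S[same & off_diag].mean()
    across = S[~same].mean()
    return (within - across) ** 2


z = np.array([0, 0, 0, 1, 1, 1])             # two groups of three nodes
same = z[:, None] == z[None, :]
off_diag = ~np.eye(len(z), dtype=bool)

S_within = (same & off_diag).astype(float)   # edges only inside groups, as in panel (a)
S_balanced = off_diag.astype(float)          # complete graph, as in panel (b)
S_across = (~same).astype(float)             # edges only between groups, as in panel (c)

for name, S in [("within", S_within), ("balanced", S_balanced), ("across", S_across)]:
    print(name, group_gap(S, z))
```

Under this proxy, both the within-group-dominated and the across-group-dominated graphs score high while the balanced graph scores zero, matching the caption's statement that bias is high in panels (a) and (c) and low in (b).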