Trustworthy Representation Learning via Information Funnels and Bottlenecks

João Machado de Freitas; Bernhard C. Geiger

TL;DR

Trustworthy representation learning is challenged by the need to balance utility, fairness, and privacy. The authors introduce CPFSI, a Conditional Privacy Funnel with Side-Information, and derive amortized variational bounds to optimize a multi-objective Lagrangian that jointly minimizes information about the sensitive attribute while preserving information about the input and utility for a downstream task. CPFSI extends prior information-theoretic objectives (IB, IBSI, CPF, CFB) and supports both fully supervised and semi-supervised learning on tabular data, with the ability to intervene on the sensitive attribute at inference for counterfactual fairness. Empirical results across Adult, Dutch, Credit, and COMPAS show CPFSI achieves favorable utility-invariance-fidelity trade-offs, often outperforming baselines, and demonstrates practical fairness gains with relatively small labeled datasets. The work provides a principled framework for robust, fair representations in data-scarce settings and suggests promising avenues for future extensions to domain adaptation and broader modalities.

Abstract

Ensuring trustworthiness in machine learning -- by balancing utility, fairness, and privacy -- remains a critical challenge, particularly in representation learning. In this work, we investigate a family of closely related information-theoretic objectives, including information funnels and bottlenecks, designed to extract invariant representations from data. We introduce the Conditional Privacy Funnel with Side-information (CPFSI), a novel formulation within this family, applicable in both fully and semi-supervised settings. Given the intractability of these objectives, we derive neural-network-based approximations via amortized variational inference. We systematically analyze the trade-offs between utility, invariance, and representation fidelity, offering new insights into the Pareto frontiers of these methods. Our results demonstrate that CPFSI effectively balances these competing objectives and frequently outperforms existing approaches. Furthermore, we show that intervening on sensitive attributes in CPFSI's predictive posterior enhances fairness while maintaining predictive performance. Finally, we focus on the real-world applicability of these approaches, particularly for learning robust and fair representations from tabular datasets in data-scarce environments -- a modality where these methods are often especially relevant.

Paper Structure (24 sections, 17 equations, 17 figures, 1 table, 3 algorithms)

Introduction
Related Work
Conditional Privacy-Funnel with Side-Information
Lagrangian Formulation
Variational Bounds
Conditional Predictive Posterior
Semi-supervised Learning
Funnels and Bottlenecks
Experimental Setup
Datasets
Models
Training and Evaluation
Metrics
Evaluating Representations
Fair Classification using the Predictive Posterior
...and 9 more sections

Figures (17)

Figure 1: The graphical model that captures the modeling and Markov relations of the CPFSI method and also of the IBSI [chechik2002extracting]. The solid lines represent the generative assumptions, while the dotted red arrow represents the encoding relation between the input $\mathbf x$ and the representation $\mathbf z$.
Figure 2: Representation's fairness on the Adult dataset. The dashed line marks the estimators --- Logistic Regression (top) and Random Forest (bottom) --- evaluated on the original features. The color scale shows the mean absolute error with which a Linear Regression and a Random Forest Regressor reconstruct a numerical feature of $x$ from $z$, across configurations obtained by sweeping $\alpha$ and $\beta$. Together, the plots illustrate the three-way evaluation of utility, invariance, and representation fidelity.
Figure 3: Representation's fairness on the Dutch dataset.
Figure 4: Representation's fairness vs. privacy on the Adult dataset.
Figure 5: Comparison between representation fairness and predictive posteriors, for different interventions on the sensitive attribute $\mathbf s$, on the Adult dataset. The gray-dashed reference line belongs to a Logistic Regression estimator on the original data. Results for other datasets are in Appendix \ref{sec:appendix:results}.
...and 12 more figures

Theorems & Definitions (5)

Definition 1: CPFSI
Definition 2: Discrimination
Definition 3: C1R
Definition 4: C2R
Definition 5: HV [zitzler1999evolutionary]
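
As a rough illustration of how two of the listed quantities might be computed, the sketch below is not taken from the paper and rests on two assumptions: that "Discrimination" means the absolute gap in positive-prediction rates between two groups of a binary sensitive attribute, and that "HV" is the hypervolume indicator for two objectives that are both maximized. The function names `discrimination` and `hypervolume_2d` are hypothetical, and the paper's own definitions may differ.

```python
import numpy as np


def discrimination(y_pred, s):
    """Absolute gap in positive-prediction rates between groups s == 0 and s == 1.

    Assumed definition (statistical parity gap); the paper's may differ.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    s = np.asarray(s)
    rate_0 = y_pred[s == 0].mean()
    rate_1 = y_pred[s == 1].mean()
    return abs(rate_0 - rate_1)


def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded below by `reference`.

    Assumes two objectives, both to be maximized.
    """
    pts = np.asarray(points, dtype=float)
    ref = np.asarray(reference, dtype=float)
    # Keep only points that strictly improve on the reference in both objectives.
    pts = pts[np.all(pts > ref, axis=1)]
    if len(pts) == 0:
        return 0.0
    # Sweep from the largest first objective downwards; each point adds the
    # horizontal strip above the best second objective seen so far.
    pts = pts[np.argsort(-pts[:, 0])]
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        if y > best_y:
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Toy data, not results from the paper.
gap = discrimination([1, 1, 0, 0, 1, 0], [0, 0, 0, 1, 1, 1])
front_area = hypervolume_2d([(3, 1), (2, 2), (1, 3), (1, 1)], reference=(0, 0))
```

Under these assumptions, a Pareto front could be scored by passing pairs such as (accuracy, 1 - discrimination) to `hypervolume_2d`; a larger area would indicate a better utility-fairness trade-off relative to the chosen reference point.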
