Optimal Classification under Performative Distribution Shift

Edwige Cyffers, Muni Sreenivas Pydi, Jamal Atif, Olivier Cappé

TL;DR

Performative effects are modelled as push-forward measures, a view that encompasses existing models and enables novel performative gradient estimation methods, leading to more efficient and scalable learning strategies.

Abstract

Performative learning addresses the increasingly pervasive situations in which algorithmic decisions may induce changes in the data distribution as a consequence of their public deployment. We propose a novel view in which these performative effects are modelled as push-forward measures. This general framework encompasses existing models and enables novel performative gradient estimation methods, leading to more efficient and scalable learning strategies. For distribution shifts, unlike previous models which require full specification of the data distribution, we only assume knowledge of the shift operator that represents the performative changes. This approach can also be integrated into various change-of-variable-based models, such as VAEs or normalizing flows. Focusing on classification with a linear-in-parameters performative effect, we prove the convexity of the performative risk under a new set of assumptions. Notably, we do not limit the strength of performative effects but rather their direction, requiring only that classification becomes harder when deploying more accurate models. In this case, we also establish a connection with adversarially robust classification by reformulating the minimization of the performative risk as a min-max variational problem. Finally, we illustrate our approach on synthetic and real datasets.

Paper Structure

This paper contains 22 sections, 4 theorems, 33 equations, 3 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

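Under Assumption assum:pf, the gradient of the performative risk is given by

$$\nabla_\theta \operatorname{PR}(\theta) = \mathbb{E}_{u}\left[\nabla_{\theta} \ell(\varphi(u;\theta);\theta) + \operatorname{J}_\theta^T \varphi(u; \theta)\, \nabla_z\ell(\varphi(u;\theta);\theta)\right],$$

where $\nabla_z\ell(z;\theta)$ and $\nabla_{\theta} \ell(z;\theta)$ denote respectively the gradient of the loss with respect to its first and its second argument, and $\operatorname{J}_\theta^T \varphi(u; \theta)$ is the transpose of the Jacobian of $\varphi$ with respect to $\theta$.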
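As a sanity check on the structure of this gradient, here is a minimal scalar sketch in plain Python. It is not the authors' implementation: the shift operator `phi(u, theta, pi) = u + pi * theta`, the quadratic loss, and all function names are assumptions chosen for illustration. The estimator averages the two terms named in the theorem (the direct gradient in $\theta$, plus the Jacobian of the shift applied to the gradient in $z$) over samples of $u$, and is compared against a central finite difference of the empirical performative risk computed on the same samples.

```python
import random

# Hypothetical scalar setup (an assumption, not taken from the paper):
# shift operator z = phi(u; theta) = u + pi * theta,
# loss l(z; theta) = 0.5 * (z - theta)^2.

def phi(u, theta, pi):
    """Assumed linear-in-parameters shift operator."""
    return u + pi * theta

def loss(z, theta):
    return 0.5 * (z - theta) ** 2

def grad_z_loss(z, theta):
    """Derivative of the loss with respect to its first argument z."""
    return z - theta

def grad_theta_loss(z, theta):
    """Derivative of the loss with respect to its second argument theta."""
    return -(z - theta)

def jac_theta_phi(u, theta, pi):
    """Derivative of phi with respect to theta (a scalar Jacobian here)."""
    return pi

def performative_risk(samples, theta, pi):
    """Empirical risk on the shifted samples z = phi(u; theta)."""
    return sum(loss(phi(u, theta, pi), theta) for u in samples) / len(samples)

def performative_gradient(samples, theta, pi):
    """Sample average of grad_theta l + J_theta(phi) * grad_z l."""
    total = 0.0
    for u in samples:
        z = phi(u, theta, pi)
        total += grad_theta_loss(z, theta)
        total += jac_theta_phi(u, theta, pi) * grad_z_loss(z, theta)
    return total / len(samples)

def finite_difference(samples, theta, pi, eps=1e-5):
    """Central difference of the empirical risk, reusing the same samples."""
    up = performative_risk(samples, theta + eps, pi)
    down = performative_risk(samples, theta - eps, pi)
    return (up - down) / (2 * eps)

rng = random.Random(0)
samples = [rng.gauss(1.0, 1.0) for _ in range(2000)]  # draws of the base variable u
theta, pi = 0.3, 0.5

g_est = performative_gradient(samples, theta, pi)
g_fd = finite_difference(samples, theta, pi)
print(abs(g_est - g_fd))  # expected to be tiny, since both use the same samples
```

Dropping the `jac_theta_phi` term would leave only the direct gradient in $\theta$, which ignores how deployment moves the data; keeping it is what makes the estimate track the full performative risk in this toy setting.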

Figures (3)

  • Figure 1: Risk profile for classifying two Gaussians centered at $\mu_0 = (0,0)$ and $\mu_1=(-1, 1)$ with quadratic loss, for various values of the diagonal coefficients $\lambda$ of $\Pi$. The performative risk remains convex as long as $\Pi$ is positive semidefinite, i.e. $\lambda\geq 0$, and becomes non-convex whenever some of the $\lambda_i$ are negative.
  • Figure 2: (a) Logistic regression to classify two Gaussian distributions centered at $(0,0)$ and $(-1, -1)$. We report the accuracy for three different magnitudes $\gamma$ of the performative effect, from no performative effect ($\gamma = 0$) to a strong one ($\gamma=1$). (b) Position of the parameter $\theta$ in its 2D space, starting from $(0,0)$ and following different paths depending on the algorithm. (c) Accuracy of classification with quadratic loss on two Gaussian distributions of dimension $7$, for various levels of variance $\sigma$ of the distributions. (d) Same experiment, but using the learnt $\Pi$ for RPPerfGD. (e) Distance between the true matrix $\Pi$ and its estimate in this setting. Note that RGD and RRGD do not use the estimate of $\Pi$. (f) Logistic regression on the Housing dataset with various magnitudes $\lambda$ of performative shift on coordinates $0$, $4$ and $6$. Accuracy is averaged over $20$ runs.
  • Figure 3: (a) Learning a logistic regression between two Gaussian distributions centered at $(0,0)$ and $(-1, 1)$, for different magnitudes $\gamma$ of the performative effect. (b) Accuracy of classification with quadratic loss on two Gaussians of dimension $7$, for various levels of noise $\sigma$.

Theorems & Definitions (14)

  • Theorem 1: Performative Risk Gradient
  • proof
  • Definition 1: Reparameterization-based Performative Gradient Estimator
  • Example 1: Shift Operator
  • Example 2: Performative Gaussian Mean Estimation
  • Remark 1: Localization of the Performative Shift
  • Example 3: Pricing Model
  • Theorem 2: Convexity of Classification Performative Risk
  • Remark 2: Generalization to Performative Effect Affecting Both Classes
  • Theorem 3: Variational Formulation of the Performative Risk
  • ...and 4 more