Plug-in Performative Optimization

Licong Lin, Tijana Zrnic

TL;DR

This work studies a general protocol for making use of possibly misspecified models in performative prediction, called plug-in performative optimization, and shows this solution can be far superior to model-agnostic strategies, as long as the misspecification is not too extreme.

Abstract

When predictions are performative, the choice of which predictor to deploy influences the distribution of future observations. The overarching goal in learning under performativity is to find a predictor that has low performative risk, that is, good performance on its induced distribution. One family of solutions for optimizing the performative risk, including bandits and other derivative-free methods, is agnostic to any structure in the performative feedback, leading to exceedingly slow convergence rates. A complementary family of solutions makes use of explicit models for the feedback, such as best-response models in strategic classification, enabling faster rates. However, these rates critically rely on the feedback model being correct. In this work we study a general protocol for making use of possibly misspecified models in performative prediction, called plug-in performative optimization. We show this solution can be far superior to model-agnostic strategies, as long as the misspecification is not too extreme. Our results support the hypothesis that models, even if misspecified, can indeed help with learning in performative settings.
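
To make the idea of using a feedback model concrete, the following is a minimal sketch of a plug-in style procedure on a toy problem, not the paper's algorithm or experimental setup. Everything specific here is an assumption made for illustration: the environment `deploy` (a location family where the observation mean shifts linearly with the deployed parameter), the squared loss, the uniform exploration over deployed parameters, and all function names. The sketch deploys exploratory parameters, fits the assumed model of the feedback by least squares, and then minimizes the risk implied by the fitted model.

```python
import random

def deploy(theta, rng, base_mean=1.0, shift=0.5, noise=1.0):
    """Hypothetical environment: one draw z ~ D(theta) from a location family."""
    return base_mean + shift * theta + rng.gauss(0.0, noise)

def fit_location_model(thetas, zs):
    """Least-squares fit of the assumed feedback model E[z | theta] = a + b * theta."""
    n = len(thetas)
    t_bar = sum(thetas) / n
    z_bar = sum(zs) / n
    cov = sum((t - t_bar) * (z - z_bar) for t, z in zip(thetas, zs))
    var = sum((t - t_bar) ** 2 for t in thetas)
    b = cov / var
    a = z_bar - b * t_bar
    return a, b

def plug_in_optimum(a, b):
    """Minimize the model-implied risk under squared loss (assumed here).

    Under the fitted model, E[(z - theta)^2] = ((1 - b) * theta - a)^2 + const,
    which is minimized at theta = a / (1 - b), provided b != 1.
    """
    return a / (1.0 - b)

def run(n=20000, seed=0):
    rng = random.Random(seed)
    # Step 1: deploy exploratory parameters and observe the induced data.
    thetas = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    zs = [deploy(t, rng) for t in thetas]
    # Step 2: fit the feedback model.
    a, b = fit_location_model(thetas, zs)
    # Step 3: optimize the risk implied by the fitted model.
    return plug_in_optimum(a, b)

theta_hat = run()
print(theta_hat)
```

In this toy environment the assumed model is well specified, so the estimate should land near the true minimizer `base_mean / (1 - shift)`. If `deploy` were replaced by feedback that is not linear in `theta`, the fitted model would be misspecified and the plug-in estimate would carry an additional bias, which is the regime the abstract is concerned with.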

Paper Structure

This paper contains 49 sections, 7 theorems, 79 equations, 4 figures, 1 algorithm.

Key Result

Theorem 1

The excess risk of the plug-in performative optimum is bounded by: for some universal constant $c>0$.

Figures (4)

  • Figure 1: (Location family) Excess risk of plug-in performative optimization, DFO, greedy SGD, and PerfGD with $\pm 1$ standard deviation on a logarithmic scale.
  • Figure 2: (Strategic regression) Excess risk and accuracy of plug-in performative optimization, DFO, and greedy SGD, with $\pm 1$ standard deviation on a logarithmic scale.
  • Figure 3: Excess risk (top) and accuracy (bottom) versus $n$ for plug-in performative optimization, the DFO algorithm, and greedy SGD, with a changed value of $\tilde{\beta}=1$. We display the $\pm 1$ standard deviation, logarithmically scaled. The takeaways are largely the same as in Figure 2.
  • Figure 4: Excess risk (top) and accuracy (bottom) versus $n$ for plug-in performative optimization, the DFO algorithm, and greedy SGD, on the credit data set. We display the $\pm 1$ standard deviation, logarithmically scaled.

Theorems & Definitions (22)

  • Theorem 1: Informal
  • Example 1: Biased coin flip
  • Theorem 2
  • Remark 1
  • Definition 1: Misspecification
  • Definition 2: Smoothness
  • Corollary 1
  • Example 2: Mixture model
  • Example 3: Self-fulfilling prophecy
  • Example 4: "Typically" well-specified model
  • ...and 12 more