Learning Against Nature: Minimax Regret and the Price of Robustness

Yeon-Koo Che, Longjian Li, Tianling Luo

TL;DR

The paper tackles learning under ambiguity about the data-generating process for a binary state, proposing a minimax-regret updating rule that yields ex-ante robust beliefs. It shows that Nature degrades signal precision at the $1/\sqrt{n}$ rate, making learning nontrivial but incomplete and causing the robust DM to under-infer in large samples. In the special MSE case with binary signals, the authors derive finite-sample equilibria and a limit Gaussian-shift game with a two-point mix, and they prove convergence of finite games to the limit game. Extending to general Bregman divergences, they demonstrate that the $1/\sqrt{n}$ ambiguity rate is fundamental and characterize the price of robustness when a fixed informative DGP is true. Overall, the work provides a decision-theoretic dual to local alternatives in statistics and highlights the nontrivial cost of guarding against worst-case data quality in learning systems.

Abstract

We study how a decision-maker (DM) learns from data of unknown quality to form robust, "general-purpose" posterior beliefs. We develop a framework for robust learning and belief formation under a minimax-regret criterion, cast as a zero-sum game: the DM chooses posterior beliefs to minimize ex-ante regret, while an adversarial Nature selects the data-generating process (DGP). We show that, in large samples of $n$ signal draws, Nature optimally induces ambiguity by choosing a process whose precision converges to that of uninformative signals at the rate $1/\sqrt{n}$. As a result, learning against the adversarial DGP is nontrivial as well as incomplete: the DM's ex-ante regret remains strictly positive even with an infinite amount of data. However, when the true DGP is fixed and informative (even if only slightly), our DM with a robust updating rule eventually learns the state with enough data. Still, learning occurs at a sub-exponential rate (quantifying the asymptotic price of robustness), and it exhibits "under-inference" bias. Our framework provides a decision-theoretic dual to the local alternatives method in asymptotic statistics, deriving the characteristic $1/\sqrt{n}$-scaling endogenously from the signal ambiguity.
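
A heuristic sketch of why $1/\sqrt{n}$ is the knife-edge rate may help here. It is the standard local-alternatives calculation, not the paper's proof, and it rests on assumptions not stated in the abstract: symmetric binary signals that match the state $\theta\in\{0,1\}$ with probability $\pi_n$, a hypothetical constant $c>0$, and $S_n$ denoting the number of signals equal to $1$ among $n$ draws. Setting $\pi_n = \tfrac{1}{2} + \tfrac{c}{\sqrt{n}}$ and standardizing the count gives

$$
Z_n = \frac{S_n - n/2}{\sqrt{n}/2}, \qquad
\mathbb{E}[Z_n \mid \theta = 1] = \frac{n\,(c/\sqrt{n})}{\sqrt{n}/2} = 2c, \qquad
\operatorname{Var}(Z_n \mid \theta = 1) = 4\,\pi_n(1-\pi_n) \to 1 .
$$

By the central limit theorem, $Z_n$ is then approximately $N(2c,1)$ when $\theta=1$ and, by symmetry, approximately $N(-2c,1)$ when $\theta=0$. The two states stay a fixed number of standard deviations apart as $n$ grows, so the data remain informative but never fully revealing. If the precision approached $\tfrac{1}{2}$ faster than $1/\sqrt{n}$, the shift would vanish; if slower, the shift would diverge and the state would be learned.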

Paper Structure (65 sections, 23 theorems, 191 equations, 5 figures)

Key Result

Proposition 1

When $n=1$, in the equilibrium Nature chooses $\pi$ that randomizes between $\frac{1}{2}$ and $1$ with equal probability. The DM chooses $(a_0,a_1)=(\frac{1}{4},\frac{3}{4})$.
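
The values in Proposition 1 can be sanity-checked numerically. The sketch below is not the authors' code, and it assumes a specific setup that this summary does not spell out: a uniform prior over the binary state, a symmetric binary signal that matches the state with probability `pi` in $[\frac{1}{2}, 1]$, and squared-error loss, with regret measured against a Bayesian who knows `pi`. The function names and grid resolutions are illustrative choices.

```python
def regret(a0, a1, pi):
    """Ex-ante regret of reporting a0 after signal 0 and a1 after signal 1.

    Assumed setup (not confirmed by the source): uniform prior on the state,
    signal equals the state with probability pi, squared-error loss.
    """
    # Expected squared loss of the DM, averaging over both states.
    loss_state1 = pi * (1 - a1) ** 2 + (1 - pi) * (1 - a0) ** 2
    loss_state0 = pi * a0 ** 2 + (1 - pi) * a1 ** 2
    dm_loss = 0.5 * loss_state1 + 0.5 * loss_state0
    # A Bayesian who knows pi reports the true posterior and loses pi * (1 - pi).
    bayes_loss = pi * (1 - pi)
    return dm_loss - bayes_loss


# Nature's feasible precisions: a grid on [1/2, 1] that includes both endpoints.
PI_GRID = [0.5 + 0.5 * k / 100 for k in range(101)]


def worst_case_regret(a0, a1):
    """Largest regret Nature can induce against the rule (a0, a1)."""
    return max(regret(a0, a1, pi) for pi in PI_GRID)


# Coarse grid search over the DM's rules (steps of 0.05 on [0, 1]).
ACTIONS = [i / 20 for i in range(21)]
best_a0, best_a1 = min(
    ((a0, a1) for a0 in ACTIONS for a1 in ACTIONS),
    key=lambda rule: worst_case_regret(*rule),
)

print("minimax rule on the grid:", (best_a0, best_a1))
print("worst-case regret:", worst_case_regret(best_a0, best_a1))
```

Under these assumptions the regret of a symmetric rule with $a_1 = 1 - a_0 = a$ simplifies to $(a-\pi)^2$, so the DM's best choice is the midpoint $a = \frac{3}{4}$ of Nature's interval, and Nature is indifferent between the endpoints $\frac{1}{2}$ and $1$. That is consistent with the equal-probability randomization stated in the proposition.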

Figures (5)

  • Figure 1: Illustration for $n=1$ case
  • Figure 2: Simulation for $3\le n\le 5$
  • Figure 3: Trend of $\pi_n^*, R(\bm a^*_n, \pi_n^*), w, (a^*_n(k))_{0\le k\le n}$ for different $n$

Theorems & Definitions (54)

  • Proposition 1
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • Theorem 3: Convergence of Equilibria
  • proof
  • Corollary 1: Failure of Law of Large Numbers against adversarial nature
  • proof
  • Theorem 4
  • ...and 44 more