Probably Approximately Correct Constrained Learning

Luiz F. O. Chamon, Alejandro Ribeiro

TL;DR

This work extends PAC learning to constrained settings by introducing PAC constrained (PACC) learnability, showing that any PAC learnable class is also PACC learnable via a constrained ERM rule and that feasibility can be enforced without increasing statistical hardness. To address the practical challenge of non-convex constrained ERMs, the authors derive a representation-independent empirical dual formulation and a primal–dual algorithm that attains a near-PACC solution with quantifiable approximation error tied to the parametrization richness and constraint difficulty. They provide a rigorous generalization analysis and demonstrate the approach on fairness and robustness problems, illustrating how dual variables offer insight into constraint tightness and bias interactions. The methodology yields a principled, scalable framework for learning under requirements in high-stakes domains, with potential extensions to reinforcement learning and non-convex loss settings.

Abstract

As learning solutions reach critical applications in social, industrial, and medical domains, the need to curtail their behavior has become paramount. There is now ample evidence that without explicit tailoring, learning can lead to biased, unsafe, and prejudiced solutions. To tackle these problems, we develop a generalization theory of constrained learning based on the probably approximately correct (PAC) learning framework. In particular, we show that imposing requirements does not make a learning problem harder in the sense that any PAC learnable class is also PAC constrained learnable using a constrained counterpart of the empirical risk minimization (ERM) rule. For typical parametrized models, however, this learner involves solving a constrained non-convex optimization program for which even obtaining a feasible solution is challenging. To overcome this issue, we prove that under mild conditions the empirical dual problem of constrained learning is also a PAC constrained learner that now leads to a practical constrained learning algorithm based solely on solving unconstrained problems. We analyze the generalization properties of this solution and use it to illustrate how constrained learning can address problems in fair and robust classification.
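The dual approach described in the abstract can be illustrated with a small primal-dual loop. This is a minimal sketch under stated assumptions, not the paper's algorithm: it assumes a linear model with logistic loss, a single average-loss constraint on a subgroup, and one gradient step on the Lagrangian per iteration in place of a full unconstrained minimization. The function names (`logistic_loss`, `logistic_grad`, `primal_dual`), the step sizes, and the synthetic data are all hypothetical.

```python
import numpy as np


def logistic_loss(theta, X, y):
    """Mean logistic loss for labels y in {-1, +1}."""
    margins = y * (X @ theta)
    return np.mean(np.logaddexp(0.0, -margins))


def logistic_grad(theta, X, y):
    """Gradient of the mean logistic loss with respect to theta."""
    margins = y * (X @ theta)
    # 1 / (1 + exp(m)) written with tanh for numerical stability.
    sig_neg = 0.5 * (1.0 - np.tanh(0.5 * margins))
    return X.T @ (-y * sig_neg) / len(y)


def primal_dual(X0, y0, X1, y1, c, steps=2000, lr_theta=0.1, lr_lam=0.05):
    """Toy primal-dual loop (illustrative assumption, not the paper's method).

    Objective:  mean logistic loss on (X0, y0).
    Constraint: mean logistic loss on (X1, y1) <= c.
    """
    theta = np.zeros(X0.shape[1])
    lam = 0.0  # dual variable, kept nonnegative
    for _ in range(steps):
        # Primal step: gradient descent on the Lagrangian, which is an
        # unconstrained function of theta for a fixed dual variable.
        grad = logistic_grad(theta, X0, y0) + lam * logistic_grad(theta, X1, y1)
        theta = theta - lr_theta * grad
        # Dual step: projected gradient ascent on the constraint slack.
        slack = logistic_loss(theta, X1, y1) - c
        lam = max(0.0, lam + lr_lam * slack)
    return theta, lam


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    y = np.where(X[:, 0] + 0.5 * rng.normal(size=200) > 0, 1.0, -1.0)
    # Hypothetical subgroup: the first 40 samples.
    theta, lam = primal_dual(X, y, X[:40], y[:40], c=0.4)
    print("subgroup loss:", logistic_loss(theta, X[:40], y[:40]))
    print("dual variable:", lam)
```

In this sketch the dual variable grows while the constraint is violated and shrinks toward zero while there is slack, so its final value gives a rough indication of how hard the constraint was to satisfy.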

Paper Structure

This paper contains 27 sections, 8 theorems, 71 equations, 7 figures, 3 tables, 1 algorithm.

Key Result

Theorem 1

Let the $\ell_i$, $i = 0,\dots,m+q$, be bounded on $\mathcal{X}$. The hypothesis class $\mathcal{H}$ is PACC learnable if and only if it is PAC learnable and P:ecrm is a PACC learner of $\mathcal{H}$. Explicitly, let $d_\mathcal{H} < \infty$ be the VC dimension of $\mathcal{H}$. If $N_i \geq C \zeta \ldots$, then any solution $\hat{\phi}^\star$ of P:ecrm is a PACC solution of P:csl.
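For orientation, the pair of problems that a result of this kind relates can be sketched as follows. This is an assumed generic form, not a quotation from the paper: it keeps only the $m$ expectation constraints and omits the remaining $q$ losses, whose form is not given in this summary. The distributions $\mathcal{D}_i$, the constraint levels $c_i$, and the sample notation $(x_{n_i}, y_{n_i})$ are assumptions introduced for illustration.

```latex
% Statistical problem (assumed form, expectation constraints only)
\begin{align*}
  \min_{\phi \in \mathcal{H}} \quad
    & \mathbb{E}_{(x,y) \sim \mathcal{D}_0}\big[ \ell_0(\phi(x), y) \big] \\
  \text{subject to} \quad
    & \mathbb{E}_{(x,y) \sim \mathcal{D}_i}\big[ \ell_i(\phi(x), y) \big] \leq c_i,
    \quad i = 1, \dots, m
\end{align*}

% Empirical counterpart built from N_i samples of each D_i
\begin{align*}
  \min_{\phi \in \mathcal{H}} \quad
    & \frac{1}{N_0} \sum_{n_0 = 1}^{N_0} \ell_0\big(\phi(x_{n_0}), y_{n_0}\big) \\
  \text{subject to} \quad
    & \frac{1}{N_i} \sum_{n_i = 1}^{N_i} \ell_i\big(\phi(x_{n_i}), y_{n_i}\big) \leq c_i,
    \quad i = 1, \dots, m
\end{align*}
```

Read this way, the theorem says that with enough samples $N_i$ per distribution, a solution of the empirical problem is, with high probability, near-optimal and near-feasible for the statistical one.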

Figures (7)

  • Figure 1: Fair classification (Adult dataset): (a) classifier sensitivity and (b) prevalence of different groups among the $20\%$ training set examples with largest dual variables.
  • Figure 2: Robust constrained learning (FMNIST): (a) Accuracy of classifiers under the PGD attack for different perturbation magnitudes and (b) distribution of $\varepsilon$ used during training.
  • Figure 3: Classifier sensitivity on the Adult test set.
  • Figure 4: Dual variable analysis for Adult dataset: (a) distribution of the dual variables values and (b) prevalence of different groups among the $20\%$ training set examples with largest dual variables.
  • Figure 5: Dual variables of different counterfactual constraints for the COMPAS dataset.
  • ...and 2 more figures

Theorems & Definitions (19)

  • Definition 1: PAC learnability
  • Definition 2: PACC learnability
  • Theorem 1
  • proof
  • Definition 3: Near-PACC learnability
  • Theorem 2
  • proof
  • Remark 1
  • Theorem 3
  • proof
  • ...and 9 more