Going from a Representative Agent to Counterfactuals in Combinatorial Choice

Yanqiu Ruan, Karthyek Murthy, Karthik Natarajan

TL;DR

This work addresses counterfactual prediction for decision data arising from combinatorial choices on binary polytopes by introducing the Separable Representative Agent Model (S-RAM), a nonparametric framework with separable convex perturbations. The key theoretical contribution is an exact characterization of S-RAM representability over $0$-$1$ polytopes as a polynomial-sized linear program, which gives a practical consistency check via a lifted constraint set. Building on this, the authors develop a robust prediction pipeline: when the data are S-RAM-consistent, it computes worst- and best-case predictions for unseen polytopes; when they are not, it computes a best-fit S-RAM estimate by solving a compact mixed-integer convex program. Synthetic experiments on longest-path, shortest-path, and assignment problems show strong predictive accuracy, robustness to misspecification, and efficient solvability, indicating that S-RAM can support counterfactual analysis across a broad range of combinatorial environments.

Abstract

We study decision-making problems where data comprises points from a collection of binary polytopes, capturing aggregate information stemming from various combinatorial selection environments. We propose a nonparametric approach for counterfactual inference in this setting based on a representative agent model, where the available data is viewed as arising from maximizing separable concave utility functions over the respective binary polytopes. Our first contribution is to precisely characterize the selection probabilities representable under this model and show that verifying the consistency of any given aggregated selection dataset reduces to solving a polynomial-sized linear program. Building on this characterization, we develop a nonparametric method for counterfactual prediction. When data is inconsistent with the model, finding a best-fitting approximation for prediction reduces to solving a compact mixed-integer convex program. Numerical experiments based on synthetic data demonstrate the method's flexibility, predictive accuracy, and strong representational power even under model misspecification.
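To make the representative-agent idea concrete, here is a minimal sketch of one textbook special case; it is not the authors' nonparametric method. It assumes the feasible set is the unit simplex (the convex hull of the unit vectors, a $0$-$1$ polytope) and assumes an entropy perturbation $c_j(x_j) = x_j \ln x_j$, for which the maximizer of the separable concave objective is known in closed form as the softmax (multinomial logit) choice probabilities. The function names and the utility vector `nu` are hypothetical.

```python
import math
import random

def objective(x, nu):
    """Separable concave objective sum_j (nu_j * x_j - x_j * ln x_j), with 0 * ln 0 = 0."""
    return sum(n * xj - (xj * math.log(xj) if xj > 0 else 0.0)
               for xj, n in zip(x, nu))

def entropy_ram_choice(nu):
    """Maximizer of the objective over the unit simplex: the softmax of nu."""
    m = max(nu)  # subtract the max for numerical stability
    w = [math.exp(n - m) for n in nu]
    s = sum(w)
    return [wj / s for wj in w]

def random_simplex_point(n, rng):
    """Sample a point of the unit simplex by normalizing exponential draws."""
    d = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(d)
    return [v / s for v in d]

# Hypothetical deterministic utilities for three alternatives.
nu = [1.0, 0.5, -0.2]
x_star = entropy_ram_choice(nu)

# Sanity check: no randomly sampled feasible point should beat x_star.
rng = random.Random(0)
best_random = max(objective(random_simplex_point(3, rng), nu)
                  for _ in range(2000))
```

In this special case the optimal objective value equals the log-sum-exp of `nu`, which gives a quick check on the closed form. The model described in the abstract is more general: the perturbation functions are not fixed in advance, and the feasible sets are arbitrary binary polytopes such as path or assignment polytopes.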

Paper Structure

This paper contains 17 sections, 6 theorems, 40 equations, 3 figures, 1 table.

Key Result

lemma 1

Suppose $\mathcal{X} \subseteq \{0,1\}^n$, the random variables $\{\tilde{\epsilon}_{j}\}_{j\in [n]}$ are absolutely continuous with strictly increasing marginal distributions $\{F_j(\cdot)\}_{j \in [n]}$ on their respective supports, and $\mathbb{E}|\tilde{\epsilon}_{j}| < \infty$.

Figures (3)

  • Figure 1: A new framework for counterfactual prediction with a summary of main contributions
  • Figure 2: Prediction accuracy of nonparametric S-RAM with underlying parametric S-RAM data. Blue dots are ground truths. Red ranges are prediction intervals for the predicted flow using nonparametric S-RAM in path problems. Red dots are optimistic estimates of total welfare using nonparametric S-RAM in assignment problems.
  • Figure 3: An illustration of the construction of the functions $c_j^\prime$ when: (a) $p_j^{K_j} <1$ and (b) $p_j^1 > 0$.

Theorems & Definitions (11)

  • lemma 1: natarajan2009persistency
  • theorem 1: S-RAM Characterization with $0$-$1$ Polytopes
  • proposition 1
  • proposition 2
  • proposition 3
  • proposition 4
  • proof
  • proof
  • proof
  • proof
  • ...and 1 more