Collaborative Prediction: Tractable Information Aggregation via Agreement

Natalie Collina, Ira Globus-Harris, Surbhi Goel, Varun Gupta, Aaron Roth, Mirah Shi

TL;DR

This work develops tractable, communication-efficient protocols for collaborative prediction where two parties hold different, possibly illegible features of the same instances. By exchanging only label predictions across a bounded number of rounds, and under a weak learning condition relating the two parties’ hypothesis classes to a joint class on the pooled features, the authors prove information aggregation guarantees and sublinear regret with respect to the joint benchmark. The framework encompasses online adversarial settings and simpler batch settings, and it extends to decision-valued outcomes with calibration-based guarantees; it also lifts to Bayesian, one-shot agreements, yielding distribution-agnostic information-aggregation theorems. The results provide both algorithmic reductions to single-party learning and fundamental lower bounds, highlighting the necessity of interaction, the weak-learning condition, and swap-regret mechanisms for successful aggregation. Practically, these protocols offer a principled approach to human-AI collaboration and multi-modal learning where direct data sharing is restricted or undesirable, with communication complexity independent of data dimensionality and strong theoretical performance guarantees.

Abstract

We give efficient "collaboration protocols" through which two parties, who observe different features about the same instances, can interact to arrive at predictions that are more accurate than either could have obtained on their own. The parties only need to iteratively share and update their own label predictions, without either party ever having to share the actual features that they observe. Our protocols are efficient reductions to the problem of learning on each party's feature space alone, and so can be used even in settings in which each party's feature space is illegible to the other, as arises in models of human/AI interaction and in multi-modal learning. The communication requirements of our protocols are independent of the dimensionality of the data. In an online adversarial setting, we show how to give regret bounds on the predictions that the parties arrive at with respect to a class of benchmark policies defined on the joint feature space of the two parties, despite the fact that neither party has access to this joint feature space. We also give simpler algorithms for the same task in the batch setting, in which we assume that there is a fixed but unknown data distribution. We generalize our protocols to a decision-theoretic setting with high-dimensional outcome spaces, where parties communicate only "best response actions." Our theorems give a computationally and statistically tractable generalization of past work on information aggregation amongst Bayesians who share a common and correct prior, as part of a literature studying "agreement" in the style of Aumann's agreement theorem. Our results require no knowledge of (or even the existence of) a prior distribution and are computationally efficient. Nevertheless, we show how to lift our theorems back to this classical Bayesian setting and, in doing so, give new information aggregation theorems for Bayesian agreement.
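To make the idea of exchanging only label predictions concrete, here is a toy batch-setting sketch. It is not the paper's protocol and carries none of its guarantees. It assumes linear least-squares learners, synthetic Gaussian data, and a label that is a sum of linear functions of each party's features. All function names (`fit_predict`, `collaborate`, `make_data`) and all constants are hypothetical. In each round a party refits on its own features plus the other party's most recent predictions, so the only thing exchanged is one number per instance, whatever the feature dimension.

```python
import numpy as np


def fit_predict(features, y):
    """Least-squares fit of y on `features`; returns in-sample predictions."""
    coef, *_ = np.linalg.lstsq(features, y, rcond=None)
    return features @ coef


def mse(pred, y):
    """Mean squared error between predictions and labels."""
    return float(np.mean((pred - y) ** 2))


def collaborate(x_a, x_b, y, rounds=5):
    """Alternate between parties A and B.

    Each party sees only its own features and the other party's latest
    predictions. Features are never exchanged. (Illustrative assumption:
    both parties use ordinary least squares.)
    """
    pred_b = np.zeros_like(y)  # B has not spoken yet
    history = []
    for _ in range(rounds):
        # A refits using its own features plus B's last predictions.
        pred_a = fit_predict(np.column_stack([x_a, pred_b]), y)
        history.append(mse(pred_a, y))
        # B refits using its own features plus A's new predictions.
        pred_b = fit_predict(np.column_stack([x_b, pred_a]), y)
        history.append(mse(pred_b, y))
    return pred_b, history


def make_data(n=2000, seed=0):
    """Synthetic instance: the label depends additively on both feature sets."""
    rng = np.random.default_rng(seed)
    x_a = rng.normal(size=(n, 3))
    x_b = rng.normal(size=(n, 2))
    y = (
        x_a @ np.array([1.0, -2.0, 0.5])
        + x_b @ np.array([1.5, -1.0])
        + 0.1 * rng.normal(size=n)
    )
    return x_a, x_b, y


x_a, x_b, y = make_data()
final_pred, history = collaborate(x_a, x_b, y)

# Baselines: each party alone, and a hypothetical learner with pooled features.
solo_a = mse(fit_predict(x_a, y), y)
solo_b = mse(fit_predict(x_b, y), y)
joint = mse(fit_predict(np.column_stack([x_a, x_b]), y), y)

print("solo A:", solo_a, "solo B:", solo_b)
print("after collaboration:", history[-1], "joint benchmark:", joint)
```

In this sketch the error cannot increase from one message to the next, because each party can always reproduce the prediction it just received. With an additive linear label, the final error approaches that of the pooled-feature benchmark. This mirrors, in a very simplified form, the role that a joint benchmark class on the pooled features plays in the abstract.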

Paper Structure

This paper contains 81 sections, 66 theorems, 297 equations, 12 algorithms.

Key Result

Theorem 1.5

Fix any triple of hypothesis classes $\mathcal{H}_A$, $\mathcal{H}_B$, and $\mathcal{H}_J$. Suppose $\mathcal{H}_A$ and $\mathcal{H}_B$ consist of functions with bounded range and admit efficient online algorithms guaranteeing no external regret with respect to $\mathcal{H}_A$ and $\mathcal{H}_B$ respectively [...] In particular, this is true for the classes of norm-bounded linear functions.

Theorems & Definitions (172)

  • Definition 1.1: Predictions have No (External) Regret to $\mathcal{H}_J$
  • Definition 1.2: Swap Regret (Informal Version of Definition \ref{def:swap})
  • Definition 1.3: Weak Learning for Regression (Informal Version of Definition \ref{def:joint-weak})
  • Definition 1.4: Conversation Swap Regret (Definition \ref{def:CSR})
  • Theorem 1.5: Informal statement of Theorem \ref{thm:sublinear-to-sublinear}
  • Definition 1.6: Decision Calibration (Definition \ref{def:decision-calibration})
  • Definition 1.7: Decision Cross Calibration (Definition \ref{def:decision-cross-calibration})
  • Definition 2.1: Conversation Transcript $\pi^{1:T,1:K}$
  • Definition 2.2: Prediction Transcript $\pi^{1:T}$
  • Definition 2.3: Individual Hypothesis Classes $\mathcal{H}_A,\mathcal{H}_B$
  • ...and 162 more