Delegating Data Collection in Decentralized Machine Learning

Nivasini Ananthakrishnan; Stephen Bates; Michael I. Jordan; Nika Haghtalab

TL;DR

This work designs optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model.

Abstract

Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection. Taking the field of contract theory as our starting point, we design optimal and near-optimal contracts that deal with two fundamental information asymmetries that arise in decentralized ML: uncertainty in the assessment of model quality and uncertainty regarding the optimal performance of any model. We show that a principal can cope with such asymmetry via simple linear contracts that achieve a $1-1/e$ fraction of the optimal utility. To address the lack of a priori knowledge regarding the optimal performance, we give a convex program that can adaptively and efficiently compute the optimal contract. We also study linear contracts and derive the optimal utility in the more complex setting of multiple interactions.

Paper Structure (34 sections, 10 theorems, 34 equations, 2 figures)

Introduction
Our results
Single-round of interaction.
Multiple rounds of interaction.
Related work
Model
Optimality of Linear Contracts
Extensions
Optimal contracts for hidden state
Designing contracts against state-learning agents
Multi-round Delegation
The model.
Contracting Protocol.
Utilities.
Test-accuracy-based payments.
...and 19 more sections

Key Result

Proposition 1

For any set of problem parameters $\theta \in [0,1)$ and $d, p, \alpha, \beta > 0$, the first-best contract offers payment $\alpha n^*$ when the test accuracy is at least $1 - \theta - d/(n^*)^p$, where $n^* = (pd/\alpha \beta)^{1/(p+1)}$.
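The quantities in Proposition 1 can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function name `first_best_contract` and the example parameter values are hypothetical, and the expression $pd/\alpha \beta$ is read here as $pd/(\alpha \cdot \beta)$, which is an assumption about the intended grouping. The sample count $n^*$ is treated as a real number rather than rounded to an integer.

```python
import math


def first_best_contract(theta, d, p, alpha, beta):
    """Evaluate the formulas stated in Proposition 1.

    Returns (n_star, accuracy_threshold, payment).
    Assumption: pd/αβ is read as p*d / (alpha*beta).
    """
    if not (0.0 <= theta < 1.0):
        raise ValueError("theta must lie in [0, 1)")
    if min(d, p, alpha, beta) <= 0:
        raise ValueError("d, p, alpha, beta must be positive")

    # n* = (pd / (alpha * beta))^(1 / (p + 1)), under the assumed grouping
    n_star = (p * d / (alpha * beta)) ** (1.0 / (p + 1))
    # Payment is made when test accuracy is at least 1 - theta - d / (n*)^p
    accuracy_threshold = 1.0 - theta - d / n_star ** p
    # Payment offered by the first-best contract: alpha * n*
    payment = alpha * n_star
    return n_star, accuracy_threshold, payment


if __name__ == "__main__":
    # Hypothetical parameter values chosen only for illustration.
    n_star, threshold, payment = first_best_contract(
        theta=0.1, d=1.0, p=1.0, alpha=0.01, beta=1.0
    )
    print(f"n* = {n_star:.3f}, threshold = {threshold:.3f}, payment = {payment:.3f}")
```

With these illustrative values, the threshold falls as $\theta$ grows, while $n^*$ and the payment do not depend on $\theta$ at all, which matches the form of the stated formulas.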

Figures (2)

Figure 1: Variation of utility, information rent and downward distortion magnitude with the gap in optimal error values ($\Delta \theta$). Information rent is the utility the agent makes under the lower optimal error problem. Downward distortion magnitude is how many fewer samples the agent collects compared to the first-best contract under the higher optimal error problem.
Figure 2: The first panel plots the utilities of the state-aware, separating, and pooling contracts against $\Delta \theta$. The second panel again plots contract utilities on the $y$-axis and $\Delta \theta$ on the $x$-axis, showing the state-aware utility and the utilities of state-learning contracts for different levels $k$ of the agent's testing efficiency. The third panel plots the worst-case sub-optimality of state-learning contracts against $k$. The sub-optimality is the ratio of the state-learning contract's utility to the state-aware utility. The worst-case sub-optimality is the largest sub-optimality over all $\Delta \theta \in [0,0.5]$.

Theorems & Definitions (24)

Remark 1: VC dimension bound
Remark 2: Linear regression model
Proposition 1: First-best contract
Proposition 2: Linear contracts are approximately optimal when optimal error is known
Proof: Proof sketch of Proposition 2
Theorem 1: Main result
Definition 1: Insignificance of hidden action at level $\epsilon$
Theorem 2: Sample complexity for insignificant hidden action
Definition 2: $\mathcal{H}$-regret
Proposition 3
...and 14 more
