Learning Optimal Contracts: How to Exploit Small Action Spaces

Francesco Bacchiocchi; Matteo Castiglioni; Alberto Marchesi; Nicola Gatti

TL;DR

The paper tackles learning optimal contracts in repeated hidden-action principal-agent problems with small action spaces, where the agent’s actions are unobserved and the outcome realized at each round is the only feedback signal. It introduces the Discover-and-Cover framework, which uses meta-actions to partition best-response regions and relies on three subroutines (Action-Oracle, Try-Cover, Find-Contract) to learn an approximately optimal bounded contract efficiently. The main theoretical results are a sample complexity that is polynomial in the size of the outcome space when the number of actions is constant, which solves an open problem posed by Zhu et al. [2022], and a cumulative regret bound of $\tilde{\mathcal{O}}(T^{4/5})$ in the related online learning setting. The construction therefore also yields a no-regret online learning algorithm, with practical implications for sequential contract design in digital economics and Stackelberg-like settings where actions are unobserved.

Abstract

We study principal-agent problems in which a principal commits to an outcome-dependent payment scheme -- called a contract -- in order to induce an agent to take a costly, unobservable action leading to favorable outcomes. We consider a generalization of the classical (single-round) version of the problem in which the principal interacts with the agent by committing to contracts over multiple rounds. The principal has no information about the agent, and they have to learn an optimal contract by only observing the outcome realized at each round. We focus on settings in which the size of the agent's action space is small. We design an algorithm that learns an approximately-optimal contract with high probability in a number of rounds polynomial in the size of the outcome space, when the number of actions is constant. Our algorithm solves an open problem by Zhu et al. [2022]. Moreover, it can also be employed to provide a $\tilde{\mathcal{O}}(T^{4/5})$ regret bound in the related online learning setting in which the principal aims at maximizing their cumulative utility, thus considerably improving previously-known regret bounds.
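To make the interaction described in the abstract concrete, the following is a minimal sketch of a single round of a hidden-action principal-agent problem. It is not the paper's algorithm: the instance (`F`, `costs`, `rewards`) and the function names are hypothetical, and breaking ties in favor of the principal is assumed here as the usual convention. The agent best-responds to the committed contract, and the principal only observes a sampled outcome.

```python
import numpy as np

# Hypothetical toy instance: 2 actions, 3 outcomes (all numbers are illustrative).
F = np.array([[0.7, 0.2, 0.1],    # outcome distribution induced by action 0
              [0.1, 0.3, 0.6]])   # outcome distribution induced by action 1
costs = np.array([0.0, 0.2])      # agent's cost for each action
rewards = np.array([0.0, 0.5, 1.0])  # principal's reward for each outcome


def best_response(contract):
    """Return the action maximizing expected payment minus cost.

    Ties are broken in favor of the principal (an assumption of this sketch).
    """
    agent_value = F @ contract - costs
    principal_value = F @ (rewards - contract)
    best = np.flatnonzero(np.isclose(agent_value, agent_value.max()))
    return int(best[np.argmax(principal_value[best])])


def principal_utility(contract):
    """Expected reward minus expected payment under the agent's best response."""
    action = best_response(contract)
    return float(F[action] @ (rewards - contract))


def sample_outcome(contract, rng):
    """The only feedback the principal sees: an outcome drawn from the hidden action."""
    action = best_response(contract)
    return int(rng.choice(len(rewards), p=F[action]))
```

In this toy instance, the zero contract leaves the agent on the costless action 0, while paying 0.5 on the last outcome makes action 1 the best response and raises the principal's expected utility from about 0.2 to about 0.45. A learning principal does not know `F` or `costs` and has to infer such regions of the contract space from sampled outcomes alone.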

Paper Structure (23 sections, 28 theorems, 53 equations, 5 figures, 6 algorithms)

Introduction
Original contributions
Relation to repeated Stackelberg games
Preliminaries on hidden-action principal-agent problems
Learning optimal contracts
The Discover-and-Cover algorithm
Action-Oracle
Try-Cover
Find-Contract
Putting it all together
Connection with online learning in principal-agent problems
Related works
Learning in principal-agent problems
Learning in Stackelberg games
Assumptions relaxed compared to Stackelberg games
...and 8 more sections

Key Result

Theorem 1

For any number of rounds $N \in \mathbb{N}$, there is no algorithm that is guaranteed to find a $\kappa$-optimal contract with probability at least $1 - \delta$ by using fewer than $N$ rounds, where $\kappa, \delta > 0$ are some suitable absolute constants.

Figures (5)

Figure: Discover-and-Cover
Figure: Action-Oracle
Figure: Try-Cover
Figure: Find-Contract
Figure: No-regret algorithm

Theorems & Definitions (51)

Theorem 1
Definition 1: Learning an optimal bounded contract
Definition 2: Clean event
Definition 3: Associated actions
Lemma 1
Lemma 2
Definition 4: Cost of a meta-action
Lemma 3
Lemma 4
Lemma 5
...and 41 more
