Dueling over Multiple Pieces of Dessert

Simina Brânzei; Reed Phillips

TL;DR

This work studies repeated two-player cake cutting in which Alice, the Stackelberg leader, partitions the cake each round and Bob chooses a piece. It reveals sharp learnability boundaries. With arbitrary measurable partitions, Alice cannot achieve strongly sublinear regret even against a myopic Bob. When she is restricted to at most $k$ cuts, learning becomes tractable, with distinct regret regimes depending on whether Bob's learning rate is public or private and on his strategic sophistication (myopic vs. non-myopic). The results give upper and lower bounds, tight up to polylogarithmic factors, for $k=2$ and $k\ge3$ cuts, covering public $\alpha$-regret budgets and adaptive strategies. A corollary in the Robertson-Webb query model addresses finding $\varepsilon$-Stackelberg allocations with $O(k/\varepsilon)$ queries. Together, the findings expose the trade-off between partitioning flexibility and learnability in repeated Stackelberg-style cake cutting, with implications for online learning in strategic division problems and for query complexity in classical cake-cutting models.

Abstract

We study the dynamics of repeated fair division between two players, Alice and Bob, where, in each of $T$ rounds, Alice partitions a cake into two subsets and Bob chooses his preferred one. Alice aims to minimize her regret relative to the Stackelberg value -- the maximum utility she could achieve if she knew Bob's private valuation. We show that if Alice uses arbitrary measurable partitions, achieving strongly sublinear regret is impossible; she suffers $\Omega\Bigl(\frac{T}{\log^2 T}\Bigr)$ regret even against a myopic Bob. However, when Alice uses at most $k$ cuts, the learning landscape becomes tractable. We analyze Alice's performance based on her knowledge of Bob's strategic sophistication (his regret budget). When Bob's learning rate is public, we establish a hierarchy of polynomial regret bounds determined by $k$ and Bob's regret budget. In contrast, when this learning rate is private, Alice can universally guarantee $O\Bigl(\frac{T}{\log T}\Bigr)$ regret, but any attempt to secure a polynomial rate $O(T^\beta)$ (for $\beta < 1$) leaves her vulnerable to incurring strictly linear regret against some Bob. Finally, as a corollary of our online learning dynamics, we characterize the randomized query complexity of finding approximate Stackelberg allocations with a constant number of cuts in the Robertson-Webb model.
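The round structure described in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the paper's algorithm: it assumes valuations are given as cumulative distribution functions on $[0,1]$, that cut points induce two alternating pieces, that Bob is myopic, and that a tie sends Bob to the second piece. The function names (`piece_values`, `play_round`, `regret`) and the tie-breaking rule are hypothetical, and the Stackelberg benchmark is supplied by the caller rather than computed.

```python
from typing import Callable, Sequence

Cdf = Callable[[float], float]  # value of the interval [0, x] to a player


def piece_values(cdf: Cdf, cuts: Sequence[float]) -> list[float]:
    """Values of the two alternating pieces induced by the cut points."""
    points = [0.0, *sorted(cuts), 1.0]
    values = [0.0, 0.0]
    for i in range(len(points) - 1):
        # Consecutive intervals alternate between piece 0 and piece 1.
        values[i % 2] += cdf(points[i + 1]) - cdf(points[i])
    return values


def play_round(alice_cdf: Cdf, bob_cdf: Cdf, cuts: Sequence[float]) -> float:
    """Alice's payoff in one round against a myopic Bob."""
    alice = piece_values(alice_cdf, cuts)
    bob = piece_values(bob_cdf, cuts)
    # Myopic Bob takes the piece he values more; on a tie he takes
    # piece 1 (an assumption made only for this sketch).
    bob_choice = 0 if bob[0] > bob[1] else 1
    return alice[1 - bob_choice]


def regret(alice_cdf: Cdf, bob_cdf: Cdf,
           proposals: Sequence[Sequence[float]],
           stackelberg_value: float) -> float:
    """Sum over rounds of (benchmark - Alice's realized payoff)."""
    return sum(stackelberg_value - play_round(alice_cdf, bob_cdf, cuts)
               for cuts in proposals)
```

For instance, if Alice is uniform (`lambda x: x`) and Bob's CDF is `lambda x: x * x`, a single cut at 0.7 leaves Bob preferring $[0.7, 1]$, which is worth 0.51 to him, so Alice keeps $[0, 0.7]$.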

Paper Structure (45 sections, 45 theorems, 95 equations, 6 figures)


Introduction
Model
Allocations.
Our Results
The Measurable-Cut Game
The $k$-Cut Game
Public Learning Rate
Private Learning Rate
Corollary for the Robertson-Webb Model
Related Work
The cake cutting model.
Existence and computational complexity of envy-free allocations.
Strategic cake cutting.
Repeated fair division.
Learning in repeated Stackelberg games.
...and 30 more sections

Key Result

Theorem 1

Consider the measurable-cut game and suppose Bob is myopic. There exists $T_0 \in \mathbb{N}$ such that for every valuation and deterministic strategy of Alice, there exists a Bob valuation against which Alice's regret is $\Omega(T / (\log T)^2)$ for every horizon $T \geq T_0$.
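A short calculation shows why this bound rules out strongly sublinear regret. It assumes that "strongly sublinear" means $O(T^\beta)$ for some fixed $\beta < 1$, which matches the polynomial rates discussed in the abstract:

```latex
\[
  \frac{T/(\log T)^2}{T^{\beta}}
  \;=\; \frac{T^{1-\beta}}{(\log T)^2}
  \;\xrightarrow[T \to \infty]{}\; \infty
  \qquad \text{for every fixed } \beta < 1 .
\]
```

The lower bound therefore eventually exceeds $c\,T^\beta$ for any constant $c$, even though $T/(\log T)^2 = o(T)$ is still sublinear in the ordinary sense.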

Figures (6)

Figure 1: A Stackelberg cut for $k=2$. Bob's value density $v_B$ is plotted in blue. Alice's value density (not pictured) is assumed to be uniform: $v_A(x)=1$ for all $x \in [0,1]$. The region between the red dashed cuts has value $1/2$ to Bob and maximizes Alice's value among all such regions; thus, $u_A^*(2) = 5/8$.
Figure 2: An example outcome of the first two rounds of a $2$-cut game. In each round, Alice partitions $[0, 1]$ into two pieces by making at most $2$ cuts. The pieces Bob chose are in blue. In this case, they are the pieces he would prefer under the value density from Figure 1.
Figure 3: Illustration of an alternating partition for $k=6$. The positive-length intervals of $Z_1$ and $Z_2$ are colored blue and red, respectively. Setting some cut points equal to each other, as with $x_2=x_3$ and $x_6=x_7$, allows Alice to effectively make fewer than $k$ cuts if desired.
Figure 4: An illustration of the transformations in the proof of the measurable-cut lower bound for a myopic Bob (Theorem 1), focusing on the mass moved in and out of a single $(L_i, R_i)$ pair. Blue indicates the elements of each of the three sets. The starting $B^j_1$ is an arbitrary piece Alice's strategy could make. From $B^j_1$ to $B^j_2$, an equal amount of mass is removed from $L_i$ and $R_i$ until one of them is empty, in this case $L_i$. Twice this much mass is added to $[1/2, 1]$. From $B^j_2$ to $B^j_3$, mass is moved back into the partially filled $R_i$ to complete it. The resulting $B^j_3$ then contains exactly one set from each $(L_i, R_i)$ pair while still having similar value to $B^j_1$ for both players.
Figure 5: Superimposed plots of the unspiked value density function $\sigma_0^k$ and an example spiked value density $\sigma_{w;z}^k$ for $k=5$. The segments in purple are common to $\sigma_0^k$ and $\sigma_{w;z}^k$. The functions differ only in a small interval centered at $z$ called the spike, where $\sigma_0^k$ maintains its value in red and $\sigma_{w;z}^k$ diverges in blue.
...and 1 more figure
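The quantity $u_A^*(2)$ in the Figure 1 caption can be approximated numerically. The sketch below rests on several assumptions: Alice's density is uniform (as in the caption), Bob's density is positive almost everywhere, and ties are broken in Alice's favor. Under these assumptions Alice either keeps the longest interval worth exactly $1/2$ to Bob, or keeps the complement of the shortest such interval. The function name `two_cut_stackelberg_value` and the grid-based approach are illustrative choices, not taken from the paper.

```python
import numpy as np


def two_cut_stackelberg_value(bob_density, grid_size=20001):
    """Approximate Alice's 2-cut Stackelberg value when Alice is uniform.

    bob_density: vectorized, nonnegative function on [0, 1] (need not be
    normalized); assumed positive almost everywhere.
    """
    xs = np.linspace(0.0, 1.0, grid_size)
    dens = bob_density(xs)
    # Bob's CDF via the cumulative trapezoid rule, normalized to total 1.
    steps = 0.5 * (dens[1:] + dens[:-1]) * np.diff(xs)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    # For each left endpoint a with F(a) <= 1/2, find b with F(b) - F(a) = 1/2.
    starts = cdf <= 0.5
    left = xs[starts]
    right = np.interp(cdf[starts] + 0.5, cdf, xs)
    lengths = right - left  # Alice's (uniform) value of each such interval
    # Alice keeps either the longest half-value interval or the
    # complement of the shortest one, whichever is larger.
    return max(lengths.max(), 1.0 - lengths.min())
```

With a uniform Bob (`lambda x: np.ones_like(x)`), every half-value interval has length $1/2$, so the function returns approximately $0.5$: Alice cannot do better than an even split against an identical valuation.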

Theorems & Definitions (91)

Theorem 1
Theorem 2: $k \geq 2$ cuts; Myopic Bob
Theorem 3: 2 cuts; Non-Myopic Bob with Public Learning Rate
Theorem 4: $k \geq 3$ cuts; Non-Myopic Bob with Public Learning Rate
Theorem 5: $k \geq 2$ cuts; Non-Myopic Bob with Private Learning Rate
Definition 1
Theorem 6
Lemma 1
Theorem 6
Proof sketch of the measurable-cut lower bound for a myopic Bob (Theorem 1)
...and 81 more
