Emergent specialization from participation dynamics and multi-learner retraining

Sarah Dean; Mihaela Curmei; Lillian J. Ratliff; Jamie Morgenstern; Maryam Fazel

TL;DR

This work analyzes endogenously shifting participation in a multi-learner setting where subpopulations allocate themselves to learners and learners retrain their models. By formalizing risk-reducing dynamics that couple multiplicative weights over allocations with gradient-based retraining, the authors show that asymptotically stable equilibria are segmented and that the total risk $\mathcal{R}^{\mathsf{total}}(\alpha,\Theta)$ serves as a Lyapunov-like potential guiding convergence. They prove that, under risk-minimizing-in-the-limit dynamics, equilibria are isolated local minima of the total risk, and provide conditions distinguishing segmented and balanced equilibria, along with their stability properties. The paper also links social welfare to the total risk, showing welfare improves with more learners and illustrating the phenomena with synthetic and semi-synthetic experiments, thereby offering insights into the design of multi-learner systems and potential segmentation/fairness trade-offs.

Abstract

Numerous online services are data-driven: the behavior of users affects the system's parameters, and the system's parameters affect the users' experience of the service, which in turn affects the way users may interact with the system. For example, people may choose to use a service only for tasks that already work well, or they may choose to switch to a different service. These adaptations influence the ability of a system to learn about a population of users and tasks in order to improve its performance broadly. In this work, we analyze a class of such dynamics, where users allocate their participation amongst services to reduce the individual risk they experience, and services update their model parameters to reduce the service's risk on their current user population. We refer to these dynamics as \emph{risk-reducing}; they cover a broad class of common model updates including gradient descent and multiplicative weights. For this general class of dynamics, we show that asymptotically stable equilibria are always segmented, with sub-populations allocated to a single learner. Under mild assumptions, the utilitarian social optimum is a stable equilibrium. In contrast to previous work, which shows that repeated risk minimization can result in representation disparity and high overall loss with a single learner (Hashimoto et al., 2018; Miller et al., 2021), we find that repeated myopic updates with multiple learners lead to better outcomes. We illustrate the phenomena via a simulated example initialized from real data.

Paper Structure (29 sections, 24 theorems, 50 equations, 8 figures)

Introduction
Related Work
Framework and Setting
Decision dynamics of learners and subpopulations
Equilibria and stability
Main Results
Total Risk Reduction
Segmented and Balanced Equilibria
Social Welfare for Segmented Populations
Simulations
Discussion
Motivating Examples
Social Media
Music Streaming
Personalized health
...and 14 more sections

Key Result

Proposition 3.3

A subpopulation $i$ updating its participation with multiplicative weights is risk minimizing in the limit if $\gamma>0$ and $\alpha^{0}_{ij}>0$ for all $j$. A learner updating its parameter with gradient descent is risk minimizing in the limit when the risk functions $\mathcal{R}_i(\theta)$ are $L$...
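The two updates named in the proposition can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the exponential form of the multiplicative-weights step, the equal subpopulation sizes, the second-moment values, the step sizes, and the helper names `subpop_risks` and `simulate`. The risks follow the one-dimensional least-squares loss $\ell(\theta; z) = (y-\theta x)^2$ used in Figure 2.

```python
import numpy as np

def subpop_risks(theta, mean_xx, mean_xy, mean_yy):
    """R_i(theta) = E[(y - theta * x)^2], computed from second moments."""
    return mean_yy - 2.0 * theta * mean_xy + theta**2 * mean_xx

def simulate(alpha0, theta0, moments, gamma=0.5, step=0.1, iters=300):
    """Alternate a multiplicative-weights step (subpopulations) with a
    gradient step (learners). Hypothetical helper, equal-sized subpopulations."""
    mean_xx, mean_xy, mean_yy = moments
    alpha = np.array(alpha0, dtype=float)  # alpha[i, j]: share of subpop i at learner j
    theta = np.array(theta0, dtype=float)  # theta[j]: parameter of learner j
    history = []
    for _ in range(iters):
        # risks[i, j] = R_i(theta_j)
        risks = subpop_risks(theta[None, :], mean_xx[:, None],
                             mean_xy[:, None], mean_yy[:, None])
        history.append(float(np.sum(alpha * risks)))  # total risk at this state
        # Assumed exponential multiplicative-weights update with rate gamma > 0.
        alpha = alpha * np.exp(-gamma * risks)
        alpha /= alpha.sum(axis=1, keepdims=True)
        # Gradient step on each learner's participation-weighted average risk.
        weights = alpha / alpha.sum(axis=0, keepdims=True)
        grads = 2.0 * theta[None, :] * mean_xx[:, None] - 2.0 * mean_xy[:, None]
        theta = theta - step * np.sum(weights * grads, axis=0)
    return alpha, theta, np.array(history)

# Illustrative moments: E[x^2] = 1 and true slopes s_i, so R_i(theta) = (theta - s_i)^2 + 0.1.
slopes = np.array([-1.0, 1.0, 1.2])
moments = (np.ones(3), slopes, slopes**2 + 0.1)

alpha, theta, history = simulate(alpha0=np.full((3, 2), 0.5),
                                 theta0=[-0.5, 0.5], moments=moments)
print(np.round(alpha, 3))
print(np.round(theta, 3))
print(history[0], history[-1])
```

In this toy setting every initial allocation entry is positive and $\gamma>0$, matching the conditions stated for the participation update. Because the two steps are applied one after the other, the recorded total risk never increases, and each subpopulation ends up allocated to a single learner.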

Figures (8)

Figure 1: $n=3$ subpopulations ($\square$, $\ocircle$, $\triangle$) select among $m=2$ learners (red, blue) based on classification accuracy with respect to label (solid, hollow). Parameters $\Theta=(\theta_1,\theta_2)$ (decision lines) update in response to subpopulation participation $\alpha_{ij}$. At the current state, the circle subpopulation will shift participation towards the blue learner.
Figure 2: An example arising from least-squares linear regression with $n=3$ subpopulations and $m=2$ learners. Left: The distribution of $z=(x,y)$, colored by subpopulation. Middle: The subpopulation risks $\mathcal{R}_i(\theta)$ arising from least-squares linear regression $\ell(\theta; z) = (y-\theta x)^2$. Right: A visualization of the non-convex total risk as a function of learner parameters, via the partial minimization over subpopulation allocation: $\min_{\alpha} \sum_{i=1}^3 \sum_{j=1}^2 \alpha_{ij} \mathcal{R}_i(\theta_j) = \sum_{i=1}^3 \min\{\mathcal{R}_i(\theta_1), \mathcal{R}_i(\theta_2)\}$.
Figure 3: A summary of our main results on equilibria classification for a given participation $\alpha_0$ and model parameters $\Theta_0$. These results hold for dynamics which are risk minimizing in the limit and loss functions that are convex.
Figure 4: Synthetic settings: Figure (a) illustrates a setting with 3 subpopulations and 2 learners. The solid lines correspond to the risk trajectory for the unstable balanced equilibrium at initialization. Dotted and dashed lines illustrate risk trajectories under three different slight perturbations from the initialization. In Figure (b), the left plot illustrates the reduction in total risk over time. The dashed blue lines indicate when a new learner joins. The right plot shows the equilibrium risk for a subset of the subpopulations as the number of learners increases.
Figure 5: Empirical subpopulations from Census data: Figure (a) displays the relative risk with respect to the best achievable risk for the subpopulation over time. Figure (b) illustrates how allocations initialized near $(1/3,1/3, 1/3)$ converge to a split market equilibrium.
...and 3 more figures
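The partial-minimization identity in the Figure 2 caption, $\min_{\alpha} \sum_{i} \sum_{j} \alpha_{ij} \mathcal{R}_i(\theta_j) = \sum_{i} \min_j \mathcal{R}_i(\theta_j)$, can be checked numerically. The sketch below is only an illustration: the risk matrix is random placeholder data standing in for $\mathcal{R}_i(\theta_j)$, and the names `risks`, `alpha_seg`, and `total_risk` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2  # subpopulations and learners, as in Figure 2

# Placeholder risk matrix: risks[i, j] stands in for R_i(theta_j).
risks = rng.uniform(0.1, 2.0, size=(n, m))

def total_risk(alpha, risks):
    """Sum over i, j of alpha[i, j] * risks[i, j] for a row-stochastic alpha."""
    return float(np.sum(alpha * risks))

# Segmented allocation: each subpopulation puts all its mass on its best learner.
best = np.argmin(risks, axis=1)
alpha_seg = np.zeros((n, m))
alpha_seg[np.arange(n), best] = 1.0

# Right-hand side of the identity.
closed_form = float(risks.min(axis=1).sum())

# Random allocations: each row is drawn from the probability simplex.
alpha_rand = rng.dirichlet(np.ones(m), size=(1000, n))  # shape (1000, n, m)
rand_totals = np.sum(alpha_rand * risks, axis=(1, 2))

print(total_risk(alpha_seg, risks), closed_form)
print(rand_totals.min() >= closed_form)
```

The objective is linear in each row of $\alpha$, so its minimum over the simplex sits at a vertex. This is why the minimizing allocation is segmented, and why the resulting function of $\Theta$ is a sum of pointwise minima and therefore non-convex in general.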

Theorems & Definitions (61)

Definition 3.1: Reducing and Minimizing Dynamics
Example 3.2: Semi-static participation
Example 3.3: Full risk minimization
Proposition 3.3
Definition 3.4: Equilibrium
Definition 3.5: Stable Equilibrium
Definition 4.1: Total Risk
Proposition 4.1
Proof: Proof Sketch
Theorem 4.2
...and 51 more
