Doubly Adaptive Social Learning

Marco Carpentiero, Virginia Bordignon, Vincenzo Matta, Ali H. Sayed

TL;DR

The paper tackles fully online social learning where both the true state and the underlying models drift over time. It introduces the doubly adaptive social learning ($\text{A}^2\text{SL}$) framework, which couples SGD-based model learning with an adaptive Bayesian belief update, governed by training, prior, and prediction adaptation parameters $\eta$, $\tilde{\eta}$, and $\delta$. Under a global identifiability condition, $\text{A}^2\text{SL}$ achieves consistent learning, with transient error decaying exponentially and steady-state error bounded by $O(\delta)+O(\eta)+O(\tilde{\eta})$, illustrating the adaptation–speed trade-off. The authors validate the approach on synthetic data and a CIFAR-10-based distributed online classification task, showing robust tracking of both hypothesis and model drifts and superior online performance compared to offline/adaptive baselines.

Abstract

In social learning, a network of agents assigns probability scores (beliefs) to some hypotheses of interest, which rule the generation of local streaming data observed by each agent. Belief formation takes place by means of an iterative two-step procedure where: i) the agents locally update their beliefs by using some likelihood model; and ii) the updated beliefs are combined with the beliefs of the neighboring agents, using a pooling rule. This procedure can fail to perform well in the presence of dynamic drifts, leading the agents to incorrect decision making. Here, we focus on the fully online setting where both the true hypothesis and the likelihood models can change over time. We propose the doubly adaptive social learning ($\text{A}^2\text{SL}$) strategy, which infuses social learning with the necessary adaptation capabilities. This goal is achieved by exploiting two adaptation stages: i) a stochastic gradient descent update to learn and track the drifts in the decision model; and ii) an adaptive belief update to track the true hypothesis changing over time. These stages are controlled by two adaptation parameters that govern the evolution of the error probability for each agent. We show that all agents learn consistently for sufficiently small adaptation parameters, in the sense that they ultimately place all their belief mass on the true hypothesis. In particular, the probability of choosing the wrong hypothesis converges to values on the order of the adaptation parameters. The theoretical analysis is illustrated both on synthetic data and by applying the $\text{A}^2\text{SL}$ strategy to a social learning problem in the online setting using real data.
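
The two-step procedure described in the abstract (local update, then pooling) can be sketched as follows. This is a minimal illustration, not the paper's exact recursion: it assumes the local step takes the familiar adaptive form $\psi_{k,i}(\theta) \propto \mu_{k,i-1}(\theta)^{1-\delta}\, L_k(\theta)^{\delta}$ and that pooling is a weighted geometric average of the neighbors' intermediate beliefs. The function name `adaptive_belief_step` and the array layout are hypothetical, and `log_lik` stands in for whatever log-scores the learned decision model would produce.

```python
import numpy as np

def adaptive_belief_step(beliefs, log_lik, A, delta):
    """One illustrative social learning iteration (assumed update form).

    beliefs : (K, H) array, row k is agent k's belief over H hypotheses
    log_lik : (K, H) array of log-scores of the newest sample under each hypothesis
    A       : (K, K) combination matrix, A[l, k] = weight agent k gives neighbor l
              (each column sums to 1)
    delta   : adaptation parameter in (0, 1); larger values forget the past faster
    """
    # Step i) local adaptive update, done in the log domain for numerical stability.
    log_psi = (1.0 - delta) * np.log(beliefs) + delta * log_lik
    # Step ii) geometric pooling: log mu_k = sum_l A[l, k] * log psi_l (up to a constant).
    log_mu = A.T @ log_psi
    # Normalize each row so that it is a valid probability vector.
    log_mu -= log_mu.max(axis=1, keepdims=True)
    mu = np.exp(log_mu)
    return mu / mu.sum(axis=1, keepdims=True)

# Toy usage: 3 fully connected agents, 2 hypotheses, data favoring hypothesis 0.
K, H = 3, 2
A = np.full((K, K), 1.0 / K)
mu = np.full((K, H), 1.0 / H)
mu = adaptive_belief_step(mu, np.array([[0.0, -1.0]] * K), A, delta=0.1)
```

A small `delta` makes the beliefs move slowly but steadily toward the hypothesis supported by the data, while a larger `delta` reacts faster to a change in the true hypothesis at the cost of a higher steady-state error, which is the trade-off the TL;DR refers to.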

Paper Structure

This paper contains 17 sections, 2 theorems, 90 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Assume positive deterministic initial beliefs $\mu_{k,0}(\theta)$ for all the agents and all the hypotheses. For $k=1,2,\ldots,K$, let and where $\eta^{o}$ and $\widetilde{\eta}^{o}$ are the smallest roots of the equations. Under condition eq:deltaCond, both equations have two positive roots. Then, the SGD recursions eq:trainStep1 and eq:trainStep2 estimate $w_k^{o}$ and $u_k^{o}$, respectively, by

Figures (3)

  • Figure 1: Left. Error probability of agent $1$ for different values of the adaptation parameters. The regularization parameters are set to $\rho = 0.05$ and $\widetilde{\rho} = 0.05$, with $q_k^{\rm{pr}} = 0.8$ and $q_k^{\rm{tr}} = 0.7$ for all $k$. The limit value $\beta_{\rm{net}}(\theta)$ used to verify the global identifiability condition was computed by evaluating the optimal parameters $w_k^{o}$ and $u_k^{o}$ offline, using stochastic gradient descent with a decaying step-size and a batch size of $2000$. The network topology is shown in the inset plot of the right panel (all agents have a self-loop, not shown for ease of illustration). The combination matrix is obtained through the uniform-averaging rule \cite{Sayedsayednewbooks} and can be verified to be primitive. Right. Error probability of agent $1$ in steady-state conditions for several triplets of adaptation parameters, where the values of $\eta$ and $\widetilde{\eta}$ are uniformly spaced in the interval $[0.005, 0.05]$ and the values of $\delta$ are uniformly spaced in the interval $[0.001, 0.01]$. All curves are estimated by means of $10^3$ Monte Carlo runs. For all plots, similar behavior is observed for the other agents.
  • Figure 2: Left. Network topology used in the example from Sec. \ref{sec:exp2}. All agents have a self-loop, not shown for ease of illustration. Right. Example of image patches (from a truck picture) assigned to agents $1,2,\ldots,9$.
  • Figure 3: Social learning problem over the CIFAR-10 data set \cite{cifar10}, as illustrated in Sec. \ref{sec:exp2}. Top. SML behavior and belief evolution of agent $1$ (similar plots for the remaining agents). Bottom. $\text{A}^2\text{SL}$ behavior and belief evolution of agent $1$ (similar plots for the remaining agents).
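
The caption of Figure 1 mentions a combination matrix built with the uniform-averaging rule and checked to be primitive. A minimal sketch of both steps is given below, assuming the convention that agent $k$ assigns weight $1/n_k$ to each of its $n_k$ neighbors (itself included), so that every column sums to one. The helper names and the 3-agent path graph are hypothetical; the paper's actual topology is not reproduced here.

```python
import numpy as np

def uniform_averaging(adjacency):
    """Combination matrix with A[l, k] = 1/n_k if l is a neighbor of k (self included)."""
    adj = np.asarray(adjacency, dtype=float).copy()
    np.fill_diagonal(adj, 1.0)                    # every agent has a self-loop
    return adj / adj.sum(axis=0, keepdims=True)   # normalize so each column sums to 1

def is_primitive(A):
    """Return True if some power of the nonnegative matrix A is entrywise positive."""
    n = A.shape[0]
    B = (A > 0).astype(int)   # only the zero pattern matters
    P = B.copy()
    # By Wielandt's bound, a primitive n x n matrix satisfies A^m > 0 for m = (n-1)^2 + 1,
    # so it is enough to inspect the powers B^1, ..., B^((n-1)^2 + 1).
    for _ in range((n - 1) ** 2 + 1):
        if P.all():
            return True
        P = ((P @ B) > 0).astype(int)
    return False

# Hypothetical example: path graph 0 - 1 - 2 (self-loops are added by the helper).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
A = uniform_averaging(path)
primitive = is_primitive(A)
```

For a connected graph in which at least one agent has a self-loop, the resulting matrix is primitive, so the check is mainly a safeguard against disconnected or purely periodic topologies.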

Theorems & Definitions (2)

  • Theorem 1: $\text{A}^2\text{SL}$ consistency
  • Corollary 1: $\text{A}^2\text{SL}$ steady-state performance