Consensus learning: A novel decentralised ensemble learning paradigm

Horia Magureanu; Naïri Usher

TL;DR

Consensus learning is a decentralised ensemble framework that couples standard ensemble techniques with probabilistic consensus protocols to deliver privacy-preserving, Byzantine-resilient predictions. It runs in two phases: participants first develop models locally, then exchange predictions in a communication phase driven by a consensus protocol. This preserves data privacy while still leveraging the wisdom of crowds. Theoretical results establish lower bounds on accuracy in homogeneous settings and show convergence to high accuracy given enough base learners, with more nuanced behaviour in heterogeneous and Byzantine scenarios. Numerical simulations on non-IID data (e.g., FEMNIST) and with Beta-distributed base learners corroborate the theoretical insights and show robust performance against Byzantine agents, suggesting practical viability for distributed, privacy-conscious AI deployments.

Abstract

The widespread adoption of large-scale machine learning models in recent years highlights the need for distributed computing for efficiency and scalability. This work introduces a novel distributed machine learning paradigm, consensus learning, which combines classical ensemble methods with consensus protocols deployed in peer-to-peer systems. These algorithms consist of two phases: first, participants develop their models and submit predictions for any new data inputs; second, the individual predictions are used as inputs for a communication phase, which is governed by a consensus protocol. Consensus learning ensures user data privacy, while also inheriting the safety measures against Byzantine attacks from the underlying consensus mechanism. We provide a detailed theoretical analysis for a particular consensus protocol and compare the performance of the consensus learning ensemble with centralised ensemble learning algorithms. The discussion is supplemented by various numerical simulations, which describe the robustness of the algorithms against Byzantine participants.
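The two-phase structure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a synchronous, Slush-style rule in which every participant samples `k` other participants per round and adopts a label when at least `alpha` of the sampled peers agree on it. The function names (`slush_round`, `consensus_predict`), the synchronous update schedule, and the plurality fallback after `max_rounds` are all assumptions made for illustration.

```python
import random
from collections import Counter


def slush_round(labels, k, alpha, rng):
    """One synchronous round (assumed schedule): each node samples k other
    nodes and adopts a label if at least alpha of the samples share it."""
    n = len(labels)
    updated = list(labels)
    for i in range(n):
        peers = rng.sample([j for j in range(n) if j != i], k)
        label, count = Counter(labels[j] for j in peers).most_common(1)[0]
        if count >= alpha:
            updated[i] = label
    return updated


def consensus_predict(initial_predictions, k=10, alpha=7, max_rounds=200, seed=0):
    """Communication phase: start from the participants' local predictions
    (phase one) and iterate until all nodes agree or max_rounds is reached."""
    n = len(initial_predictions)
    if not (alpha <= k < n and 2 * alpha > k):
        raise ValueError("need alpha <= k < n and alpha > k / 2")
    rng = random.Random(seed)
    labels = list(initial_predictions)
    for _ in range(max_rounds):
        if len(set(labels)) == 1:
            break
        labels = slush_round(labels, k, alpha, rng)
    # Hypothetical fallback: report the plurality label if consensus was not reached.
    return Counter(labels).most_common(1)[0][0], labels


if __name__ == "__main__":
    # Phase one is stubbed out: 61 participants, 40 of which predict class 1.
    local_predictions = [1] * 40 + [0] * 21
    output, final_labels = consensus_predict(local_predictions)
    print(output, Counter(final_labels))
```

Only predicted labels are exchanged in this sketch; the training data and the models stay with each participant, which is the privacy property the abstract refers to.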

Paper Structure

This paper contains 44 sections, 14 theorems, 66 equations, 8 figures, and 5 tables.

Introduction
Main contributions
Related works
Organisation
Preliminaries
Ensemble learning
Jury problems
Consensus mechanisms
Snow consensus protocols
Consensus learning
Algorithm description
Summary of key results
I. Homogeneous scenario.
II. Partly heterogeneous scenario.
III. Almost homogeneous scenario with Byzantine nodes.
...and 29 more sections

Key Result

Lemma 2.10

The Slush protocol is symmetric as long as all participants are honest, i.e. the absorption probabilities are unchanged when the two colours are exchanged, for any $b \in \{0, \ldots, n\}$. Moreover, the majority absorption probability is bounded below by ${b\over n}$ for $b \geq \left\lceil {n\over 2} \right\rceil$. Equality occurs for all $b$ whenever $k = \alpha = 1$.
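The lemma's statements can be checked numerically under a simplified model. The sketch below assumes a sequential-update birth-death chain: at each step one uniformly chosen node queries $k$ of the other $n-1$ nodes without replacement and switches colour if at least $\alpha$ of them hold the opposite colour. This chain, and the function names `tail` and `blue_absorption`, are assumptions for illustration; the paper's exact model may differ. Exact rational arithmetic is used so that the $k = \alpha = 1$ case can be compared with ${b\over n}$ without rounding error.

```python
from fractions import Fraction
from math import comb


def tail(N, K, k, alpha):
    """P(X >= alpha) for X ~ Hypergeometric(population N, K successes, k draws)."""
    hits = sum(comb(K, x) * comb(N - K, k - x)
               for x in range(alpha, min(k, K) + 1))
    return Fraction(hits, comb(N, k))


def blue_absorption(n, k, alpha):
    """Return [h_0, ..., h_n], where h_b is the probability that the assumed
    chain ends with all n nodes blue when it starts with b blue nodes."""
    assert 1 <= alpha <= k <= n - 1 and 2 * alpha > k and n >= 2 * alpha

    def up(b):    # a red node is picked and sees at least alpha blue peers
        return Fraction(n - b, n) * tail(n - 1, b, k, alpha)

    def down(b):  # a blue node is picked and sees at least alpha red peers
        return Fraction(b, n) * tail(n - 1, n - b, k, alpha)

    # With fewer than alpha blue nodes blue can never grow, so h = 0 up to lo;
    # symmetrically h = 1 from hi onwards.
    lo, hi = alpha - 1, n - alpha + 1
    # Increments d_j = h_{j+1} - h_j obey d_j = (down_j / up_j) * d_{j-1}.
    incr = [Fraction(1)]
    for j in range(lo + 1, hi):
        incr.append(incr[-1] * down(j) / up(j))
    total = sum(incr)
    h = [Fraction(0)] * (lo + 1)
    for d in incr:
        h.append(h[-1] + d / total)
    h += [Fraction(1)] * (n - hi)
    return h


if __name__ == "__main__":
    n = 61
    h = blue_absorption(n, k=10, alpha=7)
    for b in (31, 35, 40):
        print(b, float(h[b]), b / n)
```

In this model the up-rate from $b$ equals the down-rate from $n-b$, which is one way to see the symmetry, and for $k = \alpha = 1$ the two rates coincide at every state, so the chain is an unbiased walk.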

Figures (8)

Figure 1: The $\mathcal{B}_b$ absorption probability (in blue) for the Slush consensus protocol for $n=61$, $k=10$, $\alpha = 7$, as a function of the number of blue nodes $b$. The ${b\over n}$ bound is plotted in orange and the Chvatal bound in green.
Figure 2: Supervised consensus learning. (a) In the first stage, participants develop their own models, based on datasets that may overlap. At the end of this phase, each model determines an initial prediction (hollow circles) for any new input. (b) In the communication phase, the initial outputs are exchanged between the participants, who eventually reach consensus on a single output (filled circles).
Figure 3: Solid lines: difference in accuracy between the simple majority rule and the homogeneous Slush algorithm, $\Delta\mathbb{P}$, with $n=101$, against the base learner accuracy $p$. Dashed lines: differences in accuracy between majority and $\delta$-supermajority rules. The shaded area shows the region where the Slush algorithm with $\alpha =7$, $k=10$ outperforms the $\delta=1$ supermajority.
Figure 4: Simulation of Slush consensus learning on the FEMNIST dataset for $n=101$ base learners. Green: Distribution of accuracies on validation sets. Orange: Distribution of accuracies on test sets using data from 1, 10 or 100 new users, respectively. The dashed lines correspond to: majority ensemble (black), Slush with $\alpha = 6$ (red) and Slush with local $\alpha$ parameters, i.e. strong confidence (blue).
Figure 5: Accuracies of ensembles built from 101 Random Forest models against the number of perfectly malicious models. The honest models are trained on non-IID samples from the FEMNIST dataset and tested on data from 1, 10 and 100 new users, respectively. The dashed line $f = 6$ is the value of the $\alpha$ parameter used for the global Slush algorithm. Top row: strong classifiers turn Byzantine. Bottom row: weak classifiers turn Byzantine.
...and 3 more figures

Theorems & Definitions (41)

Definition 2.1: Ensemble
Definition 2.2: Base learner
Definition 2.3: Honest participant
Definition 2.4: Accuracy
Definition 2.5: Independence
Definition 2.6: Homogeneity
Definition 2.7: Diversity
Definition 2.8: Slush absorption rates
Remark 2.9
Lemma 2.10
...and 31 more
