Performance-bounded Online Ensemble Learning Method Based on Multi-armed bandits and Its Applications in Real-time Safety Assessment

Songqiao Hu, Zeyi Liu, Xiao He

TL;DR

PB-OEL is a performance-bounded online ensemble learning method that incorporates a multi-armed bandit with expert advice into online ensemble learning to update the weights of base classifiers and make predictions. Experiments show that it outperforms existing state-of-the-art methods.

Abstract

Ensemble learning plays a crucial role in practical applications of online learning due to its enhanced classification performance and adaptable adjustment mechanisms. However, most weight allocation strategies in ensemble learning are heuristic, making it challenging to theoretically guarantee that the ensemble classifier outperforms its base classifiers. To address this issue, a performance-bounded online ensemble learning method based on multi-armed bandits, named PB-OEL, is proposed in this paper. Specifically, a multi-armed bandit with expert advice is incorporated into online ensemble learning to update the weights of base classifiers and make predictions. A theoretical framework is established to bound the performance of the ensemble classifier relative to its base classifiers. By setting the expert advice of the bandit, the bound exceeds the performance of any base classifier when the data stream is sufficiently long. Performance bounds for scenarios with limited annotations are also derived. Numerous experiments are conducted on benchmark datasets and on a dataset of real-time safety assessment tasks. The experimental results validate the theoretical bound to a certain extent and demonstrate that the proposed method outperforms existing state-of-the-art methods.
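To make the bandit view of weight updating concrete, here is a minimal sketch of one round of textbook EXP4, with classes as arms and base classifiers as experts. It is not the authors' PB-OEL or REXP4 implementation: the function name `exp4_step`, the 0/1 reward, and the toy advice vectors are assumptions for illustration. Any batching or restarting that REXP4 adds is omitted, and the sketch does not reproduce how the paper uses the label that arrives after each prediction.

```python
import math
import random

def exp4_step(weights, advice, gamma, true_label, rng):
    """One round of textbook EXP4 (hypothetical helper, not the paper's code).

    weights: list of N positive expert weights, updated in place
    advice:  N x K list; advice[n][k] is expert n's probability for class k
    returns: the predicted class index
    """
    n_experts, n_arms = len(advice), len(advice[0])
    total = sum(weights)
    # Mix the weight-averaged expert advice with uniform exploration.
    probs = [
        (1 - gamma) * sum(w * a[k] for w, a in zip(weights, advice)) / total
        + gamma / n_arms
        for k in range(n_arms)
    ]
    pred = rng.choices(range(n_arms), weights=probs)[0]
    # Assumed reward: 1 for a correct prediction, 0 otherwise,
    # importance-weighted for the chosen arm only.
    reward = 1.0 if pred == true_label else 0.0
    estimate = [0.0] * n_arms
    estimate[pred] = reward / probs[pred]
    # Each expert gains in proportion to the mass it put on rewarded arms.
    for n in range(n_experts):
        gain = sum(advice[n][k] * estimate[k] for k in range(n_arms))
        weights[n] *= math.exp(gamma * gain / n_arms)
    return pred

# Toy usage: expert 0 always votes for the true class, expert 1 never does.
rng = random.Random(0)
weights = [1.0, 1.0]
advice = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(200):
    exp4_step(weights, advice, gamma=0.2, true_label=0, rng=rng)
```

In this toy run the weight of the always-correct expert grows while the other stays at its initial value, which is the behaviour a weight-allocation rule of this kind is meant to produce.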

Paper Structure

This paper contains 24 sections, 5 theorems, 29 equations, 9 figures, 2 tables, 2 algorithms.

Key Result

Theorem 1

(Performance Bound of PB-OEL): Let $\pi$ be the REXP4 policy with $\gamma=\min\{1,\sqrt{{K\ln N}/{\left[(e-1)\Delta_T\right]}}\}$, $\Delta_T=T^\alpha$, $\alpha\in(0,1]$. The label $y_t$ of $\bm{x_t}$ is obtained after prediction. The accuracy (ACC) of the ensemble classifier has the theoretical bound where $m_T$ represents the number of batches in $\mathcal{T}$, $\xi_{n}^{y_t}(t)$ denotes the $y_t$
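As a small worked example of the parameter choice stated in the theorem, the sketch below evaluates $\Delta_T=T^\alpha$ and $\gamma=\min\{1,\sqrt{K\ln N/[(e-1)\Delta_T]}\}$ for given values. The function name `rexp4_parameters` and the sample values are hypothetical, and reading $K$ as the number of classes and $N$ as the number of base classifiers is an assumption based on the Figure 1 mapping of classes to arms and classifiers to experts.

```python
import math

def rexp4_parameters(T, K, N, alpha):
    """Return (delta_T, gamma) from the formulas in Theorem 1.

    T: stream length; K: number of arms (assumed: classes);
    N: number of experts (assumed: base classifiers); alpha in (0, 1].
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    delta_T = T ** alpha
    gamma = min(1.0, math.sqrt(K * math.log(N) / ((math.e - 1) * delta_T)))
    return delta_T, gamma

# Illustrative values only; they are not taken from the paper's experiments.
delta_T, gamma = rexp4_parameters(T=10_000, K=3, N=10, alpha=0.5)
print(f"delta_T={delta_T:.1f}, gamma={gamma:.4f}")
```

For a fixed $T$, a larger $\alpha$ gives a larger $\Delta_T$ and therefore a smaller $\gamma$, while a very small $\Delta_T$ makes the square root exceed 1 and clips $\gamma$ at 1.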

Figures (9)

  • Figure 1: Diagram of the setting: Possible classes in OEL correspond to arms in MAB-EA; Classifiers in OEL correspond to experts in MAB-EA.
  • Figure 2: The change of accuracy with different parameter combinations on representative datasets.
  • Figure 3: (a) Average accuracy for different parameter combinations, and (b) regret for different values of $T$ and $\alpha$ on Waveform with $N=10$.
  • Figure 4: Accuracy of different methods on binary data streams. Shaded areas represent the standard deviation results of the corresponding method under multiple random runs.
  • Figure 5: Accuracy of different methods on multi-class data streams. Shaded areas represent the standard deviation results of the corresponding method under multiple random runs.
  • ...and 4 more figures

Theorems & Definitions (12)

  • Theorem 1
  • Proof
  • Corollary 1
  • Proof
  • Remark 1
  • Corollary 2
  • Proof
  • Remark 2
  • Theorem 2
  • Proof
  • ...and 2 more