Machine Collaboration

Qingfeng Liu, Yang Feng

TL;DR

MaC introduces a circular, interactive ensemble in which heterogeneous base learners exchange information and update their predictions across rounds, a departure from traditional bagging, stacking, and boosting. The method is formalized with a two-machine sketch and a general $K_n$-machine algorithm, together with a finite-sample risk bound that highlights the trade-off between approximation error and complexity. Empirically, MaC improves on individual models and standard ensembles in most cases across simulated data and 119 PMLB regression datasets, with statistical evidence supporting its gains. The work points to a new direction in ensemble design centered on inter-machine communication, with potential extensions to classification and semi-supervised learning, at the price of additional computational cost.

Abstract

We propose a new ensemble framework for supervised learning, called machine collaboration (MaC), using a collection of base machines for prediction tasks. Unlike bagging/stacking (a parallel & independent framework) and boosting (a sequential & top-down framework), MaC is a type of circular & interactive learning framework. The circular & interactive feature helps the base machines to transfer information circularly and update their structures and parameters accordingly. The theoretical result on the risk bound of the estimator from MaC reveals that the circular & interactive feature can help MaC reduce risk via a parsimonious ensemble. We conduct extensive experiments on MaC using both simulated data and 119 benchmark real datasets. The results demonstrate that in most cases, MaC performs significantly better than several other state-of-the-art methods, including classification and regression trees, neural networks, stacking, and boosting.
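
The circular update described above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes, for illustration only, that "transferring information circularly" means each machine is refit in turn to the residual left by the other machines (a backfitting-style loop) and that the ensemble prediction is the sum of the machines' outputs. The names `LinearMachine`, `BinMachine`, `collaborate`, and `ensemble_predict` are hypothetical, as is the synthetic data.

```python
import numpy as np


class LinearMachine:
    """Hypothetical base machine: ordinary least squares on [1, x]."""

    def fit(self, x, target):
        design = np.column_stack([np.ones_like(x), x])
        self.coef_, *_ = np.linalg.lstsq(design, target, rcond=None)
        return self

    def predict(self, x):
        return self.coef_[0] + self.coef_[1] * x


class BinMachine:
    """Hypothetical base machine: mean of the target within equal-width bins."""

    def __init__(self, n_bins=8, lo=0.0, hi=1.0):
        self.n_bins = n_bins
        self.edges = np.linspace(lo, hi, n_bins + 1)[1:-1]  # interior edges

    def fit(self, x, target):
        idx = np.digitize(x, self.edges)
        sums = np.bincount(idx, weights=target, minlength=self.n_bins)
        counts = np.bincount(idx, minlength=self.n_bins)
        self.means_ = sums / np.maximum(counts, 1)  # empty bins predict 0
        return self

    def predict(self, x):
        return self.means_[np.digitize(x, self.edges)]


def collaborate(machines, x, y, n_rounds=10):
    """Assumed circular scheme: refit each machine to what the others miss."""
    preds = [np.zeros_like(y) for _ in machines]
    for _ in range(n_rounds):
        for k, machine in enumerate(machines):
            others = sum(p for j, p in enumerate(preds) if j != k)
            machine.fit(x, y - others)
            preds[k] = machine.predict(x)
    return machines


def ensemble_predict(machines, x):
    return sum(m.predict(x) for m in machines)


def mse(a, b):
    return float(np.mean((a - b) ** 2))


# Synthetic data (an assumption): a linear trend plus a jump at x = 0.5.
rng = np.random.default_rng(0)


def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    y = 2.0 * x + (x > 0.5) + 0.1 * rng.standard_normal(n)
    return x, y


x_tr, y_tr = sample(400)
x_te, y_te = sample(400)

machines = collaborate([LinearMachine(), BinMachine()], x_tr, y_tr)
mse_mac = mse(ensemble_predict(machines, x_te), y_te)
mse_lin = mse(LinearMachine().fit(x_tr, y_tr).predict(x_te), y_te)
mse_bin = mse(BinMachine().fit(x_tr, y_tr).predict(x_te), y_te)
print(f"collaborating: {mse_mac:.4f}  linear: {mse_lin:.4f}  bins: {mse_bin:.4f}")
```

In this toy setting neither machine can represent the target alone (the linear fit misses the jump, the binned fit misses the within-bin slope), so the loop lets each one specialize on the part the other leaves behind. The paper's method also updates machine structures and comes with a risk bound; none of that is reproduced here.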

Paper Structure

This paper contains 9 sections, 2 theorems, 13 equations, 4 figures, 1 table, 1 algorithm.

Key Result

Theorem 1

Assume there exists a constant $C_{0}<\infty$ such that $\left|Y\right|\leq C_{0}$ a.s. and $\sup_{\mathfrak{m}\in\mathbb{M}}\sup_{X\in\mathcal{X}}\left|\mathfrak{m}\left(X\right)\right|\leq C_{0}$. Define $C_{1}\equiv4C_{0}^{2}$ and $C_{2}\equiv16C_{0}^{2}$. Let $L\left(D,\mathfrak{m}\right)\equiv$ [...] where $\tilde{B}_{0}\left(k\right)=\min_{\mathfrak{m}\in\mathbb{M}_{k,\tilde{\Theta}_{k}}}\int$ [...]

Figures (4)

  • Figure 1: Bagging, boosting, and machine collaboration
  • Figure 2: Machine Collaboration
  • Figure 3: MaC vs alternatives
  • Figure 4: MSPE difference ($\mathrm{Alternative}-\mathrm{MaC}$)

Theorems & Definitions (5)

  • Definition 3.1: Searching Number and Searching Resolution
  • Theorem 1
  • Remark 2
  • Proposition 1
  • Proof