Achieving More with Less: A Tensor-Optimization-Powered Ensemble Method

Jinghui Yuan; Weijin Jiang; Zhe Cao; Fangyuan Xie; Rong Wang; Feiping Nie; Yuan Yuan

TL;DR

The paper addresses the challenge of attaining large-ensemble performance with a small number of base learners by learning per-class confidences and a margin-based generalization objective. It introduces a learnable confidence tensor $\tilde{\mathbf{\Theta}}$ that is unfolded to $\Theta$, and a smooth, partially convex loss $\mathcal{L}=\mathcal{C}-\gamma\mathcal{M}$ based on a logsumexp margin, together with a linear constraint $\Theta^T\mathbf{1}=\tilde{w}$ that encodes base-learner reliabilities. The authors prove convexity with respect to $\mathcal{S}(\Theta g_i)$ and a gradient-sum-to-zero property that enables efficient constrained gradient descent, and they demonstrate that a small learned ensemble can outperform a tenfold larger Random Forest on multiple real and toy datasets. The results suggest practical benefits for efficient ensemble learning, with potential extensions to stacking and heterogeneous base classifiers, offering a scalable approach to building strong classifiers with reduced compute.
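The role of the gradient-sum-to-zero property can be illustrated with a minimal sketch. Assuming $\Theta$ is stored with one row per class, a gradient whose columns each sum to zero leaves the column sums $\Theta^T\mathbf{1}$ unchanged after a plain gradient step, so the constraint $\Theta^T\mathbf{1}=\tilde{w}$ keeps holding without an explicit projection. The matrices, step size, and function names below are hypothetical and are not taken from the paper.

```python
# Hypothetical illustration: all names and numbers here are assumptions.

def column_sums(matrix):
    """Return the sum of each column, i.e. the vector matrix^T 1."""
    return [sum(col) for col in zip(*matrix)]


def gradient_step(theta, grad, lr):
    """Plain gradient descent update theta - lr * grad, entry by entry."""
    return [[t - lr * g for t, g in zip(t_row, g_row)]
            for t_row, g_row in zip(theta, grad)]


# Theta: one row per class (3 rows), two columns for brevity.
theta = [[0.5, 0.1],
         [0.3, 0.2],
         [0.2, 0.7]]

# Stand-in gradient whose columns each sum to zero.
grad = [[0.4, -0.3],
        [-0.1, 0.2],
        [-0.3, 0.1]]

w_before = column_sums(theta)            # plays the role of w~
theta_new = gradient_step(theta, grad, lr=0.1)
w_after = column_sums(theta_new)         # equal to w_before up to rounding

print(w_before, w_after)
```

Because $(\Theta-\eta G)^T\mathbf{1}=\Theta^T\mathbf{1}-\eta\,G^T\mathbf{1}$ and $G^T\mathbf{1}=\mathbf{0}$, the two printed vectors agree up to floating-point rounding for any step size $\eta$.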

Abstract

Ensemble learning is a method that leverages weak learners to produce a strong learner. However, obtaining a large number of base learners requires substantial time and computational resources. Therefore, it is meaningful to study how to achieve the performance typically obtained with many base learners using only a few. We argue that to achieve this, it is essential to enhance both classification performance and generalization ability during the ensemble process. To increase model accuracy, each weak base learner needs to be integrated more efficiently. We observe that different base learners exhibit varying levels of accuracy in predicting different classes. To capitalize on this, we introduce a confidence tensor $\tilde{\mathbf{\Theta}}$, where $\tilde{\mathbf{\Theta}}_{rst}$ signifies the degree of confidence that the $t$-th base classifier assigns the sample to class $r$ while it actually belongs to class $s$. To the best of our knowledge, this is the first time an evaluation of the performance of base classifiers across different classes has been proposed. The proposed confidence tensor compensates for the strengths and weaknesses of each base classifier in different classes, enabling the method to achieve superior results with a smaller number of base learners. To enhance generalization performance, we design a smooth and convex objective function that leverages the concept of margin, making the strong learner more discriminative. Furthermore, we prove that in the gradient matrix of the loss function, the elements of each column sum to zero, allowing us to solve a constrained optimization problem using gradient-based methods. We then compare our algorithm with random forests of ten times the size and with other classical methods across numerous datasets, demonstrating the superiority of our approach.
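The following sketch shows one plausible reading of how a per-class confidence tensor could combine base-learner votes. The unfolding layout (rows indexed by the actual class $s$, columns by the pair $(t, r)$), the one-hot encoding of votes as $g_i$, and the use of the largest entry of $\Theta g_i$ as the ensemble prediction are assumptions made for illustration, and all numbers are invented.

```python
# Hypothetical illustration: layout, encoding, and values are assumptions.

def unfold(theta_tensor, n_classes, n_learners):
    """Unfold theta_tensor[r][s][t] into a matrix with rows s and columns (t, r)."""
    return [[theta_tensor[r][s][t]
             for t in range(n_learners)
             for r in range(n_classes)]
            for s in range(n_classes)]


def encode_votes(votes, n_classes):
    """Concatenate one one-hot block per base learner into a single vector g."""
    g = []
    for vote in votes:
        g.extend(1.0 if r == vote else 0.0 for r in range(n_classes))
    return g


def class_scores(theta, g):
    """Matrix-vector product theta @ g, giving one score per class."""
    return [sum(w * x for w, x in zip(row, g)) for row in theta]


n_classes, n_learners = 2, 3
# theta_tensor[r][s][t]: confidence that the true class is s when learner t predicts r.
theta_tensor = [
    [[0.9, 0.5, 0.5],   # predicted r=0, actual s=0 (learners 1 and 2 are unreliable here)
     [0.1, 0.5, 0.5]],  # predicted r=0, actual s=1
    [[0.1, 0.1, 0.1],   # predicted r=1, actual s=0
     [0.9, 0.9, 0.9]],  # predicted r=1, actual s=1
]

theta = unfold(theta_tensor, n_classes, n_learners)
votes = [1, 0, 0]                      # majority vote would pick class 0
g = encode_votes(votes, n_classes)
scores = class_scores(theta, g)
predicted = max(range(n_classes), key=scores.__getitem__)
print(scores, predicted)
```

In this toy setting the two learners voting for class 0 are only weakly trusted on that class, so the weighted scores favor class 1 even though a plain majority vote would choose class 0.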

Paper Structure (28 sections, 23 equations, 5 figures, 4 tables, 1 algorithm)

Introduction
Methodology
Notations
Introduction of Margin
Introduction of Loss Function
Introduction of Optimization Problem
Optimization Algorithm
Theorem 1.
Proof
Theorem 2.
Proof
Time Complexity Analysis
Experiments on toy datasets
Experiments on real datasets
Experimental Settings
...and 13 more sections

Figures (5)

Figure 1: Expansion diagram of $\tilde{\mathbf{\Theta}}$
Figure 2: Initialization Diagram
Figure 3: Performance on toy Dataset. (a) Original dataset. (b) OUR10. (c) SVC. (d) XGBoost. (e) RF10. (f) RF20. (g) RF30. (h) RF100.
Figure 4: Convergence curve of ANCMM. (a) Movement. (b) TR41. (c) warpPIE10P. (d) Lung. (f) Arcene. (g) Isolet.
Figure 5: Learned $\Theta$
