Tricks and Plug-ins for Gradient Boosting in Image Classification

Biyi Fang, Truong Vo, Jean Utke, Diego Klabjan

TL;DR

This work addresses the high computational cost of CNN architecture search by introducing Subgrid BoostCNN, a boosting-based framework for image classification that trains shallow weak learners on informative subgrids. It combines a pixel-wise importance index $I_{j,k}$ derived from residuals with a squared-error objective to embed boosting weights into training, and forms an ensemble $f(x)=\sum_{t=1}^N \alpha_t g_t(x)$. It also uses architectural reuse to avoid re-optimizing the entire network across boosting iterations. Experiments on CIFAR-10, SVHN, and ImageNetSub show improved accuracy and training efficiency over standard CNNs and BoostCNN baselines, with robustness to seeds and compatibility across ResNet backbones.
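As a rough illustration of the additive ensemble $f(x)=\sum_{t=1}^N \alpha_t g_t(x)$, the sketch below combines the class scores of several weak learners using their boosting weights. The names `make_linear_learner` and `ensemble_predict` are hypothetical, and the linear maps merely stand in for the shallow CNN weak learners; this is a minimal sketch under those assumptions, not the authors' implementation.

```python
import numpy as np


def make_linear_learner(weights):
    """Hypothetical stand-in for a shallow CNN: maps (n, d) inputs to (n, C) scores."""
    w = np.asarray(weights, dtype=float)
    return lambda x: np.asarray(x, dtype=float) @ w


def ensemble_predict(weak_learners, alphas, x):
    """Return class scores f(x) = sum_t alpha_t * g_t(x)."""
    if len(weak_learners) != len(alphas):
        raise ValueError("need exactly one weight per weak learner")
    scores = None
    for alpha, g in zip(alphas, weak_learners):
        term = alpha * g(x)
        scores = term if scores is None else scores + term
    return scores


# Two toy weak learners over 2 features and 2 classes.
g1 = make_linear_learner([[1.0, 0.0], [0.0, 1.0]])
g2 = make_linear_learner([[0.0, 1.0], [1.0, 0.0]])

x = np.array([[1.0, 2.0]])
scores = ensemble_predict([g1, g2], [0.5, 2.0], x)
predicted_class = scores.argmax(axis=1)
```

Here the second learner has the larger weight, so its scores dominate the sum; in a boosting loop each $\alpha_t$ would be set when the corresponding $g_t$ is added.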

Abstract

Convolutional Neural Networks (CNNs) have achieved remarkable success across a wide range of machine learning tasks by leveraging hierarchical feature learning through deep architectures. However, their many layers and millions of parameters often make CNNs computationally expensive to train, and finding a good architecture requires extensive time and manual tuning. In this paper, we introduce a novel framework for boosting CNN performance that integrates dynamic feature selection with the principles of BoostCNN. Our approach incorporates two key strategies, subgrid selection and importance sampling, to guide training toward informative regions of the feature space. We further develop a family of algorithms that embed boosting weights directly into the network training process using a least squares loss formulation. This integration not only alleviates the burden of manual architecture design but also enhances accuracy and efficiency. Experimental results across several fine-grained classification benchmarks demonstrate that our boosted CNN variants consistently outperform conventional CNNs in both predictive performance and training speed.
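To make the idea of subgrid selection concrete, the following sketch picks the rows and columns with the largest aggregate importance from a pixel-wise importance map and crops a batch of images to that subgrid. The selection rule (top rows and columns by summed importance) and the function names `select_subgrid` and `crop_to_subgrid` are assumptions made for illustration; the abstract does not specify how the paper computes or thresholds the importance values.

```python
import numpy as np


def select_subgrid(importance, n_rows, n_cols):
    """Pick the n_rows rows and n_cols columns with the largest summed importance.

    `importance` is an (H, W) map; this selection rule is an assumed, simplified one.
    """
    importance = np.asarray(importance, dtype=float)
    row_scores = importance.sum(axis=1)
    col_scores = importance.sum(axis=0)
    # Take the highest-scoring indices, then sort them to preserve spatial order.
    rows = np.sort(np.argsort(row_scores)[::-1][:n_rows])
    cols = np.sort(np.argsort(col_scores)[::-1][:n_cols])
    return rows, cols


def crop_to_subgrid(images, rows, cols):
    """Restrict a batch of (N, H, W, C) images to the selected rows and columns."""
    return images[:, rows][:, :, cols]


# Toy importance map whose mass sits in the central 2x2 block.
importance = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 1.0, 0.0],
    [0.0, 2.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])
rows, cols = select_subgrid(importance, n_rows=2, n_cols=2)

images = np.arange(3 * 4 * 4 * 1, dtype=float).reshape(3, 4, 4, 1)
subgrid_images = crop_to_subgrid(images, rows, cols)
```

A weak learner trained on `subgrid_images` sees a quarter of the pixels in this toy case, which is the source of the training-cost savings the abstract describes.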

Paper Structure

This paper contains 7 sections, 10 equations, 12 figures, 2 tables, 1 algorithm.

Figures (12)

  • Figure 1: ResNet-18 on CIFAR-10
  • Figure 2: Different Seeds
  • Figure 3: ResNet-18 on SVHN
  • Figure 4: Different Seeds
  • Figure 5: ResNet-18 on ImageNetSub
  • ...and 7 more figures