ReActXGB: A Hybrid Binary Convolutional Neural Network Architecture for Improved Performance and Computational Efficiency

Po-Hsun Chu, Ching-Han Chen

TL;DR

The paper tackles the accuracy-cost trade-off in binary convolutional networks by addressing the fully connected layer bottleneck. It introduces ReActXGB, a hybrid architecture that uses ReActNet-A as a feature extractor and XGBoost as a classifier to reduce hardware costs while improving accuracy. Training follows a two-stage process: first optimize a fully connected classifier with SGD, then train XGBoost on the extracted features, capping the ensemble at 20 trees with a maximum depth of 10. On FashionMNIST, ReActXGB achieves 90.38% top-1 accuracy, 1.47 points higher than ReActNet-A, while reducing FLOPs by 7.14% and parameters by 1.02%. This narrows the performance gap to ResNet-18 and suggests potential for energy-efficient, edge-ready deployment. The authors plan FPGA implementations to evaluate real-world hardware metrics.
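
The second training stage described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: only the two caps (20 trees, maximum depth 10) come from the summary, while the function names `xgb_head_params`, `max_nodes_per_tree`, and `fit_xgb_head`, the use of XGBoost's scikit-learn wrapper, and the shape of the feature array are assumptions made for the example.

```python
from typing import Any, Dict


def xgb_head_params() -> Dict[str, Any]:
    """Caps reported in the summary: 20 trees, maximum depth 10."""
    return {"n_estimators": 20, "max_depth": 10}


def max_nodes_per_tree(max_depth: int) -> int:
    """Upper bound on nodes in a binary tree with at most max_depth splits on any path."""
    return 2 ** (max_depth + 1) - 1


def fit_xgb_head(features, labels):
    """Stage 2 (sketch): fit XGBoost on features from the already-trained extractor.

    features: 2-D NumPy array, one row per image (hypothetical output of the
              ReActNet-A backbone with its fully connected layer removed).
    labels:   1-D array of integer class ids, 0..9 for FashionMNIST.
    """
    # Imported lazily so the helpers above work without xgboost installed.
    from xgboost import XGBClassifier

    clf = XGBClassifier(**xgb_head_params())
    clf.fit(features, labels)
    return clf
```

A tree of depth 10 has at most `max_nodes_per_tree(10)` = 2047 nodes and needs at most 10 comparisons per sample, which gives a sense of why a small tree ensemble can be cheap next to a real-valued fully connected layer. Note that for a 10-class problem XGBoost's scikit-learn wrapper grows one tree per class per boosting round, so `n_estimators=20` may produce up to 200 trees; the summary does not say whether "20 trees" counts rounds or total trees.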

Abstract

Binary convolutional neural networks (BCNNs) provide a potential solution to reduce the memory requirements and computational costs associated with deep neural networks (DNNs). However, achieving a trade-off between performance and computational resources remains a significant challenge. Furthermore, the fully connected layer of BCNNs has become a significant computational bottleneck. This is mainly due to the conventional practice of excluding the input layer and fully connected layer from binarization to prevent a substantial loss in accuracy. In this paper, we propose a hybrid model named ReActXGB, in which we replace the fully connected layer of ReActNet-A with XGBoost. This modification aims to narrow the performance gap between BCNNs and real-valued networks while maintaining lower computational costs. Experimental results on the FashionMNIST benchmark demonstrate that ReActXGB outperforms ReActNet-A by 1.47% in top-1 accuracy, along with a reduction of 7.14% in floating-point operations (FLOPs) and 1.02% in model size.

Paper Structure (6 sections, 1 figure, 3 tables)

This paper contains 6 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: ReActXGB model architecture