BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network

Zongkai Zhang, Zidong Xu, Wenming Yang, Qingmin Liao, Jing-Hao Xue

TL;DR

The paper addresses the resource-intensive nature of 3D occupancy networks and the performance gap of binarized models. It introduces Binarized Deep Convolution (BDC) units, justified by a theoretical result that favors $1\times1$ binarized convolutions after an initial $3\times3$ stage and augmented with a per-channel weight branch (BDConv). The BDConv-based BDC-Occ decomposes the 3D occupancy network into four modules and binarizes each, achieving state-of-the-art results among binarized methods on Occ3D-nuScenes and approaching full-precision performance with substantial efficiency gains. The work demonstrates substantial improvements in mIoU and robust 3D object detection, highlighting the practical impact for edge devices in autonomous systems. A limitation is the lack of validation on Transformer-based architectures, suggesting future work to broaden applicability.

Abstract

Existing 3D occupancy networks demand significant hardware resources, hindering deployment on edge devices. Binarized Neural Networks (BNN) offer substantially reduced computational and memory requirements. However, their performance decreases notably compared to full-precision networks. Moreover, it is challenging to enhance the performance of binarized models by increasing the number of binarized convolutional layers, which limits their practicability for 3D occupancy prediction. To bridge these gaps, we propose a novel binarized deep convolution (BDC) unit that effectively enhances performance while increasing the number of binarized convolutional layers. Firstly, through theoretical analysis, we demonstrate that $1\times1$ binarized convolutions introduce minimal binarization errors. Therefore, additional binarized convolutional layers are constrained to $1\times1$ in the BDC unit. Secondly, we introduce the per-channel weight branch to mitigate the impact of binarization errors from unimportant channel features on the performance of binarized models, thereby improving performance while increasing the number of binarized convolutional layers. Furthermore, we decompose the 3D occupancy network into four convolutional modules and utilize the proposed BDC unit to binarize these modules. Our BDC-Occ model is created by applying the proposed BDC unit to binarize existing 3D occupancy networks. Comprehensive quantitative and qualitative experiments demonstrate that the proposed BDC-Occ is the state-of-the-art binarized 3D occupancy network algorithm.
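
As background for the claim that binarized networks reduce computational and memory requirements, the following is a minimal sketch of the standard binary dot product, not the paper's BDC unit. It assumes sign binarization to {-1, +1} and uses hypothetical helper names (binarize, pack_bits, binary_dot) chosen only for illustration. Once both operands are binarized, each value occupies a single bit, and a dot product reduces to an XOR followed by a bit count.

```python
def binarize(values):
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an int: bit i is 1 where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n.

    Matching positions contribute +1 and differing positions -1,
    so the result is n - 2 * popcount(a XOR b).
    """
    differing = bin(a_bits ^ b_bits).count("1")
    return n - 2 * differing


# Illustrative values only; they are not taken from the paper.
activations = [0.7, -1.2, 0.1, -0.4, 2.0, -0.3, 0.0, 1.5]
weights = [-0.5, -0.9, 0.3, 0.8, 1.1, -0.2, 0.6, -0.7]

a_sign = binarize(activations)
w_sign = binarize(weights)

# Reference: ordinary multiply-accumulate over the +/-1 values.
reference = sum(a * w for a, w in zip(a_sign, w_sign))
# Bitwise version: one XOR and one bit count over the packed words.
fast = binary_dot(pack_bits(a_sign), pack_bits(w_sign), len(a_sign))
```

Both computations give the same result. A $1\times1$ binarized convolution applies this kind of dot product across the channels at each spatial location, whereas a $3\times3$ kernel also sums over a spatial neighborhood.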

Paper Structure

This paper contains 23 sections, 27 equations, 8 figures, and 6 tables.

Figures (8)

  • Figure 1: Comparison between our BDC and state-of-the-art BNNs in the 3D occupancy prediction and 3D object detection tasks. For the 3D occupancy prediction task, Base means binarizing the BEV encoder and occupancy head, Tiny means further binarizing the image neck based on Base. For the 3D object detection task, all binarized models are in Tiny.
  • Figure 2: CNN-based 3D Occupancy Network
  • Figure 3: The illustration of the improvement process of our BDC.
  • Figure 4: The illustration of binarized convolution module based on BDC.
  • Figure 5: Ablation study of multi-layer binarized convolution (MulBiconv)
  • ...and 3 more figures