ExtremeMETA: High-speed Lightweight Image Segmentation Model by Remodeling Multi-channel Metamaterial Imagers

Quan Liu, Brandon T. Swartz, Ivan Kravchenko, Jason G. Valentine, Yuankai Huo

TL;DR

ExtremeMETA tackles the energy and latency challenges of deep learning for segmentation by integrating a large-kernel digital design with a metamaterial optic front-end. It extends the receptive field via a multi-path large-kernel first layer and uses model compression with sparse convolution to reduce digital FLOPs while preserving accuracy. Across EG1800, Stanford Car, and KITTI, ExtremeMETA improves mIoU from 92.45% to 95.97% and reduces FLOPs from 461.07 MMacs to 166.03 MMacs, demonstrating the viability of hybrid optic-digital segmentation. The work suggests broad applicability of large-kernel hybrid architectures for edge AI and points to extensions in detection and video tasks.
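To put the headline numbers in relative terms, the short calculation below derives the fraction of digital multiply-accumulates removed and the mIoU gain. Only the four constants come from the summary above; the helper name `relative_reduction` and the variable names are illustrative, not from the paper.

```python
# The four constants are the figures quoted in the TL;DR.
# Everything derived from them is plain arithmetic, not a result reported by the authors.
BASELINE_MMACS = 461.07
EXTREMEMETA_MMACS = 166.03
BASELINE_MIOU = 92.45
EXTREMEMETA_MIOU = 95.97


def relative_reduction(before: float, after: float) -> float:
    """Return the fraction of the original cost that was removed."""
    return (before - after) / before


flops_reduction = relative_reduction(BASELINE_MMACS, EXTREMEMETA_MMACS)
cost_ratio = BASELINE_MMACS / EXTREMEMETA_MMACS
miou_gain = EXTREMEMETA_MIOU - BASELINE_MIOU

print(f"digital MACs removed: {flops_reduction:.1%}")
print(f"baseline / ExtremeMETA cost ratio: {cost_ratio:.2f}x")
print(f"mIoU gain: {miou_gain:.2f} points")
```

Under these figures the digital part does roughly 64% fewer multiply-accumulates (about a 2.78x lower cost) while mIoU rises by 3.52 points.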

Abstract

Deep neural networks (DNNs) have relied heavily on traditional computational units such as CPUs and GPUs. However, this conventional approach brings significant computational burden, latency, and high power consumption, which limit its effectiveness. This has sparked the need for lightweight networks such as ExtremeC3Net. At the same time, there have been notable advances in optical computational units, particularly metamaterials, which offer the prospect of energy-efficient neural networks operating at the speed of light. Yet the digital design of metamaterial neural networks (MNNs) faces challenges such as precision, noise, and bandwidth, which limit their application to intuitive tasks and low-resolution images. In this paper, we propose a large-kernel lightweight segmentation model, ExtremeMETA. Building on ExtremeC3Net, ExtremeMETA maximizes the capacity of the first convolution layer by exploring a larger convolution kernel and multiple processing paths. With the proposed large-kernel convolution model, we extend the application boundary of optical neural networks to the segmentation task. To further lighten the computational burden of the digital processing part, a set of model compression methods is applied to improve model efficiency at inference time. Experimental results on three publicly available datasets demonstrate that the optimized design improves segmentation performance from 92.45 to 95.97 mIoU while reducing computation from 461.07 MMacs to 166.03 MMacs. The proposed large-kernel lightweight model ExtremeMETA showcases the ability of the hybrid design on complex tasks.
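The abstract reports segmentation quality as mIoU (mean intersection over union). As a reminder of what that metric measures, here is a minimal pure-Python sketch. It assumes integer class labels per pixel and averages over the classes that appear in either map; whether the paper averages over the same set of classes is not stated here, and the function name `mean_iou` is illustrative.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int,
) -> float:
    """Mean IoU over classes present in the prediction or the target."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts toward both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # A mismatch enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    if not ious:
        raise ValueError("no labelled pixels to score")
    return sum(ious) / len(ious)


# Toy 2x3 masks (made-up data): class 0 = background, class 1 = foreground.
toy_pred = [[0, 0, 1], [0, 1, 1]]
toy_target = [[0, 1, 1], [0, 1, 1]]
toy_score = mean_iou(toy_pred, toy_target, num_classes=2)
print(f"toy mIoU: {toy_score:.4f}")
```

In the toy example the background IoU is 2/3 and the foreground IoU is 3/4, so the mean is 17/24, about 0.708. The paper's values (92.45 and 95.97) are the same kind of quantity expressed as percentages.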

Paper Structure (21 sections, 2 equations, 5 figures, 4 tables)

Figures (5)

  • Figure 1: This study provides a hybrid pipeline for designing and optimizing a large-kernel digital neural network. The proposed ExtremeMETA is efficient for segmentation tasks, requiring fewer FLOPs.
  • Figure 2: Lightweight segmentation model with hybrid meta optics design. The model has two parts: CoarseNet and FineNet. The large kernel block is composed of depthwise convolution layers.
  • Figure 3: Model compression on segmentation model digital processing part. The left panel shows the multipath structure of the advanced C3 block. The right panel shows the compression mechanism.
  • Figure 4: Model ablation study. Left panel: trade-off between input image size and channel number of convolution layer. Right panel: model efficiency visualization comparing model FLOPs and mIoU.
  • Figure 5: Model compression performance. Left panel: parameter comparison of the original model, ExtremeMETA, and the compressed model. Right panel: model performance after compression.
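
The Figure 2 caption notes that the large kernel block is composed of depthwise convolution layers. One rough way to see why depthwise layers pair well with large kernels is to count multiply-accumulates (MACs) with the standard formula for a grouped convolution. The feature-map size, channel count, and kernel size below are hypothetical values chosen for illustration, not the paper's configuration, and `conv_macs` is an illustrative helper.

```python
def conv_macs(
    out_h: int,
    out_w: int,
    in_ch: int,
    out_ch: int,
    kernel: int,
    groups: int = 1,
) -> int:
    """MACs of a 2D convolution with a square kernel (bias ignored)."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output value sums over (in_ch / groups) * kernel * kernel products.
    return out_h * out_w * out_ch * (in_ch // groups) * kernel * kernel


# Hypothetical layer shape, assumed only for this comparison.
OUT_H, OUT_W, CHANNELS, KERNEL = 224, 224, 12, 7

standard_macs = conv_macs(OUT_H, OUT_W, CHANNELS, CHANNELS, KERNEL)
# Depthwise convolution: one group per channel.
depthwise_macs = conv_macs(OUT_H, OUT_W, CHANNELS, CHANNELS, KERNEL, groups=CHANNELS)

print(f"standard 7x7 conv:  {standard_macs / 1e6:.2f} MMacs")
print(f"depthwise 7x7 conv: {depthwise_macs / 1e6:.2f} MMacs")
print(f"ratio: {standard_macs // depthwise_macs}x")
```

For equal input and output channel counts, the depthwise variant is cheaper by a factor equal to the number of channels, which is what makes a larger kernel affordable in this kind of block.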