Superpixel Integrated Grids for Fast Image Segmentation

Jack Roberts, Jeova Farias Sales Rocha Neto

TL;DR

SIGRID introduces a novel input primitive for image segmentation that replaces full-resolution images with a regular, compact grid of superpixel descriptors. By mapping each superpixel to a grid cell and storing color and shape features (notably average color and Hu moments) in a tensor $S \in \mathbb{R}^{d \times w' \times h'}$, the approach preserves meaningful boundaries while enabling standard CNNs to operate efficiently. Empirical results on four benchmarks show that SIGRID matches or surpasses pixel-level baselines with substantial speedups (up to 4× faster training and up to 6× fewer GFLOPs), demonstrating a favorable accuracy-efficiency balance. The work suggests practical avenues for acceleration via sparse convolutions and extension to multi-region segmentation, highlighting SIGRID as a viable path toward scalable segmentation pipelines.
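
To make the data structure concrete, the following is a minimal sketch of how such a grid tensor could be filled, not the authors' implementation. It assumes the superpixel label map is already computed, stores only the mean color of each superpixel (so $d$ equals the number of color channels; shape descriptors such as Hu moments would be appended as extra channels), and uses a greedy nearest-free-cell assignment that is purely illustrative. The names `build_sigrid` and `cell_to_sp` are hypothetical, and the array is laid out as (channels, rows, columns).

```python
import numpy as np

def build_sigrid(image, labels, grid_h, grid_w):
    """Pack per-superpixel mean colors into a (C, grid_h, grid_w) tensor.

    image:  (H, W, C) float array
    labels: (H, W) int array of superpixel ids in [0, K)
    Returns the tensor S and a (grid_h, grid_w) map holding the superpixel
    id stored in each cell (-1 marks an empty cell).
    """
    H, W, C = image.shape
    K = int(labels.max()) + 1
    assert K <= grid_h * grid_w, "need at least one grid cell per superpixel"

    S = np.zeros((C, grid_h, grid_w), dtype=np.float64)
    cell_to_sp = -np.ones((grid_h, grid_w), dtype=np.int64)

    # (row, col) index of every grid cell, used to search for a free cell.
    rows, cols = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    rows, cols = rows.ravel(), cols.ravel()

    for k in range(K):
        ys, xs = np.nonzero(labels == k)
        if ys.size == 0:
            continue  # unused label id
        # Superpixel centroid expressed in grid coordinates.
        gy = (ys.mean() + 0.5) / H * grid_h
        gx = (xs.mean() + 0.5) / W * grid_w
        # Assumption: greedy assignment to the nearest cell that is still free.
        d2 = (rows + 0.5 - gy) ** 2 + (cols + 0.5 - gx) ** 2
        d2[cell_to_sp.ravel() >= 0] = np.inf
        best = int(np.argmin(d2))
        r, c = rows[best], cols[best]
        cell_to_sp[r, c] = k
        S[:, r, c] = image[ys, xs].mean(axis=0)  # average color descriptor
    return S, cell_to_sp
```

A standard CNN can then consume `S` like a small image, while `cell_to_sp` records which superpixel each cell stands for.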

Abstract

Superpixels have long been used in image simplification to enable more efficient data processing and storage. However, despite their computational potential, their irregular spatial distribution has often forced deep learning approaches to rely on specialized training algorithms and architectures, undermining the original motivation for superpixelations. In this work, we introduce a new superpixel-based data structure, SIGRID (Superpixel-Integrated Grid), as an alternative to full-resolution images in segmentation tasks. By leveraging classical shape descriptors, SIGRID encodes both color and shape information of superpixels while substantially reducing input dimensionality. We evaluate SIGRIDs on four benchmark datasets using two popular convolutional segmentation architectures. Our results show that, despite compressing the original data, SIGRIDs not only match but in some cases surpass the performance of pixel-level representations, all while significantly accelerating model training. This demonstrates that SIGRIDs achieve a favorable balance between accuracy and computational efficiency.

Paper Structure

This paper contains 12 sections, 2 figures, 4 tables.

Figures (2)

  • Figure 1: SIGRID construction and model inference for binary segmentation. (a) Original image $I$. (b) Superpixelation $S$ ($K = 6$ here) computed from $I$ and superpixel centers (red dots). (c) A $w' \times h'$ grid superimposed on $S$ where superpixels are assigned to grid cells. (d) Assigned grid cells are then populated with superpixel descriptors. (e) The model classifies each assigned grid cell into one of two classes. (f) Cell classes are converted back to pixel-level classification using the original superpixels.
  • Figure 2: Architectural details of the CNNs in our experiments.
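
Step (f) of Figure 1, converting cell classes back to a pixel-level mask, can be sketched as a pair of lookups. This is an illustrative sketch under stated assumptions rather than the paper's code: it assumes a hypothetical `cell_to_sp` map that records which superpixel id each grid cell holds (with -1 for unassigned cells), and it gives any superpixel without a cell a default class `fill`.

```python
import numpy as np

def cells_to_pixels(cell_classes, cell_to_sp, labels, fill=0):
    """Map per-cell class predictions back to a pixel-level mask.

    cell_classes: (h', w') int array, predicted class of each grid cell
    cell_to_sp:   (h', w') int array, superpixel id held by each cell (-1 = empty)
    labels:       (H, W) int array of superpixel ids in [0, K)
    """
    K = int(labels.max()) + 1
    # Class of each superpixel; superpixels without a cell keep `fill`.
    sp_class = np.full(K, fill, dtype=cell_classes.dtype)
    assigned = cell_to_sp >= 0
    sp_class[cell_to_sp[assigned]] = cell_classes[assigned]
    # Every pixel inherits the class of the superpixel it belongs to.
    return sp_class[labels]
```

Because each pixel inherits the class of its superpixel, the output mask follows the superpixel boundaries rather than the coarse grid.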