Structural Pruning via Spatial-aware Information Redundancy for Semantic Segmentation

Dongyue Wu, Zilin Guo, Li Yu, Nong Sang, Changxin Gao

TL;DR

Semantic segmentation faces high computational cost, limiting deployment on resource-constrained devices; existing pruning methods largely target classification and neglect spatial information crucial for localization. The authors propose SIRFP, which computes a spatially aware redundancy metric from unpooled high-resolution feature maps and frames pruning as a Maximum Edge Weight Clique Problem (MEWCP) on a channel graph, solved efficiently by a greedy heuristic (EHGP). Extensive experiments across Cityscapes, ADE20K, and COCO Stuff-10K show that SIRFP achieves superior parameter-mIoU trade-offs and generalizes to object detection and image classification, outperforming several state-of-the-art pruning methods. The approach enables more efficient, localization-preserving segmentation suitable for real-time and edge deployments, by directly accounting for spatial information during pruning and using a scalable optimization strategy.

Abstract

In recent years, semantic segmentation has flourished in various applications. However, the high computational cost remains a significant challenge that hinders its further adoption. The filter pruning method for structured network slimming offers a direct and effective solution for the reduction of segmentation networks. Nevertheless, we argue that most existing pruning methods, originally designed for image classification, overlook the fact that segmentation is a location-sensitive task, which consequently leads to their suboptimal performance when applied to segmentation networks. To address this issue, this paper proposes a novel approach, denoted as Spatial-aware Information Redundancy Filter Pruning (SIRFP), which aims to reduce feature redundancy between channels. First, we formulate the pruning process as a maximum edge weight clique problem (MEWCP) in graph theory, thereby minimizing the redundancy among the remaining features after pruning. Within this framework, we introduce a spatial-aware redundancy metric based on feature maps, thus endowing the pruning process with location sensitivity to better adapt to pruning segmentation networks. Additionally, based on the MEWCP, we propose a low computational complexity greedy strategy to solve this NP-hard problem, making it feasible and efficient for structured pruning. To validate the effectiveness of our method, we conducted extensive comparative experiments on various challenging datasets. The results demonstrate the superior performance of SIRFP for semantic segmentation tasks.
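
The pipeline described in the abstract (a redundancy score between channels computed on unpooled feature maps, then a greedy search for a low-redundancy channel subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the redundancy measure used here (absolute Pearson correlation over all spatial positions), the edge weight `1 - redundancy`, and the function names `spatial_redundancy` and `greedy_clique` are assumptions chosen for illustration, and the greedy rule is a generic heuristic rather than the paper's EHGP. Because the channel graph is complete, any set of k channels forms a clique, so the problem reduces to picking k channels with the largest total pairwise edge weight.

```python
from __future__ import annotations

import numpy as np


def spatial_redundancy(features: np.ndarray) -> np.ndarray:
    """Pairwise channel redundancy from unpooled feature maps.

    features has shape (N, C, H, W). Returns a (C, C) matrix in [0, 1].
    ASSUMPTION: redundancy is the absolute Pearson correlation taken over
    all N*H*W positions; this is a stand-in, not the paper's metric.
    """
    n, c, h, w = features.shape
    # One row per channel, keeping every spatial position (no pooling).
    flat = features.transpose(1, 0, 2, 3).reshape(c, n * h * w)
    flat = flat - flat.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat / np.maximum(norms, 1e-12)
    return np.abs(flat @ flat.T)


def greedy_clique(weights: np.ndarray, k: int) -> list[int]:
    """Greedily pick k vertices with a large total pairwise edge weight.

    ASSUMPTION: a generic greedy heuristic for the maximum edge weight
    clique problem on a complete graph, not the paper's EHGP.
    """
    c = weights.shape[0]
    if not 2 <= k <= c:
        raise ValueError("k must satisfy 2 <= k <= number of channels")
    # Seed with the heaviest edge, ignoring self-loops.
    masked = weights.astype(float)
    np.fill_diagonal(masked, -np.inf)
    i, j = np.unravel_index(np.argmax(masked), masked.shape)
    selected = [int(i), int(j)]
    # gain[v] is the total edge weight from v to the selected set.
    gain = (weights[i] + weights[j]).astype(float)
    gain[selected] = -np.inf
    while len(selected) < k:
        v = int(np.argmax(gain))
        selected.append(v)
        gain += weights[v]
        gain[v] = -np.inf  # already-selected vertices stay excluded
    return sorted(selected)


# Toy input: channels 3..5 are near-copies of channels 0..2.
rng = np.random.default_rng(0)
base = rng.standard_normal((4, 3, 8, 8))
feats = np.concatenate(
    [base, base + 0.01 * rng.standard_normal(base.shape)], axis=1
)
redundancy = spatial_redundancy(feats)
kept = greedy_clique(1.0 - redundancy, k=3)
print("channels kept:", kept)
```

In this toy setup the kept set should contain one channel from each near-duplicate pair, since keeping both members of a pair contributes almost no edge weight. Each greedy step costs O(C) after the initial O(C^2) search for the seed edge, which is the kind of saving that makes a heuristic attractive for an NP-hard selection problem.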

Paper Structure

This paper contains 28 sections, 6 equations, 3 figures, 10 tables, 2 algorithms.

Figures (3)

  • Figure 1: Parameters vs. mIoU on the Cityscapes dataset. The proposed Spatial-aware Information Redundancy Filter Pruning (SIRFP) method achieves an outstanding trade-off between parameter efficiency and accuracy. Pruning methods are connected with lines, while real-time methods are colored in grey. Best viewed in color.
  • Figure 2: Visualization comparison with DCFP on the Cityscapes validation set with 60% FLOPs reduction of DeepLabv3-ResNet50. Black in GT denotes the pixels that should be ignored for evaluation. We crop the images and use the yellow box in the original images to highlight the difference in predictions between DCFP and SIRFP.
  • Figure 3: Visualization comparison with DCFP on the ADE20K validation set with 60% FLOPs reduction of DeepLabv3-ResNet50. Black in GT denotes the pixels that should be ignored for evaluation.