
LEMMA: Laplacian pyramids for Efficient Marine SeMAntic Segmentation

Ishaan Gakhar, Laven Srivastava, Sankarshanaa Sagaram, Aditya Kasliwal, Ujjwal Verma

Abstract

Semantic segmentation in marine environments is crucial for the autonomous navigation of unmanned surface vessels (USVs) and for Earth observation of coastal events such as oil spills. However, existing methods, which often rely on deep CNNs and transformer-based architectures, are difficult to deploy because of their high computational cost and resource-intensive nature. These limitations hinder real-time, low-cost applications in real-world marine settings. To address this, we propose LEMMA, a lightweight semantic segmentation model designed specifically for accurate remote sensing segmentation under resource constraints. The proposed architecture leverages Laplacian pyramids to enhance edge recognition, a critical component of effective feature extraction in complex marine environments for disaster response, environmental surveillance, and coastal monitoring. By integrating edge information early in the feature extraction process, LEMMA eliminates the need for expensive feature-map computations in deeper network layers, drastically reducing model size, complexity, and inference time. LEMMA demonstrates state-of-the-art performance across datasets captured from diverse platforms while reducing trainable parameters by up to 71x, GFLOPs by up to 88.5%, and inference time by up to 84.65% compared to existing models. Experimental results highlight its effectiveness and real-world applicability, including 93.42% IoU on the Oil Spill dataset and 98.97% mIoU on Mastr1325.

Paper Structure

This paper contains 12 sections, 3 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: Visualization of the Laplacian pyramid of depth 2 of an RGB image. Vital spatial and structural information is present at each level, notably in the edges available at different resolutions. Here, $L_1$, $L_2$ and Residual ($L_3$) refer to the first, second and last layer of the decomposed Laplacian Pyramid.
  • Figure 2: A schematic overview of the proposed model, LEMMA. The sections highlighted in green, blue, and yellow represent the LFB, MFB, and HFB, respectively. $L_1$, $L_2$, and Residual ($L_3$) represent the three layers of the decomposed Laplacian pyramid. The 'nc' for each dataset is the total number of classes in that dataset. Each residual block chain has the corresponding number of blocks, as explained in the Method section.
  • Figure 3: Visualization of the qualitative outputs on the Mastr1325 Dataset.
  • Figure 4: Visualization of the qualitative outputs on the Oil Spill Drone Dataset.
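
The depth-2 decomposition described in the caption of Figure 1 can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 2x2 box filter in place of the usual Gaussian blur, nearest-neighbour upsampling, and input dimensions divisible by 2^depth, and the function names (`downsample`, `upsample`, `laplacian_pyramid`, `reconstruct`) are hypothetical. It returns the band-pass layers $L_1$, $L_2$ followed by the low-resolution Residual ($L_3$).

```python
import numpy as np


def downsample(img):
    """Halve height and width by 2x2 average pooling.

    Assumption: a box filter stands in for the Gaussian blur that a
    classical Laplacian pyramid would apply before subsampling.
    """
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))


def upsample(img):
    """Double height and width by nearest-neighbour repetition."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)


def laplacian_pyramid(img, depth=2):
    """Return [L_1, ..., L_depth, residual] for an H x W x C image.

    Each L_i holds the detail (mostly edges) lost when going from one
    resolution to the next coarser one; the residual is the coarsest image.
    Assumes H and W are divisible by 2**depth.
    """
    current = img.astype(np.float64)
    levels = []
    for _ in range(depth):
        smaller = downsample(current)
        levels.append(current - upsample(smaller))  # band-pass layer
        current = smaller
    levels.append(current)  # low-frequency residual
    return levels


def reconstruct(levels):
    """Invert the decomposition by upsampling and adding back each layer."""
    current = levels[-1]
    for band in reversed(levels[:-1]):
        current = upsample(current) + band
    return current


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((8, 8, 3))  # stand-in for an RGB image
    l1, l2, residual = laplacian_pyramid(image, depth=2)
    print(l1.shape, l2.shape, residual.shape)
    print(np.allclose(reconstruct([l1, l2, residual]), image))
```

Because each layer stores exactly what the next coarser level cannot represent, the decomposition is invertible, which is why edge detail at every resolution remains available to a network that consumes the layers separately.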