Supervised Contrastive Machine Unlearning of Background Bias in Sonar Image Classification with Fine-Grained Explainable AI

Kamal Basha S, Athira Nambiar

TL;DR

The paper tackles background bias in sonar image classification, where models latch onto seafloor cues instead of the target object. It introduces Targeted Contrastive Unlearning (TCU) to explicitly forget seafloor features, and the Unlearn-to-Explain Sonar Framework (UESF) to quantify and visualize that forgetting with differential LIME explanations. Experiments on real and synthetic sonar data show that accuracy is maintained while target recall improves and reliance on the background drops, and the explanations make the remaining bias visible. By coupling targeted unlearning with bias-focused explanations, the work aims at more robust and explainable sonar AI for underwater object detection.

Abstract

Acoustic sonar image analysis plays a critical role in object detection and classification, with applications in both civilian and defense domains. Despite the availability of real and synthetic datasets, existing AI models that achieve high accuracy often over-rely on seafloor features, leading to poor generalization. To mitigate this issue, we propose a novel framework that integrates two key modules: (i) a Targeted Contrastive Unlearning (TCU) module, which extends the traditional triplet loss to reduce seafloor-induced background bias and improve generalization, and (ii) the Unlearn to Explain Sonar Framework (UESF), which provides visual insights into what the model has deliberately forgotten while adapting the LIME explainer to generate more faithful and localized attributions for unlearning evaluation. Extensive experiments across both real and synthetic sonar datasets validate our approach, demonstrating significant improvements in unlearning effectiveness, model robustness, and interpretability.
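The abstract states that TCU extends the traditional triplet loss but does not give the extended form here. As a point of reference only, the sketch below shows the standard triplet margin loss, $\max(0,\ d(a,p) - d(a,n) + m)$, under the illustrative assumption that the anchor and positive are embeddings of the same object class and the negative is a seafloor embedding. The function names, the toy 2-D embeddings, and the margin value are hypothetical and are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D embeddings (assumed values, for illustration only).
anchor = [1.0, 0.0]          # e.g. an object-class sample
positive = [1.0, 0.5]        # another sample of the same class
seafloor_near = [1.0, 1.0]   # seafloor embedding still close to the object
seafloor_far = [-3.0, 0.0]   # seafloor embedding pushed far away

loss_near = triplet_loss(anchor, positive, seafloor_near)
loss_far = triplet_loss(anchor, positive, seafloor_far)
print(loss_near, loss_far)
```

With these values the nearby seafloor embedding is penalized, while the distant one incurs no loss; this is the general mechanism by which a triplet-style objective can push background embeddings away from object embeddings.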

Paper Structure

This paper contains 16 sections, 3 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Proposed architecture integrating contrastive machine unlearning with explainability. The pipeline includes a baseline classifier $f(x)$, contrastive unlearning model $f_u(x)$, and the machine explainer module to visualize and compare saliency maps, ensuring the model effectively forgets background (seafloor) cues while retaining object-specific features.
  • Figure 2: Sample of real sonar images used in this study.
  • Figure 3: Sample of synthetic sonar images from the S3Simulator+ dataset.
  • Figure 4: Confusion matrix showing class-wise prediction accuracy before and after applying the unlearning framework.
  • Figure 5: t-SNE visualizations of feature embeddings (PCA followed by t-SNE). (a) Baseline model: seabed samples (purple) are entangled with ship (red), mine (orange), and plane (green), indicating strong seabed bias and reduced class separability. (b) Unlearned model: seabed forms a distinct, isolated cluster, clearly separated from human (blue), mine (orange), ship (red), and plane (green), showing that unlearning mitigates seabed bias and improves class-specific representations (best viewed in colour).
  • ...and 1 more figures
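The Figure 1 caption describes comparing saliency maps from the baseline classifier $f(x)$ and the unlearned model $f_u(x)$ to check that background cues are forgotten. One simple way to make such a comparison concrete is sketched below: it measures the share of positive attribution that falls on background pixels for each model and takes a per-pixel difference. This is a minimal sketch under stated assumptions, not the paper's UESF metric; the background mask, the toy 3x3 attribution maps, and both function names are hypothetical.

```python
def background_share(attribution, background_mask):
    """Fraction of positive attribution mass that lies on background pixels."""
    total = 0.0
    on_background = 0.0
    for attr_row, mask_row in zip(attribution, background_mask):
        for value, is_background in zip(attr_row, mask_row):
            weight = max(value, 0.0)  # ignore negative attributions
            total += weight
            if is_background:
                on_background += weight
    return on_background / total if total > 0 else 0.0


def differential_map(before, after):
    """Per-pixel difference (before - after); positive values mark evidence the model stopped using."""
    return [[b - a for b, a in zip(row_b, row_a)] for row_b, row_a in zip(before, after)]


# Assumed mask: 1 = seafloor/background, 0 = object (centre pixel).
mask = [[1, 1, 1],
        [1, 0, 1],
        [1, 1, 1]]

# Toy attribution maps standing in for explanations of f(x) and f_u(x).
baseline_attr = [[0.2, 0.1, 0.2],
                 [0.1, 0.2, 0.1],
                 [0.0, 0.1, 0.0]]
unlearned_attr = [[0.0, 0.0, 0.05],
                  [0.0, 0.9, 0.0],
                  [0.05, 0.0, 0.0]]

share_before = background_share(baseline_attr, mask)
share_after = background_share(unlearned_attr, mask)
diff = differential_map(baseline_attr, unlearned_attr)
print(share_before, share_after)
```

In this toy case the background share drops after unlearning, and the differential map is positive on most seafloor pixels and negative on the object pixel, which is the pattern one would hope to see if the model has shifted its evidence from background to object.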