Cross-Domain Knowledge Transfer for Underwater Acoustic Classification Using Pre-trained Models
Amirmohammad Mohammadi, Tejashri Kelhe, Davelle Carreiro, Alexandra Van Dine, Joshua Peeples
TL;DR
The paper investigates cross-domain transfer learning for underwater acoustic classification by comparing AudioSet-pretrained PANNs with ImageNet-pretrained TIMMs on the DeepShip passive sonar dataset. Using a consistent preprocessing and augmentation pipeline, it shows that ImageNet-pretrained TIMMs can slightly surpass audio-pretrained models across the sampling rates tested, and that data resolution interacts with pre-training to influence performance. Through extensive experiments and interpretability analyses (Grad-CAM), the work highlights the practical potential of cross-domain transfer learning to address data scarcity in underwater acoustic target recognition (UATR) and suggests directions for incorporating self-supervised and multi-modal approaches. The findings have implications for deploying efficient, robust underwater classifiers in real-world maritime applications where labeled data are limited.
Abstract
Transfer learning is commonly employed to leverage large pre-trained models and fine-tune them for downstream tasks. The most prevalent pre-trained models are trained on ImageNet. However, their ability to generalize can vary across data modalities. This study compares Pre-trained Audio Neural Networks (PANNs) and ImageNet pre-trained models in the context of underwater acoustic target recognition (UATR). We observed that the ImageNet pre-trained models slightly outperform the pre-trained audio models in passive sonar classification. We also analyzed the impact of audio sampling rate on model pre-training and fine-tuning. This study contributes to transfer learning applications in UATR, illustrating the potential of pre-trained models to address limitations caused by scarce labeled data in the UATR domain.
