An AI-Powered Autonomous Underwater System for Sea Exploration and Scientific Research

Hamad Almazrouei, Mariam Al Nasseri, Maha Alzaabi

TL;DR

The paper addresses the difficulty of ocean exploration by presenting an AI-powered autonomous underwater vehicle (AUV) system that automates the detection, clustering, and reporting of underwater objects. It combines YOLOv12 Nano for real-time detection, ResNet50 for feature extraction, PCA for dimensionality reduction, K-Means++ for clustering, and the GPT-4o Mini LLM for structured reporting, and is evaluated on the merged DeepFish and OzFish datasets. The detector achieves an mAP@0.5 of 0.512, a precision of 0.535, and a recall of 0.438. PCA preserves 98% of the variance with 199 components, and clustering forms 27 clusters from 687 crops, with the LLM-generated summaries enriched by location data. The work demonstrates a safer, faster, and more scalable pipeline for automated analysis and reporting in challenging marine environments.

Abstract

Traditional sea exploration faces significant challenges due to extreme conditions, limited visibility, and high costs, leaving vast ocean regions unexplored. This paper presents an AI-powered Autonomous Underwater Vehicle (AUV) system designed to overcome these limitations by automating underwater object detection, analysis, and reporting. The system integrates YOLOv12 Nano for real-time object detection, a Convolutional Neural Network (CNN), ResNet50, for feature extraction, Principal Component Analysis (PCA) for dimensionality reduction, and K-Means++ clustering for grouping marine objects by visual characteristics. In addition, a Large Language Model (LLM), GPT-4o Mini, generates structured reports and summaries of underwater findings to aid data interpretation. The system was trained and evaluated on a combined dataset of over 55,000 images from the DeepFish and OzFish datasets, which capture diverse Australian marine environments. Experimental results show that the system detects marine objects with an mAP@0.5 of 0.512, a precision of 0.535, and a recall of 0.438. PCA reduced feature dimensionality while preserving 98% of the variance, facilitating K-Means clustering, which grouped detected objects by visual similarity. The LLM integration proved effective in generating summaries of detections and clusters, supported by location data. This integrated approach reduces the risks associated with human diving, increases mission efficiency, and enhances the speed and depth of underwater data analysis, paving the way for more effective scientific research and discovery in challenging marine environments.
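
To make the two unsupervised steps named in the abstract more concrete, the following is a minimal, standard-library sketch of (a) choosing the smallest number of principal components that retains a target share of variance and (b) K-Means++ seeding. It is an illustration under stated assumptions, not the authors' implementation: the function names `components_for_variance` and `kmeans_pp_seeds` are hypothetical, the toy variances and points are made up, and the paper does not say which library was used. In practice, scikit-learn's `PCA(n_components=0.98)` and `KMeans(init="k-means++")` offer equivalent behaviour.

```python
import random
from itertools import accumulate


def components_for_variance(variances, target=0.98):
    """Return the smallest k whose top-k variances explain >= target of the total.

    `variances` stands in for the per-component explained variances
    (eigenvalues of the feature covariance matrix); hypothetical helper.
    """
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total >= target:
            return k
    return len(ordered)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """K-Means++ seeding: the first centre is drawn uniformly at random;
    each later centre is drawn with probability proportional to the squared
    distance from a point to its nearest already-chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        weights = [min(squared_distance(p, c) for c in centres) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a chosen centre; fall back to uniform.
            centres.append(rng.choice(points))
        else:
            centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres


if __name__ == "__main__":
    # Toy variances: cumulative shares are 0.50, 0.80, 0.95, 0.99, 1.00.
    print(components_for_variance([5.0, 3.0, 1.5, 0.4, 0.1]))  # prints 4

    # Toy 2-D "features"; real inputs would be PCA-reduced ResNet50 vectors.
    toy_points = [(0.0, 0.0), (0.1, 0.2), (9.8, 10.1), (10.0, 10.0)]
    print(kmeans_pp_seeds(toy_points, k=2, rng=random.Random(0)))
```

The distance-weighted draw is what separates K-Means++ from plain random initialisation: points far from every existing centre are more likely to be chosen next, which tends to spread the initial centres across visually distinct groups before the usual K-Means iterations begin.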

Paper Structure

This paper contains 32 sections, 7 figures, 5 tables.

Figures (7)

  • Figure 1: System Architecture and Data Pipeline Overview: A detailed overview of the proposed architecture and pipeline, showing the system modules and the data flow between them.
  • Figure 2: Image Samples From Both DeepFish and OzFish Datasets
  • Figure 3: Dataset Building Pipeline
  • Figure 4: Distribution of Dataset Instances and Bounding Box Properties
  • Figure 5: Visualization of Test Crop Assignments within PCA-Reduced Feature Space Clusters
  • ...and 2 more figures