Knowledge Distillation for Feature Extraction in Underwater VSLAM

Jinghe Yang, Mingming Gong, Girish Nair, Jung Hoon Lee, Jason Monty, Ye Pu

TL;DR

A cross-modal knowledge distillation framework for training an underwater feature detection and matching network (UFEN). It uses in-air RGBD data to generate synthetic underwater images from a physical underwater image formation model, then uses those images as the medium to distil knowledge from a teacher model, SuperPoint, pretrained on in-air images.

Abstract

In recent years, learning-based feature detection and matching have outperformed manually designed methods in in-air cases. However, learning features in the underwater scenario is challenging because annotated underwater datasets are lacking. This paper proposes a cross-modal knowledge distillation framework for training an underwater feature detection and matching network (UFEN). In particular, we use in-air RGBD data to generate synthetic underwater images based on a physical underwater image formation model, and employ these images as the medium to distil knowledge from a teacher model, SuperPoint, pretrained on in-air images. We embed UFEN into the ORB-SLAM3 framework to replace the ORB feature by introducing an additional binarization layer. To test the effectiveness of our method, we built a new underwater dataset with ground-truth measurements named EASI (https://github.com/Jinghe-mel/UFEN-SLAM), recorded in an indoor water tank at different turbidity levels. The experimental results on an existing dataset and on our new dataset demonstrate the effectiveness of our method.
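To make the synthetic-image step concrete, the following is a minimal sketch of how an in-air RGBD pair could be turned into an underwater-looking image. It assumes the commonly used simplified formation model (direct signal attenuated with range, plus backscatter that saturates towards a veiling light), which may differ from the exact model in the paper. The function name `synthesize_underwater` and the `beta` and `background` values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np


def synthesize_underwater(rgb, depth, beta, background):
    """Render a synthetic underwater image from an in-air RGBD pair.

    Assumed simplified model (not necessarily the paper's exact one):
        I_c = J_c * exp(-beta_c * d) + B_c * (1 - exp(-beta_c * d))

    rgb        : H x W x 3 in-air image, floats in [0, 1]
    depth      : H x W range map in metres
    beta       : per-channel attenuation coefficients, shape (3,)
    background : per-channel veiling light, shape (3,)
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    depth = np.asarray(depth, dtype=np.float64)[..., None]  # H x W x 1
    beta = np.asarray(beta, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)

    # Fraction of scene light that survives the path to the camera.
    transmission = np.exp(-beta * depth)  # broadcasts to H x W x 3

    direct = rgb * transmission
    backscatter = background * (1.0 - transmission)
    return np.clip(direct + backscatter, 0.0, 1.0)


# Hypothetical coefficients: red attenuates fastest, as in typical ocean water.
example_beta = [0.40, 0.10, 0.05]
example_background = [0.10, 0.30, 0.40]

in_air = np.full((4, 4, 3), 0.5)   # flat grey test image
ranges = np.full((4, 4), 2.0)      # every pixel 2 m away
underwater = synthesize_underwater(in_air, ranges, example_beta, example_background)
```

Because the teacher sees the clean in-air image while the student sees the degraded rendering of the same scene, the two outputs stay pixel-aligned, which is what allows the teacher's detections and descriptors to serve as training targets.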

Paper Structure (18 sections, 16 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: The SuperPoint network architecture, which consists of a shared encoder and two decoders.
  • Figure 2: Overall training framework of UFEN. Training takes paired in-air and depth images as inputs. Once trained, UFEN is used to generate the UFEN Bag-of-Words vocabulary from the synthetic underwater images.
  • Figure 3: The parameter values $\beta_{\lambda}$ and $K_d(\lambda)$ in Open Ocean Water Types from Type I to Type III.
  • Figure 4: UFEN-SLAM system framework
  • Figure 5: Feature detection evaluation on the TURBID dataset. Lines with circle markers show the results of UFEN; the other lines show the results of SuperPoint.
  • ...and 1 more figure