Transferability of labels between multilens cameras

Ignacio de Loyola Páez-Ubieta, Daniel Frau-Alfaro, Santiago T. Puente

TL;DR

This work tackles automatic transfer of bounding box and mask labels across the multiple lenses of a multilens camera to enable labeling in multispectral data without task-specific training. It introduces a two-step method: first estimate a 2D transformation between lens images via phase correlation in the frequency domain, then refine this transform with a multi-scale search that maximizes the IoU between transformed and reference labels. On a MicaSense RedEdge-MX Dual system, the method achieves high transfer accuracy (≈$97\%$ for BB and ≈$94\%$ for masks) with processing times around $58$–$72\ \mathrm{ms}$, and demonstrates successful RGB label transfer by generating fake RGB views from the multispectral bands. This enables labeling objects that may be invisible in some spectra, broadening the practical use of multispectral cameras, with future work aimed at applying the approach to more cameras, leveraging all lenses, and integrating a true RGB channel to further reduce error.

Abstract

This work presents a new method for automatically extending Bounding Box (BB) and mask labels across the different channels of multilens cameras. The proposed method combines the well-known phase correlation method with a refinement process. In the first step, images are aligned by locating the intensity peak obtained in the spatial domain after performing the cross-correlation in the frequency domain. In the second step, the best possible transformation is obtained through an iterative process that maximizes the Intersection over Union (IoU) metric. Results show that, with this method, labels can be transferred across the different lenses of a camera with an accuracy above 90% in most cases, with the whole process taking about 65 ms. Once the transformations are obtained, artificial RGB images are generated and labeled so that their labels can be transferred to each of the other lenses. This work will allow users to apply this type of camera in fields beyond satellite or medical imagery, making it possible to label even objects that are invisible in the visible spectrum.
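
The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the transformation is restricted to a pure translation with integer-pixel phase correlation, labels are axis-aligned boxes given as `(x0, y0, x1, y1)`, and the coarse-to-fine step schedule `(4, 2, 1, 0.5)` is arbitrary. The function names `phase_correlation_shift`, `box_iou`, `shift_box` and `refine_shift` are hypothetical.

```python
import numpy as np


def phase_correlation_shift(ref, mov):
    """Return integer (dy, dx) such that mov is approximately np.roll(ref, (dy, dx))."""
    f_ref = np.fft.fft2(ref)
    f_mov = np.fft.fft2(mov)
    # Normalised cross-power spectrum: keep only the phase difference.
    cross = f_mov * np.conj(f_ref)
    cross /= np.abs(cross) + 1e-12
    # Back in the spatial domain, the peak location encodes the displacement.
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    dy, dx = (int(p) if p <= s // 2 else int(p) - s
              for p, s in zip(peak, corr.shape))
    return dy, dx


def box_iou(a, b):
    """Intersection over Union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def shift_box(box, dy, dx):
    """Translate a box by dy rows and dx columns."""
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)


def refine_shift(src_box, tgt_box, dy, dx, steps=(4.0, 2.0, 1.0, 0.5)):
    """Coarse-to-fine search around (dy, dx) that maximises the IoU between
    the shifted source box and a reference box in the target band."""
    best = (box_iou(shift_box(src_box, dy, dx), tgt_box), dy, dx)
    for step in steps:
        _, cy, cx = best  # centre of this scale's 3x3 neighbourhood
        for oy in (-step, 0.0, step):
            for ox in (-step, 0.0, step):
                score = box_iou(shift_box(src_box, cy + oy, cx + ox), tgt_box)
                if score > best[0]:
                    best = (score, cy + oy, cx + ox)
    return best  # (iou, dy, dx)
```

In this sketch the phase correlation result would be passed as the starting `(dy, dx)` of `refine_shift`, and the refinement needs one labeled reference box in the target band to score candidates against. Handling masks, sub-pixel peaks, or rotation and scale would require extending both functions.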

Paper Structure (9 sections, 10 equations, 6 figures, 4 tables)


Figures (6)

  • Figure 1: Hardware used during the experiments.
  • Figure 2: BB labeled experiment: (a) reference image (band 5) to start with, (b, c, d, e) transformed labels (bands 1-4) and (f, g, h, i) ground truth labels for comparison purposes (bands 1-4).
  • Figure 3: Mask labeled experiment: (a) reference image (band 5) to start with, (b, c, d, e) transformed labels (bands 1-4) and (f, g, h, i) ground truth labels for comparison purposes (bands 1-4).
  • Figure 4: Combination of 3 channels to create fake RGB image: (a,b,c) bands 1, 2, 3 respectively, (d) Generated fake RGB image.
  • Figure 5: BB labeled fake RGB image and transferred labels: (a) Fake RGB image with labels, (b) labels transferred to band 1, (c) labels transferred to band 2, (d) labels transferred to band 3, (e) labels transferred to band 4 and (f) labels transferred to band 5.
  • ...and 1 more figures