MLMT-CNN for Object Detection and Segmentation in Multi-layer and Multi-spectral Images

Majedaldein Almahasneh; Adeline Paiement; Xianghua Xie; Jean Aboudarham

TL;DR

The paper addresses the localisation of solar Active Regions (ARs) in multi-layer, multi-spectral images. It introduces MLMT-CNN, a modular multi-layer, multi-task framework with per-band branches and inter-band fusion that performs detection and segmentation, with a separate set of results for each band. The framework supports Faster RCNN and U-Net backbones, and is trained with a Multi-Objective Optimisation regime and recursive weak-label training to cope with sparse annotations. The authors validate the approach on the LAD/UAD solar datasets as well as on BraTS-prime and Cloud-38 data. On the artificial data it reaches an average IoU of 0.72 for segmentation and an F1 score of 0.90 for detection, against 0.53 and 0.58 for the best baselines, and it also improves on single-band baselines and SPOCA for AR detection. The work indicates that joint analysis across bands, with an appropriate fusion strategy, yields more robust and coherent 3D localisation of ARs, with potential application to other multi-modal imaging tasks.
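To make the idea of a weak label concrete, the sketch below rasterises bounding boxes into a coarse binary mask that could serve as a starting segmentation target. It is only an illustration under stated assumptions: the function name `boxes_to_mask` and the half-open `(x0, y0, x1, y1)` box convention are hypothetical, and the recursive refinement used by the authors is not described on this page and is not reproduced here.

```python
def boxes_to_mask(boxes, height, width):
    """Rasterise bounding boxes into a coarse binary mask (a weak label).

    Assumption: each box is (x0, y0, x1, y1) in pixel coordinates,
    with x1 and y1 exclusive. Boxes are clipped to the image bounds.
    """
    mask = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = 1
    return mask


# One 2x2 box in a 3x4 image.
weak_label = boxes_to_mask([(1, 0, 3, 2)], height=3, width=4)
for row in weak_label:
    print(row)
```

Every pixel inside a box is marked as foreground, so such a mask over-estimates the true region; this is why a box-derived label is called weak.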

Abstract

Precisely localising solar Active Regions (AR) from multi-spectral images is a challenging but important task in understanding solar activity and its influence on space weather. A main challenge comes from each modality capturing a different location of the 3D objects, as opposed to typical multi-spectral imaging scenarios where all image bands observe the same scene. Thus, we refer to this special multi-spectral scenario as multi-layer. We present a multi-task deep learning framework that exploits the dependencies between image bands to produce 3D AR localisation (segmentation and detection) where different image bands (and physical locations) have their own set of results. Furthermore, to address the difficulty of producing dense AR annotations for training supervised machine learning (ML) algorithms, we adapt a training strategy based on weak labels (i.e. bounding boxes) in a recursive manner. We compare our detection and segmentation stages against baseline approaches for solar image analysis (multi-channel coronal hole detection, SPOCA for ARs) and state-of-the-art deep learning methods (Faster RCNN, U-Net). Additionally, both detection and segmentation stages are quantitatively validated on artificially created data of similar spatial configurations made from annotated multi-modal magnetic resonance images. On the artificial dataset, our framework achieves an average of 0.72 IoU (segmentation) and 0.90 F1 score (detection) across all modalities, compared with scores of 0.53 and 0.58, respectively, for the best-performing baseline methods. In the AR detection task, it achieves a 0.84 F1 score, compared with a baseline of 0.82. Our segmentation results are qualitatively validated by an expert on real ARs.
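For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how IoU and F1 are conventionally computed. The function names and the toy inputs are illustrative assumptions; the matching rule that decides which detections count as true positives in the paper is not given on this page and is not modelled here.

```python
def mask_iou(pred, truth):
    """Intersection over union of two equally sized binary masks
    given as nested lists of 0/1 values."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks agree perfectly.
    return inter / union if union else 1.0


def f1_score(tp, fp, fn):
    """F1 from counts of true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(mask_iou(pred, truth))         # 2 shared pixels out of 4 in the union
print(f1_score(tp=8, fp=2, fn=2))    # 16 / 20
```

IoU penalises both missed and spurious pixels, while F1 is the harmonic mean of precision and recall over detected objects, which is why the first is reported for segmentation and the second for detection.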

Paper Structure (22 sections, 2 equations, 7 figures, 7 tables)

This paper contains 22 sections, 2 equations, 7 figures, 7 tables.

Introduction
Methodology
MultiLayer-MultiTask (MLMT) framework
Backbone networks
Detection backbone: Faster RCNN
Segmentation backbone: U-Net
MLMT-CNN: Detection stage
MLMT-CNN: Segmentation stage
Experiments
Data
Labelled AR datasets
Weak-BraTS-prime
Weak-Cloud-38
Detection stage
Independent detection on single image bands
...and 7 more sections

Figures (7)

Figure 1: Ground-truth (red) and MLMT-CNN's (green) detection of ARs at three levels of solar activity (left to right: high, medium, low) in randomly selected images from (top to bottom) SOHO/MDI Magnetogram and PM/SH 3934 Å.
Figure 2: Ground-truth (red), MLMT-CNN (green), and SPOCA (white) detections of ARs at three levels of solar activity (left to right: high, medium, low) in randomly selected images from (top to bottom) SOHO/EIT 304 Å, 171 Å, 195 Å, and 284 Å.
Figure 3: MLMT for detection using the Faster RCNN backbone. The '$+$' sign denotes concatenation of the feature maps, or of the lists of region proposals.
Figure 4: MLMT for segmentation using the U-Net backbone. The '$+$' sign denotes fusion of the feature maps. Coloured boxes are the convolutional blocks of each branch (band). Green and red arrows denote max pooling and upsampling operations, respectively. Blue arrows are skip connections, applied to the appropriate channel of the joint feature map for each branch.
Figure 5: Comparison of the detection results over UAD (top) and BraTS-prime (bottom) datasets. Each group of bars represents an imaging modality. Different colors represent different methods.
...and 2 more figures
