MCSDNet: Mesoscale Convective System Detection Network via Multi-scale Spatiotemporal Information

Jiajun Liang; Baoquan Zhang; Yunming Ye; Xutao Li; Chuyao Luo; Xukai Fu

TL;DR

This work addresses the challenge of detecting Mesoscale Convective Systems (MCS) in remote sensing image sequences by introducing MCSDNet, an encoder–decoder network that fuses multi-scale spatial features with temporal evolution using a Spatiotemporal Mix Unit and a Dual Spatiotemporal Attention module. It emphasizes multi-frame, multi-scale information to capture MCS life-cycle dynamics and presents MCSRSI, the first large-scale, publicly available multi-frame MCS dataset based on FY-4A visible-channel imagery. The approach demonstrates state-of-the-art performance on MCS detection compared with RSI analysis, semantic segmentation, and video understanding baselines, and ablations validate the contribution of multi-scale context, temporal/spatial attention, and the STMU architecture. The combination of an expandable STMU design and an open dataset provides a practical, scalable framework for robust MCS detection under diverse conditions, with potential to improve weather monitoring and forecasting pipelines.

Abstract

The accurate detection of Mesoscale Convective Systems (MCS) is crucial for meteorological monitoring because of their potential to cause significant destruction through severe weather phenomena such as hail, thunderstorms, and heavy rainfall. However, existing methods for MCS detection mostly target single-frame detection, which considers only static characteristics and ignores the temporal evolution over the life cycle of an MCS. In this paper, we propose a novel encoder-decoder neural network for MCS detection (MCSDNet). MCSDNet has a simple architecture and is easy to extend. Unlike previous models, MCSDNet targets multi-frame detection and leverages multi-scale spatiotemporal information to detect MCS regions in remote sensing imagery (RSI). To the best of our knowledge, it is the first work to use multi-scale spatiotemporal information to detect MCS regions. First, we design a multi-scale spatiotemporal information module that extracts multi-level semantics from different encoder levels, enabling the model to capture more detailed spatiotemporal features. Second, a Spatiotemporal Mix Unit (STMU) is introduced into MCSDNet to capture both intra-frame features and inter-frame correlations; it is a scalable module that can be replaced by other spatiotemporal modules, e.g., CNN, RNN, Transformer, or our proposed Dual Spatiotemporal Attention (DSTA). This means that future work on spatiotemporal modules can be easily integrated into our model. Finally, we present MCSRSI, the first publicly available dataset for multi-frame MCS detection, based on visible-channel images from the FY-4A satellite. We conduct several experiments on MCSRSI and find that the proposed MCSDNet achieves the best performance on the MCS detection task compared with the baseline methods.
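
The data flow described in the abstract (a shared encoder applied to every frame, features kept from several encoder levels, a replaceable spatiotemporal unit at the lowest resolution, and a decoder that reuses encoder features) can be illustrated with a toy NumPy sketch. Everything here is an assumption made for illustration: the function names, the average-pooling "encoder", the nearest-neighbour "decoder", and the mean-blending stand-in for the STMU are hypothetical and are not the layers used in MCSDNet.

```python
import numpy as np

def downsample(x):
    """Toy encoder stage: halve H and W by 2x2 average pooling.
    x has shape (T, C, H, W); the same operation is applied to every frame."""
    t, c, h, w = x.shape
    return x.reshape(t, c, h // 2, 2, w // 2, 2).mean(axis=(3, 5))

def upsample(x):
    """Toy decoder stage: double H and W by nearest-neighbour repetition."""
    return x.repeat(2, axis=2).repeat(2, axis=3)

def temporal_mean_mix(feats):
    """Hypothetical STMU stand-in: blend each frame with the sequence mean,
    so every frame sees some information from the other frames."""
    return 0.5 * feats + 0.5 * feats.mean(axis=0, keepdims=True)

def detect(frames, stmu=temporal_mean_mix, levels=2):
    """Run the toy encoder, the pluggable STMU, and the toy decoder."""
    skips = []
    x = frames
    for _ in range(levels):
        skips.append(x)          # keep features from each encoder level
        x = downsample(x)
    x = stmu(x)                  # mix space and time at the lowest resolution
    for skip in reversed(skips):
        x = upsample(x) + skip   # reuse encoder features in the decoder
    return x

frames = np.ones((4, 1, 8, 8))   # T=4 frames, C=1 channel, 8x8 pixels
print(detect(frames).shape)      # same shape as the input sequence
```

Because `stmu` is an ordinary callable, any function that maps a `(T, C, H, W)` array to an array of the same shape can be passed in its place, which mirrors the replaceability the abstract claims for the STMU.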

Paper Structure (16 sections, 9 equations, 8 figures, 7 tables)

Introduction
Related Work
Mesoscale Convective System Detection Methods
Semantic Segmentation
Video Understanding
Vision Transformer
Methodology
Preliminaries
Overall Framework
Multi-scale Spatiotemporal Information Module
Spatiotemporal Mix Unit
Performance Evaluation
Datasets and Settings
Performance Evaluation
Ablation Studies
...and 1 more section

Figures (8)

Figure 1: The proportion distribution of MCS regions among the observation data collected in China in 2018.
Figure 2: Architecture of the proposed MCSDNet. A sequence of images is processed in parallel by a shared convolutional encoder. Multi-scale spatiotemporal information is introduced by incorporating the feature maps from different encoder levels. At the lowest resolution, several Spatiotemporal Mix Units (STMUs) explore the spatial and temporal dependencies between frames. Semantic information is passed from the encoder to the decoder to further improve the detection results.
Figure 3: Multi-scale spatiotemporal information module.
Figure 4: Details of the proposed Dual Spatiotemporal Attention. (a) Architecture; (b) Spatial Attention Module; (c) Temporal Attention Module.
Figure 5: Sequence Normalization, which normalizes the batch data along the $T \times C \times H \times W$ dimensions. $T$ and $C$ are concatenated during normalization.
...and 3 more figures
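
The Sequence Normalization described in the Figure 5 caption can be sketched as follows. This is one possible reading of the caption, not the paper's implementation: it assumes a `(B, T, C, H, W)` input, merges `T` and `C` into a single axis, and computes one mean and variance per sample over all of `T`, `C`, `H`, and `W`. The function name `sequence_norm` and the `eps` parameter are hypothetical, and no learnable scale or shift is included.

```python
import numpy as np

def sequence_norm(x, eps=1e-5):
    """Normalize each sample of a (B, T, C, H, W) batch over T, C, H and W.

    Assumption: statistics are shared across all frames and channels of a
    sample, which is one interpretation of the Figure 5 caption.
    """
    b, t, c, h, w = x.shape
    merged = x.reshape(b, t * c, h, w)                 # concatenate T and C
    mean = merged.mean(axis=(1, 2, 3), keepdims=True)  # one mean per sample
    var = merged.var(axis=(1, 2, 3), keepdims=True)    # one variance per sample
    normed = (merged - mean) / np.sqrt(var + eps)
    return normed.reshape(b, t, c, h, w)               # restore the T and C axes

x = np.random.default_rng(0).normal(loc=3.0, scale=2.0, size=(2, 4, 3, 8, 8))
y = sequence_norm(x)
print(y.shape)
```

Under this reading, every frame of a sequence is scaled with the same statistics, so relative brightness differences between frames are preserved rather than removed frame by frame.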
