Machine Learning for the Digital Typhoon Dataset: Extensions to Multiple Basins and New Developments in Representations and Tasks

Asanobu Kitamoto, Erwan Dzik, Gaspar Faure

TL;DR

This paper introduces a self-supervised learning framework for representation learning, shows that an object detection-based model performs better for stronger typhoons, and studies how machine learning models generalize across basins and hemispheres.

Abstract

This paper presents the Digital Typhoon Dataset V2, a new version of the longest typhoon satellite image dataset, which spans 40+ years and is aimed at benchmarking machine learning models for long-term spatio-temporal data. Dataset V2 adds tropical cyclone data from the southern hemisphere to the northern hemisphere data in Dataset V1. Having data from two hemispheres allows us to ask new research questions about regional differences across basins and hemispheres. We also discuss new developments in representations and tasks of the dataset. We first introduce a self-supervised learning framework for representation learning. Combined with the LSTM model, we discuss performance on intensity forecasting and extra-tropical transition forecasting tasks. We then propose new tasks, such as the typhoon center estimation task. We show that an object detection-based model performs better for stronger typhoons. Finally, we study how machine learning models can generalize across basins and hemispheres by training the model on the northern hemisphere data and testing it on the southern hemisphere data. The dataset is publicly available at http://agora.ex.nii.ac.jp/digital-typhoon/dataset/ and https://github.com/kitamoto-lab/digital-typhoon/.

Paper Structure

This paper contains 31 sections, 1 equation, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Simplified overview of the forecasting model.
  • Figure 2: Simplified overview of the center estimation pipeline.
  • Figure 3: Mean distance error for the two experiments, by typhoon grade.
  • Figure 4: Mean distance error for the center estimation task on the two test basins.
  • Figure 5: Regression error for the two test basins using ViT.
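
The mean distance error per grade named in the figure captions above can be illustrated with a minimal sketch. It assumes that predicted and true typhoon centers are given as (x, y) pixel coordinates and that the error is the Euclidean distance between them; the paper may define the metric differently (for example in kilometers). The function name `mean_distance_error_by_grade` and the sample values are hypothetical and are not taken from the dataset or its code.

```python
import math
from collections import defaultdict


def mean_distance_error_by_grade(samples):
    """Average center-estimation error for each typhoon grade.

    `samples` is an iterable of (grade, predicted_center, true_center),
    where each center is an (x, y) pair. Pixel coordinates and Euclidean
    distance are assumptions made for this illustration.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for grade, predicted, true in samples:
        totals[grade] += math.dist(predicted, true)  # Euclidean distance
        counts[grade] += 1
    return {grade: totals[grade] / counts[grade] for grade in totals}


# Made-up samples, for illustration only.
samples = [
    (3, (10.0, 10.0), (13.0, 14.0)),
    (3, (0.0, 0.0), (0.0, 1.0)),
    (5, (2.0, 2.0), (2.0, 2.0)),
]
errors = mean_distance_error_by_grade(samples)
print(errors)
```

Grouping the errors by grade before averaging is what allows a comparison such as "performs better for stronger typhoons"; a single global mean would hide that difference.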