A Real-time Multimodal Transformer Neural Network-powered Wildfire Forecasting System

Qijun Chen, Shaofan Li

TL;DR

The paper tackles the challenge of precise, real-time wildfire forecasting at fine spatial scales by introducing a real-time multimodal transformer framework that fuses large-scale weather forecasts with local terrain, vegetation, and Google Earth imagery. The approach combines a Wildfire Imagery Information Net (WIIN) built from a ResNet backbone and a Vision Transformer (WIT) with a Baseline multi-feature model in a Hybrid Multimodal Model, trained on US wildfire data from 1992–2015 to predict probabilities at $100\,\mathrm{m}^2$ cells up to $24\,\mathrm{h}$ ahead. Key contributions include the integration of image-derived features with information data, explicit handling of data imbalance via undersampling, and a comprehensive evaluation of TPR/TNR trade-offs, with a Camp Fire case study demonstrating practical forecasting potential. The findings show that balanced models can achieve around $85\%$ accuracy with comparable TPR and TNR, underscoring the method’s potential to support targeted wildfire prevention and resource allocation, while highlighting limitations related to data quality, scale, and generalizability to non-US regions.
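The undersampling step and the TPR/TNR trade-off mentioned above can be illustrated with a small sketch. This is not the authors' code: the function names `undersample` and `tpr_tnr`, the 0/1 label convention (1 = fire, 0 = no fire), and the choice of plain random undersampling to a 1:1 class ratio are all assumptions made for illustration.

```python
import random


def undersample(samples, labels, seed=0):
    """Randomly drop samples from the larger class until both classes are equal in size.

    Hypothetical helper: assumes binary labels (1 = fire, 0 = no fire).
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    n = min(len(pos), len(neg))
    # Keep n indices from each class, then restore the original ordering.
    keep = sorted(rng.sample(pos, n) + rng.sample(neg, n))
    return [samples[i] for i in keep], [labels[i] for i in keep]


def tpr_tnr(y_true, y_pred):
    """Return (true positive rate, true negative rate) for binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


if __name__ == "__main__":
    # Toy imbalanced set: 3 fire samples, 9 no-fire samples.
    labels = [1] * 3 + [0] * 9
    samples = list(range(len(labels)))
    xs, ys = undersample(samples, labels)
    print(len(ys), sum(ys))  # balanced: 6 samples, 3 of them positive

    print(tpr_tnr([1, 1, 0, 0], [1, 0, 0, 0]))  # (0.5, 1.0)
```

Reporting TPR and TNR separately matters here because, on a heavily imbalanced dataset, a model that always predicts "no fire" can reach high overall accuracy while having a TPR of zero.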

Abstract

Due to climate change, extreme wildfires have become one of the most dangerous natural hazards to human civilization. Although some wildfires are initially caused by human activity, their spread is mainly determined by environmental factors, for example: (1) weather conditions such as temperature, wind direction and intensity, and moisture levels; (2) the amount and types of dry vegetation in a local area; and (3) topographic or local terrain conditions, which affect how much rain an area gets and how fire dynamics are constrained or facilitated. Accurately forecasting wildfire occurrence has therefore become one of the most urgent and daunting environmental challenges at the global scale. In this work, we develop a real-time Multimodal Transformer Neural Network machine learning model that combines several advanced artificial intelligence techniques and statistical methods to forecast wildfire occurrence at a precise location in real time. The model not only uses large-scale data such as hourly weather forecasts, but also takes into account small-scale topographical data, such as local terrain and vegetation conditions collected from Google Earth images, to determine the probability of wildfire occurrence at small spatial scales, with timing synchronized to the weather forecast. Trained on wildfire data from the United States from 1992 to 2015, the multimodal transformer neural network uses the real-time weather forecast and the synchronized Google Earth image data to predict the probability of wildfire occurrence at any small location ($100\,\mathrm{m}^2$) up to 24 hours ahead.

Paper Structure

This paper contains 22 sections, 17 equations, 28 figures, 11 tables, 1 algorithm.

Figures (28)

  • Figure 1: (a) 2018 California Camp Fire [Photo: U.S. Department of Agriculture/Wikimedia Commons], and (b) geographical locations of wildfires that occurred in the United States
  • Figure 2: Data Components
  • Figure 3: An example image file from Google Maps at 35.7044, -118.5883333, on 8/18/2013
  • Figure 4: Images randomly chosen from the image dataset, with each image's index shown on top; each image is $100 \times 100$ pixels (Height $\times$ Width)
  • Figure 5: Data split detail: total 29550, naturally caused 4717, non-naturally caused 8490, train size 10565, test size 2642, total test 18985
  • ...and 23 more figures