PhysFire-WM: A Physics-Informed World Model for Emulating Fire Spread Dynamics

Nan Zhou, Huandong Wang, Jiahao Li, Yang Li, Xiao-Ping Zhang, Yong Li, Xinlei Chen

TL;DR

PhysFire-WM is a physics-informed world model for fire spread forecasting. It embeds a PDE-based physical prior, produced by a Physical Simulator, into a diffusion Transformer framework, and uses a Cross-task Collaborative Training strategy (CC-Train) to fuse infrared and mask modalities. The approach jointly enforces combustion dynamics and boundary geometry, achieving state-of-the-art performance on a fine-grained multimodal fire dataset. Ablation studies indicate that both physical priors and cross-task collaboration are pivotal for physical plausibility and predictive accuracy. The work demonstrates the practicality of physics-informed world models for disaster prediction and offers a scalable blueprint for multimodal, physics-constrained forecasting.

Abstract

Fine-grained fire prediction plays a crucial role in emergency response. Infrared images and fire masks provide complementary thermal and boundary information, yet current methods are predominantly limited to binary mask modeling with inherent signal sparsity, failing to capture the complex dynamics of fire. While world models show promise in video generation, their physical inconsistencies pose significant challenges for fire forecasting. This paper introduces PhysFire-WM, a Physics-informed World Model for emulating Fire spread dynamics. Our approach internalizes combustion dynamics by encoding structured priors from a Physical Simulator to rectify physical discrepancies, coupled with a Cross-task Collaborative Training strategy (CC-Train) that alleviates the issue of limited information in mask-based modeling. Through parameter sharing and gradient coordination, CC-Train effectively integrates thermal radiation dynamics and spatial boundary delineation, enhancing both physical realism and geometric accuracy. Extensive experiments on a fine-grained multimodal fire dataset demonstrate the superior accuracy of PhysFire-WM in fire spread prediction. Validation underscores the importance of physical priors and cross-task collaboration, providing new insights for applying physics-informed world models to disaster prediction.
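
The abstract names parameter sharing and gradient coordination but does not say how CC-Train coordinates the two tasks' gradients. As a rough illustration only, the sketch below substitutes a well-known stand-in, PCGrad-style projection, for that unspecified step: when the infrared and mask gradients on the shared parameters conflict (negative dot product), each is projected so it no longer opposes the other before the two are summed. The function names, the toy gradient values, and the learning rate are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: PCGrad-style projection is used here as a stand-in
# for the paper's "gradient coordination", which this summary does not specify.
# All names and numbers below are hypothetical.

def dot(u, v):
    """Dot product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))

def project_out(g, ref):
    """Return g with its component along ref removed, but only if they conflict."""
    d = dot(g, ref)
    n = dot(ref, ref)
    if d >= 0 or n == 0:
        return list(g)  # no conflict (or zero reference): leave g unchanged
    return [gi - (d / n) * ri for gi, ri in zip(g, ref)]

def coordinate(g_infrared, g_mask):
    """Combine two task gradients on the shared parameters."""
    p_infrared = project_out(g_infrared, g_mask)
    p_mask = project_out(g_mask, g_infrared)
    return [a + b for a, b in zip(p_infrared, p_mask)]

def sgd_step(shared_params, grad, lr=0.1):
    """One plain gradient-descent update of the shared parameters."""
    return [w - lr * g for w, g in zip(shared_params, grad)]

if __name__ == "__main__":
    g_ir = [1.0, 0.0]     # toy gradient from the infrared task
    g_mask = [-1.0, 1.0]  # toy gradient from the mask task (conflicts with g_ir)
    combined = coordinate(g_ir, g_mask)
    print(combined)       # [0.5, 1.5]
    print(sgd_step([1.0, 1.0], combined))
```

In this toy case the projected infrared gradient is orthogonal to the mask gradient and vice versa, so neither task's update directly undoes the other on the shared weights. Non-conflicting gradients pass through unchanged and are simply summed.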

Paper Structure

This paper contains 26 sections, 25 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Fire spread modeling via a physics-informed world model. Task 1: Infrared modality prediction. Task 2: Mask modality prediction. “Env. Info.” denotes environmental information.
  • Figure 2: Overview of PhysFire-WM. The pipeline comprises three stages: generation of physical priors by the Physical Simulator; production of unified spatiotemporal tokens by the Multimodal Tokenizer; and joint optimization of infrared and mask prediction via Cross-task Collaborative Training.
  • Figure 3: Components of PhysFire-WM. (a) The Physical Simulator derives physical prior knowledge from PDEs. (b) The Multimodal Tokenizer unifies multimodal inputs into spatiotemporally consistent tokens.
  • Figure 4: Model performance is evaluated across multiple regions, including both seen (training and test sets) and unseen (test set) regions.
  • Figure 5: Visualization of Prediction Results. The enlarged view in the upper-right corner highlights the main fire spread region. (a) Mask modality prediction. (b) Infrared modality prediction.