DataStorm-EM: Exploration of Alternative Timelines within Continuous-Coupled Simulation Ensembles

Fahim Tasneema Azad, Javier Redondo Anton, Shubhodeep Mitra, Fateh Singh, Hans Behrens, Mao-Lin Li, Bilgehan Arslan, K. Selçuk Candan, Maria Luisa Sapino

TL;DR

DataStorm-EM addresses decision-making under deep uncertainty by providing a framework to explore alternative timelines within large, continuous-coupled simulation ensembles. It complements the DataStorm-FE flow engine by offering provenance tracking, timeline extraction, and visualization over the ensemble graph, with DS-Flow, DS-Actors, and temporal windows formalizing the dataflow and execution. Key contributions include algorithms for extracting maximal, diverse, and consistent timelines from an ensemble graph $G_e$ and tools for causal analysis and visualization, demonstrated in a PanCommunity-inspired use scenario with six interacting models. This approach enables decision-makers to derive interpretable narratives from complex multi-model simulations and to preserve and examine multiple plausible futures, improving planning under uncertainty across critical domains. The practical impact lies in scalable management of ensemble data and actionable timeline insights that inform interventions and policy decisions, supported by the integration with DataStorm-FE's end-to-end workflow ($G_e$, DS-Actors) and formal temporal scoping $\omega_I$, $\omega_O$, and $\Delta$.

Abstract

Many socio-economic critical domains (such as sustainability, public health, and disasters) are characterized by highly complex and dynamic systems, requiring data- and model-driven simulations to support decision-making. Due to a large number of unknowns, decision-makers usually need to generate ensembles of stochastic scenarios, requiring hundreds or thousands of individual simulation instances, each with different parameter settings corresponding to distinct scenarios. As the number of model parameters increases, the number of potential timelines one can simulate increases exponentially. Consequently, simulation ensembles are inherently sparse, even when they are extremely large. This necessitates a platform for (a) deciding which simulation instances to execute and (b) given a large simulation ensemble, enabling decision-makers to explore the resulting alternative timelines by extracting and visualizing consistent, yet diverse, timelines from continuous-coupled simulation ensembles. In this article, we present the DataStorm-EM platform for data- and model-driven simulation ensemble management, optimization, analysis, and exploration, describe the underlying challenges, and present our solution.

Paper Structure (10 sections, 3 figures)

This paper contains 10 sections, 3 figures.

Figures (3)

  • Figure 1: A sample workflow that takes into account disease models, external impactors (such as weather), differences in local behaviors (such as risk averseness), and intra- and inter-city mixing patterns impacted by these behaviors.
  • Figure 2: (a) Provenance captures the complete histories of simulation instances in the ensemble; (b) a timeline, on the other hand, captures a maximal and consistent subset (i.e., for each given time instant, $t$, there is at most one simulation instance of each model type).
  • Figure 3: DataStorm-EM timeline exploration interface
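
The consistency condition stated in the Figure 2 caption (at most one simulation instance of each model type at any time instant $t$) can be illustrated with a minimal sketch. The `Instance` record, its field names, and the half-open time intervals are assumptions made for illustration; they are not the paper's data model, and the paper's full notion of a timeline may impose further constraints (for example, on provenance) that this sketch does not cover.

```python
from collections import defaultdict
from typing import NamedTuple


class Instance(NamedTuple):
    """Hypothetical record for one simulation instance (field names are assumptions)."""
    instance_id: str
    model: str   # model type, e.g. "disease" or "weather"
    start: int   # first time instant covered (inclusive)
    end: int     # end of the covered interval (exclusive)


def is_consistent(instances):
    """Return True if no two instances of the same model type overlap in time."""
    by_model = defaultdict(list)
    for inst in instances:
        by_model[inst.model].append(inst)
    for group in by_model.values():
        # After sorting by start time, any overlap shows up between neighbours.
        group.sort(key=lambda i: i.start)
        for prev, cur in zip(group, group[1:]):
            if cur.start < prev.end:
                return False
    return True


def is_maximal(timeline, ensemble):
    """Return True if no other ensemble instance can be added without an overlap."""
    chosen = set(timeline)
    return all(
        not is_consistent(list(timeline) + [candidate])
        for candidate in ensemble
        if candidate not in chosen
    )
```

Under these assumptions, a candidate subset of the ensemble qualifies as a timeline in the caption's sense when both `is_consistent(subset)` and `is_maximal(subset, ensemble)` hold; selecting timelines that are also diverse is a separate step that this sketch does not attempt.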

Theorems & Definitions (7)

  • definition 1: DS-Flow
  • definition 2: Model
  • definition 3: Input Stream
  • definition 4: Output Stream
  • definition 5: Execution Step of a DS-Actor
  • definition 6: Model Simulation Instance
  • definition 7: Ensemble Graph