DataStorm-EM: Exploration of Alternative Timelines within Continuous-Coupled Simulation Ensembles
Fahim Tasneema Azad, Javier Redondo Anton, Shubhodeep Mitra, Fateh Singh, Hans Behrens, Mao-Lin Li, Bilgehan Arslan, K. Selçuk Candan, Maria Luisa Sapino
TL;DR
DataStorm-EM addresses decision-making under deep uncertainty by providing a framework to explore alternative timelines within large, continuous-coupled simulation ensembles. It complements the DataStorm-FE flow engine by offering provenance tracking, timeline extraction, and visualization over the ensemble graph, with DS-Flow, DS-Actors, and temporal windows formalizing the dataflow and execution. Key contributions include algorithms for extracting maximal, diverse, and consistent timelines from an ensemble graph $G_e$ and tools for causal analysis and visualization, demonstrated in a PanCommunity-inspired use scenario with six interacting models. This approach enables decision-makers to derive interpretable narratives from complex multi-model simulations and to preserve and examine multiple plausible futures, improving planning under uncertainty across critical domains. The practical impact lies in scalable management of ensemble data and actionable timeline insights that inform interventions and policy decisions, supported by the integration with DataStorm-FE's end-to-end workflow ($G_e$, DS-Actors) and formal temporal scoping ($\omega_I$, $\omega_O$, and $\Delta$).
Abstract
Many socio-economically critical domains (such as sustainability, public health, and disasters) are characterized by highly complex and dynamic systems, requiring data- and model-driven simulations to support decision-making. Because of the large number of unknowns, decision-makers usually need to generate ensembles of stochastic scenarios, requiring hundreds or thousands of individual simulation instances, each with different parameter settings corresponding to distinct scenarios. As the number of model parameters increases, the number of potential timelines one can simulate increases exponentially. Consequently, simulation ensembles are inherently sparse, even when they are extremely large. This necessitates a platform for (a) deciding which simulation instances to execute and (b) given a large simulation ensemble, enabling decision-makers to explore the resulting alternative timelines by extracting and visualizing consistent, yet diverse, timelines from continuous-coupled simulation ensembles. In this article, we present the DataStorm-EM platform for data- and model-driven simulation ensemble management, optimization, analysis, and exploration, describe the underlying challenges, and present our solution.
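To make the sparsity argument concrete, the following minimal sketch computes the fraction of a parameter space that a fixed simulation budget can cover. It assumes, purely for illustration, that every parameter is discretized into the same number of candidate values and that each simulation instance covers exactly one parameter combination; the function name `ensemble_coverage` and the specific numbers are hypothetical and not taken from DataStorm-EM.

```python
def ensemble_coverage(num_params: int, values_per_param: int, num_simulations: int) -> float:
    """Fraction of all parameter combinations covered by the ensemble.

    Illustrative assumption: each of `num_params` parameters takes
    `values_per_param` discrete values, so the space contains
    values_per_param ** num_params distinct scenarios.
    """
    total_scenarios = values_per_param ** num_params
    # Coverage cannot exceed 100% even if the budget exceeds the space size.
    return min(1.0, num_simulations / total_scenarios)


if __name__ == "__main__":
    budget = 1000  # hypothetical number of simulation instances
    for k in (2, 5, 10, 20):
        coverage = ensemble_coverage(k, 10, budget)
        print(f"{k:>2} parameters: {10 ** k:.0e} scenarios, coverage = {coverage:.1e}")
```

Under these assumptions, a budget of a thousand instances fully covers a two-parameter space but only one percent of a five-parameter space, and the covered fraction keeps shrinking by a factor of ten with each additional parameter.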
