Uncertainty-Calibrated Spatiotemporal Field Diffusion with Sparse Supervision

Kevin Valencia, Xihaier Luo, Shinjae Yoo, David Keetae Park

TL;DR

SOLID is a mask-conditioned diffusion framework that learns spatiotemporal dynamics from sparse observations alone, achieving up to an order-of-magnitude improvement in probabilistic error and yielding calibrated uncertainty maps (ρ > 0.7) under severe sparsity.

Abstract

Physical fields are typically observed only at sparse, time-varying sensor locations, making forecasting and reconstruction ill-posed and uncertainty-critical. We present SOLID, a mask-conditioned diffusion framework that learns spatiotemporal dynamics from sparse observations alone: training and evaluation use only observed target locations, requiring no dense fields and no pre-imputation. Unlike prior work that trains on dense reanalysis or simulations and only tests under sparsity, SOLID is trained end-to-end with sparse supervision only. SOLID conditions each denoising step on the measured values and their locations, and introduces a dual-masking objective that (i) emphasizes learning in unobserved void regions while (ii) upweights overlap pixels where inputs and targets provide the most reliable anchors. This strict sparse-conditioning pathway enables posterior sampling of full fields consistent with the measurements, achieving up to an order-of-magnitude improvement in probabilistic error and yielding calibrated uncertainty maps (ρ > 0.7) under severe sparsity.
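
To make the dual-masking idea concrete, the following is a minimal NumPy sketch of one possible reading of it, not the authors' implementation. It assumes that "void" pixels are those observed in the target but not in the input, that "overlap" pixels are observed in both, and that the loss is a weighted mean squared error restricted to observed target locations. The function name `dual_masked_mse` and the weights `w_void` and `w_overlap` are hypothetical; the abstract does not state the actual weighting.

```python
import numpy as np

def dual_masked_mse(pred, target, input_mask, target_mask,
                    w_void=1.0, w_overlap=2.0):
    """Weighted MSE over observed target pixels only (illustrative sketch).

    Assumed reading of the abstract (hypothetical, not the paper's formula):
      void    = target observed, input NOT observed
      overlap = target observed AND input observed
    w_void and w_overlap are placeholder weights chosen for illustration.
    """
    input_mask = np.asarray(input_mask).astype(bool)
    target_mask = np.asarray(target_mask).astype(bool)

    void = target_mask & ~input_mask
    overlap = target_mask & input_mask

    # Per-pixel weights; pixels with no target observation get weight 0.
    weights = w_void * void + w_overlap * overlap
    total = weights.sum()
    if total == 0:
        return 0.0  # nothing is supervised in this sample

    # Zero out the error wherever the target is unobserved, so NaN or
    # placeholder values there cannot leak into the loss.
    sq_err = np.where(target_mask, (pred - target) ** 2, 0.0)
    return float((weights * sq_err).sum() / total)

# Toy 2x2 field: the bottom row has no target observations (stored as NaN).
pred = np.zeros((2, 2))
target = np.array([[1.0, 2.0], [np.nan, np.nan]])
input_mask = np.array([[1, 0], [0, 0]])
target_mask = np.array([[1, 1], [0, 0]])
loss = dual_masked_mse(pred, target, input_mask, target_mask)
```

In this toy case the overlap pixel contributes a squared error of 1 with weight 2 and the void pixel a squared error of 4 with weight 1, so the loss is (2·1 + 1·4) / 3 = 2. The unobserved bottom row never enters the loss, which mirrors the sparse-supervision constraint described above.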

Paper Structure (38 sections, 15 equations, 9 figures, 6 tables)

Figures (9)

  • Figure 1: Conditional Field Forecasting. This paper develops a realistic training scheme that uses sparse data only, in contrast with prior work that assumes fully observed spatiotemporal fields. At inference, we evaluate spatiotemporal field forecasting with a model learned purely from sparse data.
  • Figure 2: Training the Proposed SOLID. A dual-masking strategy is introduced. Coupled with sparse conditioning, it yields a simple yet effective spatiotemporal learning scheme whose field reconstruction remains robust when trained on sparse data only.
  • Figure 3: Qualitative Comparisons on Forecasting. White boxes report the MSE between the prediction and the ground-truth target, and grayscale images show the corresponding error.
  • Figure 4: Performance and Parameter Efficiency. Navier–Stokes baseline comparisons summarizing (a) per-parameter efficiency and (b) data efficiency (full data vs. 10%).
  • Figure 5: AirDelhi: Performance and Uncertainty. (a) Each model is run with ten different seeds; the spread is shown as the standard deviation. (b) 30-minute air pollution prediction and its uncertainty. Black contours denote available sample positions.
  • ...and 4 more figures