Adaptive Sensing of Continuous Physical Systems for Machine Learning

Felix Köster, Atsushi Uchida

TL;DR

This work proposes a general computing framework for adaptive information extraction from dynamical systems, in which a trainable attention module learns both where to probe the system state and how to combine these measurements to optimize prediction performance.

Abstract

Physical dynamical systems can be viewed as natural information processors: their dynamics preserve, transform, and disperse input information. This perspective motivates learning not only from data generated by such systems, but also how to measure them in a way that extracts the most useful information for a given task. We propose a general computing framework for adaptive information extraction from dynamical systems, in which a trainable attention module learns both where to probe the system state and how to combine these measurements to optimize prediction performance. As a concrete instantiation, we implement this idea using a spatiotemporal field governed by a partial differential equation as the underlying dynamics, though the framework applies equally to any system whose state can be sampled. Our results show that adaptive spatial sensing significantly improves prediction accuracy on canonical chaotic benchmarks. This work provides a perspective on attention-enhanced reservoir computing as a special case of a broader paradigm: neural networks as trainable measurement devices for extracting information from physical dynamical systems.

Paper Structure

This paper contains 19 sections, 22 equations, 8 figures, 2 tables.

Figures (8)

  • Figure 1: Flow chart of the Adaptive-Sensing-Attention–Enhanced-Reservoir-Computer (ASAERC). Inputs drive a fixed continuous reservoir. The reservoir is measured at fixed locations defined by the measurement kernels $\psi_t^{(j)}(\mathbf{x})$, and the resulting $N_\mathrm{fix}$ states $\tilde{r}_t$ are fed into the attention module. The trainable attention module outputs parameterized kernels $\phi_{t+T}^{(i)}(\mathbf{x})$ to guide sampling and attention weights $\mathbf{W}_{\mathrm{att},t+T}$ to combine the sampled features. The weighted combination produces $\bar{\mathbf{y}}_n$, and gradient descent (dashed path) updates the attention module. No gradients pass through the PDE reservoir.
  • Figure 2: Illustration of ASAERC behavior. (a) Example predicted time series (orange dashed line) compared with ground truth (blue solid line) on the Lorenz attractor. (b) Snapshot of the PDE reservoir field $u(\mathbf{x},t^\star)$ with fixed injection points (light-blue triangles), fixed measurement points (attention input, crosses), $\tilde{r}_t^{(j)} = \int_{\Omega} \psi_t^{(j)}(\mathbf{x})\, u(\mathbf{x},t)\, d\mathbf{x}$, and dynamic measurement locations (attention output) overlaid, $r_{t+T}^{(i)} = \int_{\Omega} \phi_{t+T}^{(i)}(\mathbf{x})\, u(\mathbf{x},t+T)\, d\mathbf{x}$. The adaptive sensors (blue circles) cluster around dynamically active regions, demonstrating the network’s ability to focus its attention on informative spatial locations. (c) Histogram of the attention output locations, showing an intricate pattern that is a nonlinear projection of the Lorenz attractor onto the measurement process.
  • Figure 3: Prediction error as a function of the number of measurement points. All methods use the same PDE reservoir. (a) Classical reservoir computing with linear readout; (b) AERC with fixed measurement positions; (c) ASAERC with adaptive sensing. Color indicates the number of measurement points at fixed locations.
  • Figure 4: Number of trainable parameters versus number of measurement points. The dotted line with triangles denotes the linear readout, the dashed line with squares denotes AERC, and the solid lines with dots denote ASAERC. For both AERC and ASAERC, we show three cases, with 16, 64, and 256 adaptive queries. All three methods use the same PDE reservoir. ASAERC has the largest number of parameters, but only slightly more than AERC. The linear readout has far fewer parameters.
  • Figure 5: Correlation analysis of readout contributions. Distribution of pairwise Pearson correlation coefficients $\rho_{ij}$ between readout nodes $i$ and $j$ for (left column) raw PDE values, (middle column) attention weights, and (right column) their product. Rows correspond to (top row) linear readout, (middle row) AERC, and (bottom row) ASAERC. ASAERC exhibits substantially lower correlation than linear readout and AERC, indicating that its measurements are less redundant and more complementary, which likely aids generalization and lowers prediction error.
  • ...and 3 more figures
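
The kernel-weighted measurements that appear in the Figure 1 and Figure 2 captions, of the form $r = \int_{\Omega} \phi(\mathbf{x})\, u(\mathbf{x},t)\, d\mathbf{x}$, can be illustrated with a minimal numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian kernel shape, the unit-square grid, the toy field standing in for the PDE reservoir, the helper names `gaussian_kernel` and `measure`, and the random matrix standing in for the learned attention weights.

```python
import numpy as np

def gaussian_kernel(X, Y, center, width, dx, dy):
    """Gaussian measurement kernel (assumed shape), normalized so that its
    discrete integral over the grid equals 1."""
    k = np.exp(-((X - center[0]) ** 2 + (Y - center[1]) ** 2) / (2.0 * width ** 2))
    return k / (k.sum() * dx * dy)

def measure(u, kernels, dx, dy):
    """Riemann-sum approximation of r_i = integral of phi_i(x) * u(x) over the domain."""
    return np.tensordot(kernels, u, axes=([1, 2], [0, 1])) * dx * dy

# Uniform grid on the unit square (hypothetical domain).
n = 64
xs = np.linspace(0.0, 1.0, n)
dx = dy = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")

# Toy snapshot standing in for the reservoir field u(x, t).
u = np.sin(2 * np.pi * X) * np.cos(2 * np.pi * Y)

# Hypothetical sensor centers; in the framework these would be produced
# by the attention module rather than fixed by hand.
centers = np.array([[0.25, 0.0], [0.75, 0.0], [0.5, 0.5]])
kernels = np.stack([gaussian_kernel(X, Y, c, 0.05, dx, dy) for c in centers])

r = measure(u, kernels, dx, dy)   # one scalar measurement per kernel

# Stand-in for attention weights that combine the sampled features.
rng = np.random.default_rng(0)
W_att = rng.normal(size=(1, len(centers)))
y = W_att @ r                     # weighted combination of the measurements
```

Because each kernel is normalized on the grid, measuring a constant field returns that constant, which is a convenient sanity check. Moving a kernel center changes which part of the field contributes to the corresponding measurement, which is the degree of freedom that adaptive sensing trains.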