Table of Contents
Fetching ...

Koopman Operator Identification of Model Parameter Trajectories for Temporal Domain Generalization (KOMET)

Randy C. Hoover, Jacob James, Paul May, Kyle Caudle

Abstract

Parametric models deployed in non-stationary environments degrade as the underlying data distribution evolves over time (a phenomenon known as temporal domain drift). In the current work, we present KOMET (Koopman Operator identification of Model parameter Evolution under Temporal drift), a model-agnostic, data-driven framework that treats the sequence of trained parameter vectors as the trajectory of a nonlinear dynamical system and identifies its governing linear operator via Extended Dynamic Mode Decomposition (EDMD). A warm-start sequential training protocol enforces parameter-trajectory smoothness, and a Fourier-augmented observable dictionary exploits the periodic structure inherent in many real-world distribution drifts. Once identified, KOMET's Koopman operator predicts future parameter trajectories autonomously, without access to future labeled data, enabling zero-retraining adaptation at deployment. Evaluated on six datasets spanning rotating, oscillating, and expanding distribution geometries, KOMET achieves mean autonomous-rollout accuracies between 0.981 and 1.000 over 100 held-out time steps. Spectral and coupling analyses further reveal interpretable dynamical structure consistent with the geometry of the drifting decision boundary.

Koopman Operator Identification of Model Parameter Trajectories for Temporal Domain Generalization (KOMET)

Abstract

Parametric models deployed in non-stationary environments degrade as the underlying data distribution evolves over time (a phenomenon known as temporal domain drift). In the current work, we present KOMET (Koopman Operator identification of Model parameter Evolution under Temporal drift), a model-agnostic, data-driven framework that treats the sequence of trained parameter vectors as the trajectory of a nonlinear dynamical system and identifies its governing linear operator via Extended Dynamic Mode Decomposition (EDMD). A warm-start sequential training protocol enforces parameter-trajectory smoothness, and a Fourier-augmented observable dictionary exploits the periodic structure inherent in many real-world distribution drifts. Once identified, KOMET's Koopman operator predicts future parameter trajectories autonomously, without access to future labeled data, enabling zero-retraining adaptation at deployment. Evaluated on six datasets spanning rotating, oscillating, and expanding distribution geometries, KOMET achieves mean autonomous-rollout accuracies between 0.981 and 1.000 over 100 held-out time steps. Spectral and coupling analyses further reveal interpretable dynamical structure consistent with the geometry of the drifting decision boundary.

Paper Structure

This paper contains 28 sections, 11 equations, 4 figures, 1 table.

Figures (4)

  • Figure 2: KOMET two-phase pipeline. Phase 1: warm-start sequential training accumulates a smooth weight trajectory $\mathbf{W}$ by carrying Adam moment state $(\bm\theta,\mathbf{m},\mathbf{v})$ across timesteps. Phase 2: EDMD identifies a linear Koopman operator $\mathbf{A}$ on the observed trajectory; spectral analysis extracts drift structure; autonomous rollout predicts future weights with zero retraining.
  • Figure 3: Representative dataset snapshots at $t=0$ (Left col: training reference) and $t \in \{300,325,362,387\}$ (Right cols: autonomous predictions with zero retraining). The learned decision boundaries for each of the two datasets are also shown across all five representative snapshots. Top row: Dataset C (Sensor Drift / Lissajous). Two binary classes (fixed separation $d=1.2$) ride a Lissajous orbit $\delta(t)=\bigl[1.8\sin(2\pi t/T),\,\cos(2\pi t/T)\bigr]$ where task difficulty is constant while the input distribution drifts continuously. Dashed ellipses trace the centroid paths over one full period. Bottom row: Dataset E (3-Class Orbiting MoG). Three Gaussian-mixtures sit at vertices of a time-varying rotating equilateral triangle whose circumradius oscillates $R(t)=1.5+0.3\cos(2\pi t/T)$ where the inter-class gap ranges from $\sqrt{3}R_{\min}\approx 2.08$ (hard, $t=362$) to $\sqrt{3}R_{\max}\approx 3.11$ (easy, $t=300$). Note: At $t=325$, the Lissajous offset in Dataset C places both classes in the far-right quadrant; the small number of blue points visible in the red region reflect inherent distributional overlap at this phase, consistent with near-Bayes-optimal classification.
  • Figure 4: Distance correlation matrix for Dataset B ($17\times 17$, training window $t=0$--$299$). White lines separate parameter groups: $L_1^w$ (8 weights), $L_1^b$ (4 biases), $L_2^w$ (4 weights), $L_2^b$ (1 bias). The $L_1^w$ and $L_2^w$ blocks exhibit near-uniform high coupling (orbital symmetry); l$_1$b$_3$ (column/row 11) is the most isolated parameter (mean dCor $0.215$).
  • Figure 5: Normalized transfer entropy TE$(i\!\to\!j)$ for Dataset D ($27\times 27$, training window $t=0$--$299$). Rows are source parameters, columns are targets. The L$_1\!\to\!$ L$_2$ sub-block (rows 0--11, cols 12--26) is systematically darker than the L$_2\!\to\!$ L$_1$ sub-block, quantifying the $1.95$ times hierarchical information flow from the spatial-embedding layer to the logit layer. This asymmetry diminishes in Dataset E ($1.22$ times) as sub-cluster ambiguity increases L$_2$--L$_1$ feedback.