Physically Interpretable World Models via Weakly Supervised Representation Learning

Zhenjiang Mao, Mrinall Eashaan Umasudhan, Ivan Ruchkin

TL;DR

This work tackles the lack of physical interpretability in the latent representations learned for autonomous cyber-physical systems (CPS). It introduces Physically Interpretable World Models (PIWM), which align latent states with real physical quantities and constrain their evolution using a partially known form of the dynamics, guided by weak distributional supervision. The approach formalizes physical interpretability, compares intrinsic versus extrinsic encoders and continuous versus discrete latents, and reports that extrinsic, discrete latents provide the strongest grounding and long-horizon prediction across Cart Pole, Lunar Lander, and Donkey Car, including recovery of true system parameters. The results suggest that incorporating physical priors and weak supervision yields more trustworthy, generalizable, and safety-relevant world representations from images, with implications for CPS safety, monitoring, and planning.

Abstract

Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability, generalizability, and applicability to safety-critical tasks. We introduce Physically Interpretable World Models (PIWM), a framework that aligns latent representations with real-world physical quantities and constrains their evolution through partially known physical dynamics. Physical interpretability in PIWM is defined by two complementary properties: (i) the learned latent state corresponds to meaningful physical variables, and (ii) its temporal evolution follows physically consistent dynamics. To achieve this without requiring ground-truth physical annotations, PIWM employs weak distribution-based supervision that captures state uncertainty naturally arising from real-world sensing pipelines. The architecture integrates a VQ-based visual encoder, a transformer-based physical encoder, and a learnable dynamics model grounded in known physical equations. Across three case studies (Cart Pole, Lunar Lander, and Donkey Car), PIWM achieves accurate long-horizon prediction, recovers true system parameters, and significantly improves physical grounding over purely data-driven models. These results demonstrate the feasibility and advantages of learning physically interpretable world models directly from images under weak supervision.
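To make the idea of a "learnable dynamics model grounded in known physical equations" concrete, the following is a minimal sketch, not the paper's implementation. It assumes a frictionless pendulum (a stand-in system, not one of the three case studies) whose equation form is known while its length is not, and it recovers that length from a state trajectory by least squares. The names `step`, `rollout`, and `fit_length` are hypothetical, as are the time step and the Euler integrator.

```python
import numpy as np

G = 9.81   # gravitational acceleration (assumed known)
DT = 0.02  # integration time step (assumption for this sketch)

def step(state, length):
    """One Euler step of a frictionless pendulum.

    The equation form is the known physical prior; `length` is the
    unknown parameter that a learnable dynamics model would fit.
    """
    theta, omega = state
    theta_next = theta + omega * DT
    omega_next = omega - (G / length) * np.sin(theta) * DT
    return np.array([theta_next, omega_next])

def rollout(state0, length, n_steps):
    """Roll the dynamics forward; returns an array of shape (n_steps + 1, 2)."""
    states = [np.asarray(state0, dtype=float)]
    for _ in range(n_steps):
        states.append(step(states[-1], length))
    return np.stack(states)

def fit_length(trajectory):
    """Recover `length` from consecutive states by closed-form least squares.

    From the update rule, omega[t+1] - omega[t] = k * (-sin(theta[t]) * DT)
    with k = G / length, so k is a one-dimensional regression coefficient.
    """
    theta = trajectory[:-1, 0]
    d_omega = trajectory[1:, 1] - trajectory[:-1, 1]
    feature = -np.sin(theta) * DT
    k = (feature @ d_omega) / (feature @ feature)
    return G / k

# Usage: generate a trajectory with a hidden length, then recover it.
trajectory = rollout([0.5, 0.0], length=1.5, n_steps=200)
print(fit_length(trajectory))  # a value close to 1.5
```

In this toy setting the states are observed directly and without noise; in PIWM the states are latent variables inferred from images under weak supervision, so the parameter would instead be fitted jointly with the encoder by gradient-based training.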

Paper Structure

This paper contains 25 sections, 25 equations, 8 figures, 2 tables, 1 algorithm.

Figures (8)

  • Figure 1: Conceptual illustration of learning physically meaningful latent spaces from images. High-dimensional observations $y$ may be encoded into a standard latent space $z$ or an interpretable latent space $z^*$. Weak supervision vaguely relates observations to the underlying physical state $x$. Existing approaches follow two paradigms: intrinsic autoencoding (one-stage) and extrinsic autoencoding (two-stage).
  • Figure 2: Overview of world model architectures. (Left) A standard world model uses an encoder-decoder structure with a data-driven predictor (e.g., LSTM) to model latent dynamics, where the latent representations lack physical meaning. (Right) The PIWM architecture learns a structured latent representation $z^*$ from images, uses a learnable dynamics model $\phi$ with physical priors to predict future latent states, and decodes them into future images.
  • Figure 3: Prediction performance. The Root Mean Square Error (RMSE) of our PIWM variants (extrinsic methods, left; intrinsic methods, right) is compared against baselines over a 30-step prediction horizon in the Donkey Car and Lunar Lander environments, across varying levels of weak supervision ($\delta$).
  • Figure 4: Prediction performance on Cart Pole. The Root Mean Square Error (RMSE) of our PIWM variants (extrinsic methods, upper; intrinsic methods, lower) is compared against baselines over a 30-step prediction horizon in the Cart Pole environment, across varying levels of weak supervision ($\delta$).
  • Figure 5: Learned physical parameters vs. ground truth. For Cart Pole and Lunar Lander, the parameters learned by our model (colored bars) are compared against the known ground-truth values (yellow lines) under varying noise levels ($\delta$). For the Donkey Car, we use only an approximate bicycle model and the true physical parameters are unknown; therefore, no ground-truth reference is shown.
  • ...and 3 more figures
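
Figures 3 and 4 report RMSE over a 30-step prediction horizon. As a rough illustration of how such a per-step curve can be computed, here is a minimal sketch; the array layout `(n_trajectories, horizon, state_dim)` and the function name `rmse_per_step` are assumptions made for this example, and the paper may aggregate errors differently.

```python
import numpy as np

def rmse_per_step(pred, truth):
    """RMSE at each step of the prediction horizon.

    pred, truth: arrays of shape (n_trajectories, horizon, state_dim)
    (an assumed layout). Errors are averaged over trajectories and
    state dimensions, leaving one value per horizon step.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    squared_error = (pred - truth) ** 2
    return np.sqrt(squared_error.mean(axis=(0, 2)))

# Usage with synthetic data: an error that grows linearly with the step index.
truth = np.zeros((4, 30, 2))
pred = truth + 0.1 * np.arange(30)[None, :, None]
curve = rmse_per_step(pred, truth)
print(curve.shape)  # (30,)
```

Plotting one such curve per model and per supervision level $\delta$ would give a comparison in the spirit of the figures described above.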

Theorems & Definitions (4)

  • Definition 1: Autonomous CPS
  • Definition 2: World Model
  • Definition 3: Interpretable Representation Learning Problem
  • Definition 4: Prediction Problem for Interpretable Representations