Physically Interpretable World Models via Weakly Supervised Representation Learning
Zhenjiang Mao, Mrinall Eashaan Umasudhan, Ivan Ruchkin
TL;DR
This work tackles the lack of physical interpretability in the latent representations learned for autonomous cyber-physical systems (CPS). It introduces Physically Interpretable World Models (PIWM), which align latent states with real physical quantities and constrain their evolution using a partially known dynamics form, guided by weak distributional supervision. The approach formalizes physical interpretability, compares intrinsic versus extrinsic encoders and continuous versus discrete latents, and shows that extrinsic, discrete latents provide the strongest grounding and long-horizon prediction across Cart Pole, Lunar Lander, and Donkey Car, including recovery of true system parameters. The results suggest that combining physical priors with weak supervision yields more trustworthy, generalizable, and safety-relevant world representations from images, which matters for CPS safety, monitoring, and planning.
Abstract
Learning predictive models from high-dimensional sensory observations is fundamental for cyber-physical systems, yet the latent representations learned by standard world models lack physical interpretability. This limits their reliability, generalizability, and applicability to safety-critical tasks. We introduce Physically Interpretable World Models (PIWM), a framework that aligns latent representations with real-world physical quantities and constrains their evolution through partially known physical dynamics. Physical interpretability in PIWM is defined by two complementary properties: (i) the learned latent state corresponds to meaningful physical variables, and (ii) its temporal evolution follows physically consistent dynamics. To achieve this without requiring ground-truth physical annotations, PIWM employs weak distribution-based supervision that captures state uncertainty naturally arising from real-world sensing pipelines. The architecture integrates a VQ-based visual encoder, a transformer-based physical encoder, and a learnable dynamics model grounded in known physical equations. Across three case studies (Cart Pole, Lunar Lander, and Donkey Car), PIWM achieves accurate long-horizon prediction, recovers true system parameters, and significantly improves physical grounding over purely data-driven models. These results demonstrate the feasibility and advantages of learning physically interpretable world models directly from images under weak supervision.
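
To make the idea of weak distribution-based supervision over a partially known dynamics form concrete, the following is a minimal illustrative sketch, not the PIWM implementation. It assumes a toy one-dimensional system whose dynamics form v[t+1] = v[t] + DT * u[t] / m is known while the mass m is not, that the system starts at rest, and that each weak label is a Gaussian (mean, standard deviation) over the velocity rather than an exact annotation. All names (rollout, make_weak_labels, fit_inv_mass, DT, TRUE_MASS) and all numeric values are hypothetical. Because the prediction is linear in 1/m, the Gaussian negative log-likelihood has a closed-form minimizer; a neural model would presumably use gradient descent instead.

```python
import math
import random

DT = 0.05        # integration step (illustrative value)
TRUE_MASS = 2.0  # hidden parameter that the sketch tries to recover


def rollout(inv_mass, forces, v0=0.0):
    """Known dynamics form: v[t+1] = v[t] + DT * inv_mass * u[t]."""
    velocities = [v0]
    for u in forces:
        velocities.append(velocities[-1] + DT * inv_mass * u)
    return velocities


def make_weak_labels(forces, rng):
    """Simulate weak labels: a noisy mean and its standard deviation per step."""
    true_v = rollout(1.0 / TRUE_MASS, forces)
    labels = []
    for t, v in enumerate(true_v):
        std = 0.05 if t % 2 == 0 else 0.15  # assumed sensor uncertainty
        labels.append((v + rng.gauss(0.0, std), std))
    return labels


def gaussian_nll(inv_mass, forces, labels):
    """Negative log-likelihood of the rollout under the weak labels (up to a constant)."""
    pred = rollout(inv_mass, forces)
    return sum(0.5 * ((m - p) / s) ** 2 + math.log(s)
               for p, (m, s) in zip(pred, labels))


def fit_inv_mass(forces, labels):
    """Closed-form NLL minimizer; valid because v0 = 0 makes the rollout linear in inv_mass."""
    unit = rollout(1.0, forces)  # predicted velocity per unit of inv_mass
    num = sum(c * m / s ** 2 for c, (m, s) in zip(unit, labels))
    den = sum(c * c / s ** 2 for c, (m, s) in zip(unit, labels))
    return num / den


forces = [1.0 + 0.5 * math.sin(0.1 * t) for t in range(200)]
labels = make_weak_labels(forces, random.Random(0))
inv_mass_hat = fit_inv_mass(forces, labels)
estimated_mass = 1.0 / inv_mass_hat
print(f"estimated mass: {estimated_mass:.3f} (true mass: {TRUE_MASS})")
```

In this sketch, labels with a smaller standard deviation receive a larger weight in the fit, which is one simple way that uncertainty from a sensing pipeline can inform parameter recovery without exact ground-truth states.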
