Objects in Generated Videos Are Slower Than They Appear: Models Suffer Sub-Earth Gravity and Don't Know Galileo's Principle...for now

Varun Varma Thozhiyoor; Shivam Tripathi; Venkatesh Babu Radhakrishnan; Anand Bhattad

TL;DR

This work shows that contemporary video generators systematically under-accelerate falling objects, effectively exhibiting sub-Earth gravity, and often violate Galileo's principle that gravitational acceleration is the same regardless of starting height. The authors introduce a unit-free, two-object timing protocol that cancels scale and time-base confounds, giving a robust diagnostic of physical understanding. They find widespread gravity violations across state-of-the-art models and demonstrate that a lightweight LoRA adapter trained on 100 synthetic sequences substantially improves gravity metrics and generalizes zero-shot to two-ball drops, inclined planes, and real-world data (PISA). The findings suggest that video models encode latent physical structure but require targeted specialization to act as reliable world models, with implications for physics-guided generation and efficient correction strategies. Overall, the paper establishes a calibration-free framework for probing fundamental physics in video generation and shows that minimal data can meaningfully align models with physical laws.

Abstract

Video generators are increasingly evaluated as potential world models, which requires them to encode and understand physical laws. We investigate their representation of a fundamental law: gravity. Out-of-the-box video generators consistently generate objects falling at an effectively slower acceleration. However, these physical tests are often confounded by ambiguous metric scale. We first investigate if observed physical errors are artifacts of these ambiguities (e.g., incorrect frame rate assumptions). We find that even temporal rescaling cannot correct the high-variance gravity artifacts. To rigorously isolate the underlying physical representation from these confounds, we introduce a unit-free, two-object protocol that tests the timing ratio $t_1^2/t_2^2 = h_1/h_2$, a relationship independent of $g$, focal length, and scale. This relative test reveals violations of Galileo's equivalence principle. We then demonstrate that this physical gap can be partially mitigated with targeted specialization. A lightweight low-rank adaptor fine-tuned on only 100 single-ball clips raises $g_{\mathrm{eff}}$ from $1.81\,\mathrm{m/s^2}$ to $6.43\,\mathrm{m/s^2}$ (reaching $65\%$ of terrestrial gravity). This specialist adaptor also generalizes zero-shot to two-ball drops and inclined planes, offering initial evidence that specific physical laws can be corrected with minimal data.
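To make the unit-free test concrete, the following is a minimal sketch, not the authors' implementation. It assumes ideal free fall from rest, $h = \tfrac{1}{2} g t^2$, so that $t^2 = 2h/g$ and the ratio $t_1^2/t_2^2 = h_1/h_2$ no longer contains $g$. The function names, the drop heights, and the `fps` and `px_per_m` values are hypothetical and chosen only for illustration; the gravity values $1.81$ and $6.43\,\mathrm{m/s^2}$ are the ones quoted in the abstract, and $9.81\,\mathrm{m/s^2}$ is standard terrestrial gravity.

```python
import math


def fall_time(height, g):
    """Time for a body released from rest to fall `height` under constant acceleration g."""
    return math.sqrt(2.0 * height / g)


def timing_ratio(t1, t2):
    """Squared ratio of two fall times; any common time unit cancels."""
    return (t1 / t2) ** 2


def effective_gravity(height_m, t_seconds):
    """Invert h = 0.5 * g * t^2. Unlike the ratio, this needs metric height and true time."""
    return 2.0 * height_m / t_seconds ** 2


# Hypothetical setup: two balls dropped from 2 m and 1 m.
h1, h2 = 2.0, 1.0
# Hypothetical, unknown-to-the-evaluator camera parameters.
fps, px_per_m = 24.0, 137.0

for g in (1.81, 6.43, 9.81):
    # What an evaluator would actually measure: frame counts and pixel heights.
    frames1 = fall_time(h1, g) * fps
    frames2 = fall_time(h2, g) * fps
    pixels1, pixels2 = h1 * px_per_m, h2 * px_per_m
    lhs = timing_ratio(frames1, frames2)
    rhs = pixels1 / pixels2
    print(f"g={g:5.2f}  t1^2/t2^2={lhs:.6f}  h1/h2={rhs:.6f}")
```

In this idealized setting the two printed ratios agree for every value of $g$, because the frame rate multiplies both times and the pixel scale multiplies both heights. A generated video whose measured ratio departs from $h_1/h_2$ is therefore inconsistent with any single constant gravitational acceleration, whatever the scene's scale or time base. By contrast, `effective_gravity` only returns a meaningful number when the metric height and the true frame rate are known.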
