Where Bits Matter in World Model Planning: A Paired Mixed-Bit Study for Efficient Spatial Reasoning
Suraj Ranganath, Anish Patnaik, Vaishak Menon
TL;DR
This work asks how to deploy world-model planners under tight memory and latency budgets: does planning effectiveness depend more on total bitwidth, or on how bits are allocated between the encoder and the predictor? It introduces a paired mixed-bit evaluation of DINO-WM on the Wall task, comparing FP16, uniform and mixed INT{8,6,4,3}, and asymmetric and layerwise variants across two planner budgets. The results show three regimes: 8-bit and 6-bit settings remain close to FP16, 3-bit settings collapse, and 4-bit settings are allocation-sensitive, with encoder-preserving configurations often outperforming uniform INT4. The effect persists across budgets and difficulty slices, though its direction can shift in smaller samples. These findings motivate budget-aware, module-aware quantization policies for efficient spatial reasoning, and suggest that optimizing bit allocation directly for planning success is a promising direction for deployment under resource constraints.
Abstract
Efficient spatial reasoning requires world models that remain reliable under tight precision budgets. We study whether low-bit planning behavior is determined mostly by total bitwidth or by where bits are allocated across modules. Using DINO-WM on the Wall planning task, we run a paired-goal mixed-bit evaluation across uniform, mixed, asymmetric, and layerwise variants under two planner budgets. We observe a consistent three-regime pattern: 8-bit and 6-bit settings remain close to FP16, 3-bit settings collapse, and 4-bit settings are allocation-sensitive. In that transition region, preserving encoder precision improves planning relative to uniform quantization, and near-size asymmetric variants point in the same encoder-side direction. In a later strict 22-cell replication with a smaller per-cell episode count, the sign of the mixed-versus-uniform INT4 difference becomes budget-conditioned, which further highlights the sensitivity of this transition regime. These findings motivate module-aware, budget-aware quantization policies as a broader research direction for efficient spatial reasoning. Code and run artifacts are available at https://github.com/suraj-ranganath/DINO-MBQuant.
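
To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes symmetric per-tensor uniform quantization (the abstract does not state the quantizer used) and uses hypothetical names (`BitAllocation`, `fake_quantize`, `paired_success_delta`) that do not come from the paper or its repository. It illustrates a per-module bit allocation and a paired comparison in which two configurations are scored on the same goals.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class BitAllocation:
    """Hypothetical per-module bitwidths (illustrative names, not from the paper's code)."""
    encoder_bits: int
    predictor_bits: int


def fake_quantize(weights: Sequence[float], bits: int) -> List[float]:
    """Quantize to a signed `bits`-bit grid and dequantize back to floats.

    Assumption: symmetric per-tensor scaling by the largest absolute weight.
    """
    if bits < 2:
        raise ValueError("need at least 2 bits for a signed symmetric grid")
    qmax = 2 ** (bits - 1) - 1  # e.g. 7 for signed 4-bit
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return [0.0 for _ in weights]
    scale = max_abs / qmax
    dequantized = []
    for w in weights:
        q = round(w / scale)             # nearest integer level
        q = max(-qmax, min(qmax, q))     # clamp to the representable range
        dequantized.append(q * scale)
    return dequantized


def paired_success_delta(success_a: Sequence[bool], success_b: Sequence[bool]) -> float:
    """Mean per-goal success difference (A minus B), with both lists indexed by the same goals."""
    if len(success_a) != len(success_b) or not success_a:
        raise ValueError("paired comparison needs equal-length, non-empty outcome lists")
    return sum(int(a) - int(b) for a, b in zip(success_a, success_b)) / len(success_a)


# Toy usage: an encoder-preserving allocation versus uniform 4-bit.
mixed = BitAllocation(encoder_bits=8, predictor_bits=4)
uniform = BitAllocation(encoder_bits=4, predictor_bits=4)

toy_encoder_weights = [0.9, -0.3, 0.12, 0.05]  # made-up values for illustration
enc_mixed = fake_quantize(toy_encoder_weights, mixed.encoder_bits)
enc_uniform = fake_quantize(toy_encoder_weights, uniform.encoder_bits)

# Made-up per-goal outcomes; a real study would obtain these by running the planner.
delta = paired_success_delta([True, True, False, True], [True, False, False, False])
```

Pairing on goals means each difference compares the two configurations on an identical start and goal, so goal difficulty cancels out of the comparison rather than adding noise to it.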
