Language-Guided World Models: A Model-Based Approach to AI Control
Alex Zhang, Khanh Nguyen, Jens Tuyls, Albert Lin, Karthik Narasimhan
TL;DR
This work tackles how to endow AI agents with controllable, language-grounded world models by formalizing environment dynamics as $M(s_{t+1}, r_{t+1}, d_{t+1} \mid h_t, \boldsymbol{v})$ and learning a language-conditioned model $M_{\theta}(s_{t+1}, r_{t+1}, d_{t+1} \mid h_t, \boldsymbol{\ell})$. It introduces LWMs and an EMMA-inspired multi-modal attention mechanism to ground language descriptions to entity attributes, and establishes Messenger-WM as a benchmark to probe compositional generalization. Empirically, Transformer baselines struggle on harder settings, while EMMA-LWM substantially improves grounding and trajectory simulation, approaching an oracle with semantic parsing; it also enables pre-execution plan discussions with humans, increasing safety and transparency. The work demonstrates that language-conditioned world models can enhance controllability and safety in AI systems, and suggests a research direction toward modular, language-parameterized architectures for scalable human-AI collaboration.
Abstract
This paper introduces the concept of Language-Guided World Models (LWMs) -- probabilistic models that can simulate environments by reading texts. Agents equipped with these models provide humans with more extensive and efficient control, allowing them to simultaneously alter agent behaviors in multiple tasks via natural verbal communication. In this work, we take initial steps in developing robust LWMs that can generalize to compositionally novel language descriptions. We design a challenging world modeling benchmark based on the game of MESSENGER (Hanjie et al., 2021), featuring evaluation settings that require varying degrees of compositional generalization. Our experiments reveal the lack of generalizability of the state-of-the-art Transformer model, as it offers marginal improvements in simulation quality over a no-text baseline. We devise a more robust model by fusing the Transformer with the EMMA attention mechanism (Hanjie et al., 2021). Our model substantially outperforms the Transformer and approaches the performance of a model with an oracle semantic parsing and grounding capability. To demonstrate the practicality of this model in improving AI safety and transparency, we simulate a scenario in which the model enables an agent to present plans to a human before execution, and to revise plans based on their language feedback.
