Discovering High Level Patterns from Simulation Traces
Sean Memery, Kartic Subr
TL;DR
This work tackles the challenge that language models struggle to reason about physics without ground-truth simulation data. It proposes learning a library of high-level event patterns by evolving detectors that annotate detailed simulation traces, producing Annotated Simulation Traces (ASTs) that support natural language reasoning, planning, and reward-program synthesis. The approach combines NL-guided pattern discovery with FunSearch-style program synthesis to grow the pattern library from seed descriptions and to produce executable reward programs for trajectory optimization. Evaluations on the Phyre physics benchmark and a Phyre-derived Q&A task show that ASTs improve LM summarization, question answering, and the quality of learned reward functions, while enabling more efficient optimization and downstream training of value networks. Overall, the pattern-based abstraction provides a scalable, interpretable bridge between physics simulations and NL reasoning, with broad implications for NL-guided control and learning in physics-rich environments.
Abstract
Artificial intelligence (AI) agents embedded in environments with physics-based interaction face many challenges, including reasoning, planning, summarization, and question answering. These challenges are exacerbated when a human user wishes to guide or interact with the agent in natural language. Although Language Models (LMs) are the default tool for such interaction, they struggle with tasks involving physics. An LM's capability for physical reasoning is learned from observational data rather than grounded in simulation. A common approach is to include simulation traces as context, but this scales poorly because simulation traces contain large volumes of fine-grained numerical and semantic data. In this paper, we propose a natural language guided method to discover coarse-grained patterns (e.g., 'rigid-body collision' or 'stable support') from detailed simulation logs. Specifically, we synthesize programs that operate on simulation logs and map them to a series of high-level activated patterns. We show, through two physics benchmarks, that this annotated representation of the simulation log is more amenable to natural language reasoning about physical systems. We also demonstrate how this method enables LMs to generate effective reward programs from goals specified in natural language, which may be used within the context of planning or supervised learning.
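To make the idea of programs that map simulation logs to activated patterns concrete, the following is a minimal illustrative sketch, not the synthesized detectors or the log format used in this work. The trace layout (per-step dictionaries of circles), the `Event` record, the pattern name `contact_begins`, and the helper names `detect_contact`, `annotate`, and `reward_contact` are all assumptions introduced for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical log format: one frame per timestep, mapping an object name
# to a circle given as (x, y, radius). Real simulation logs are richer.
Frame = Dict[str, Tuple[float, float, float]]


@dataclass(frozen=True)
class Event:
    step: int                  # timestep at which the pattern activates
    pattern: str               # coarse-grained pattern name
    objects: Tuple[str, ...]   # objects involved, in sorted order


def detect_contact(trace: List[Frame]) -> List[Event]:
    """Emit an event whenever two circles start overlapping or touching."""
    events: List[Event] = []
    touching_before = set()
    for step, frame in enumerate(trace):
        names = sorted(frame)
        touching_now = set()
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                ax, ay, ar = frame[a]
                bx, by, br = frame[b]
                # Circles touch when centre distance <= sum of radii.
                if (ax - bx) ** 2 + (ay - by) ** 2 <= (ar + br) ** 2:
                    touching_now.add((a, b))
        # Only report pairs that were not already touching at the previous step.
        for pair in sorted(touching_now - touching_before):
            events.append(Event(step, "contact_begins", pair))
        touching_before = touching_now
    return events


def annotate(trace: List[Frame],
             detectors: List[Callable[[List[Frame]], List[Event]]]) -> List[Event]:
    """Run every detector and merge the events into one time-ordered list."""
    events = [e for detect in detectors for e in detect(trace)]
    return sorted(events, key=lambda e: (e.step, e.pattern))


def reward_contact(events: List[Event], a: str, b: str) -> float:
    """Toy reward program: 1.0 if objects a and b ever come into contact."""
    target = tuple(sorted((a, b)))
    hit = any(e.pattern == "contact_begins" and e.objects == target for e in events)
    return 1.0 if hit else 0.0


# A three-step trace in which the ball rolls into the goal object.
trace = [
    {"ball": (0.0, 0.0, 1.0), "goal": (5.0, 0.0, 1.0)},
    {"ball": (2.0, 0.0, 1.0), "goal": (5.0, 0.0, 1.0)},
    {"ball": (3.5, 0.0, 1.0), "goal": (5.0, 0.0, 1.0)},
]
events = annotate(trace, [detect_contact])
reward = reward_contact(events, "goal", "ball")
```

In this sketch the annotated list `events` is far shorter than the raw numerical trace, which is the property that makes such a representation easier to place in an LM's context. The reward program reads only the annotated events rather than raw coordinates, mirroring, in a very simplified way, how a goal stated in natural language could be compiled into a check over activated patterns.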
