InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning
Muzhi Han, Yifeng Zhu, Song-Chun Zhu, Ying Nian Wu, Yuke Zhu
TL;DR
InterPreT is an interactive framework that learns planning-oriented predicates from language feedback during embodied robot interaction. It grounds each predicate as a Python function and compiles the learned predicates and operators into a PDDL domain that a classical planner can solve. By iterating Reasoner-Coder-Corrector cycles and learning operators with a cluster-and-search approach, it achieves strong compositional generalization to unseen tasks in both simulated and real-world manipulation domains. The method uses GPT-4 for predicate generation and goal translation, which enables few-shot grounding, while the PDDL planner contributes the reliability of classical symbolic planning. Empirical results show substantial improvements over LLM-only baselines: in the most challenging generalization setting, InterPreT reaches success rates of 73% in simulation and 40% on real robots, where perceptual and execution noise remain the main obstacles. The work highlights the value of interactive, language-guided predicate learning as a way to combine learning with symbolic planning for scalable, long-horizon robotic manipulation.
Abstract
Learning abstract state representations and knowledge is crucial for long-horizon robot planning. We present InterPreT, an LLM-powered framework for robots to learn symbolic predicates from language feedback of human non-experts during embodied interaction. The learned predicates provide relational abstractions of the environment state, facilitating the learning of symbolic operators that capture action preconditions and effects. By compiling the learned predicates and operators into a PDDL domain on-the-fly, InterPreT allows effective planning toward arbitrary in-domain goals using a PDDL planner. In both simulated and real-world robot manipulation domains, we demonstrate that InterPreT reliably uncovers the key predicates and operators governing the environment dynamics. Although learned from simple training tasks, these predicates and operators exhibit strong generalization to novel tasks with significantly higher complexity. In the most challenging generalization setting, InterPreT attains success rates of 73% in simulation and 40% in the real world, substantially outperforming baseline methods.
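To make the pipeline concrete, the following is a minimal sketch of what a predicate grounded as a Python function and an operator compiled to a PDDL action might look like. It is not the authors' implementation: the state format, the `on` predicate, the tolerance value, the `Operator` class, and the `unstack` operator are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical state format: object name -> (x, y, z) centre position in metres.
State = Dict[str, Tuple[float, float, float]]


def on(state: State, a: str, b: str, xy_tol: float = 0.03) -> bool:
    """Illustrative predicate: `a` is on `b` if it is higher and aligned in x/y."""
    ax, ay, az = state[a]
    bx, by, bz = state[b]
    return az > bz and abs(ax - bx) <= xy_tol and abs(ay - by) <= xy_tol


@dataclass
class Operator:
    """Assumed container for a learned symbolic operator."""
    name: str
    params: List[str]
    preconditions: List[str]
    add_effects: List[str]
    del_effects: List[str]

    def to_pddl(self) -> str:
        """Render the operator as an untyped PDDL action definition."""
        params = " ".join(self.params)
        pre = " ".join(self.preconditions)
        # Delete effects become negated literals in the PDDL effect list.
        effects = self.add_effects + [f"(not {e})" for e in self.del_effects]
        eff = " ".join(effects)
        return (
            f"(:action {self.name}\n"
            f"  :parameters ({params})\n"
            f"  :precondition (and {pre})\n"
            f"  :effect (and {eff}))"
        )


# Evaluate the predicate on a toy two-block scene.
state = {"red_block": (0.10, 0.20, 0.06), "blue_block": (0.10, 0.21, 0.02)}
red_on_blue = on(state, "red_block", "blue_block")

# A hand-written operator standing in for one that would be learned.
unstack = Operator(
    name="unstack",
    params=["?a", "?b"],
    preconditions=["(on ?a ?b)", "(clear ?a)", "(hand-empty)"],
    add_effects=["(holding ?a)", "(clear ?b)"],
    del_effects=["(on ?a ?b)", "(clear ?a)", "(hand-empty)"],
)
pddl_action = unstack.to_pddl()
print(pddl_action)
```

In this sketch the predicate function maps raw object positions to a truth value, and the operator's preconditions and effects are expressed purely over predicate symbols, so the rendered action can be placed in a PDDL domain file and passed to any off-the-shelf planner.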
