InterPreT: Interactive Predicate Learning from Language Feedback for Generalizable Task Planning

Muzhi Han; Yifeng Zhu; Song-Chun Zhu; Ying Nian Wu; Yuke Zhu

TL;DR

InterPreT is an interactive framework that learns planning-oriented predicates from language feedback during embodied robot interaction. It grounds each predicate as a Python function and compiles the learned predicates and operators into a PDDL domain that a classical planner can use. Predicates are produced by iterating Reasoner-Coder-Corrector cycles, and operators are learned with a cluster-and-search approach; together these yield strong compositional generalization to unseen tasks in both simulated and real-world manipulation domains. The method uses GPT-4 for predicate generation and goal translation, which allows predicates to be grounded from few examples while the PDDL planner handles the search for a valid plan. Empirical results show substantial improvements over LLM-only baselines, including in the most challenging generalization settings, and demonstrate viability on real robots, although perceptual and execution noise remain a challenge. The work highlights the value of interactive, language-guided predicate learning as a way to combine learning with symbolic planning for long-horizon robotic manipulation.

Abstract

Learning abstract state representations and knowledge is crucial for long-horizon robot planning. We present InterPreT, an LLM-powered framework for robots to learn symbolic predicates from language feedback of human non-experts during embodied interaction. The learned predicates provide relational abstractions of the environment state, facilitating the learning of symbolic operators that capture action preconditions and effects. By compiling the learned predicates and operators into a PDDL domain on-the-fly, InterPreT allows effective planning toward arbitrary in-domain goals using a PDDL planner. In both simulated and real-world robot manipulation domains, we demonstrate that InterPreT reliably uncovers the key predicates and operators governing the environment dynamics. Although learned from simple training tasks, these predicates and operators exhibit strong generalization to novel tasks with significantly higher complexity. In the most challenging generalization setting, InterPreT attains success rates of 73% in simulation and 40% in the real world, substantially outperforming baseline methods.
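To make the idea of "predicates as Python functions compiled into a PDDL domain" concrete, here is a minimal sketch, not the authors' implementation. The state format (object name mapped to a 3D position), the predicate `obj_on_obj`, its thresholds, and the helper names `abstract_state` and `pddl_predicates` are all assumptions made for illustration.

```python
from itertools import permutations

# Assumed state format (hypothetical): object name -> (x, y, z) centre in metres.

def obj_on_obj(state, a, b, xy_tol=0.03, z_min=0.01):
    """Hypothetical predicate: True if object `a` rests on top of object `b`."""
    ax, ay, az = state[a]
    bx, by, bz = state[b]
    aligned = abs(ax - bx) <= xy_tol and abs(ay - by) <= xy_tol
    return aligned and (az - bz) >= z_min

def abstract_state(state, predicates):
    """Evaluate each binary predicate on every ordered pair of objects."""
    facts = set()
    for name, fn in predicates.items():
        for a, b in permutations(sorted(state), 2):
            if fn(state, a, b):
                facts.add((name, a, b))
    return facts

def pddl_predicates(predicates):
    """Emit a PDDL (:predicates ...) block for binary predicates."""
    decls = " ".join(f"({name} ?a ?b)" for name in sorted(predicates))
    return f"(:predicates {decls})"

state = {
    "cup": (0.50, 0.20, 0.10),
    "plate": (0.50, 0.20, 0.02),
    "spoon": (0.10, 0.40, 0.02),
}
predicates = {"obj_on_obj": obj_on_obj}
facts = abstract_state(state, predicates)
print(facts)
print(pddl_predicates(predicates))
```

The set of true facts plays the role of the relational abstraction of the environment state; a real system would evaluate predicates on perception output rather than on hand-written coordinates.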

Paper Structure (25 sections, 1 equation, 5 figures, 7 tables)

Introduction
Related Work
Learning Symbolic Representations for Planning
Large Language Models-enabled Planning and Learning
Preliminaries and Problem Setup
Method
Reasoner
Goal-related feedback
Precondition-related feedback
Coder
Corrector
Other Components
Experiments
Experimental Setup
Domain design
...and 10 more sections

Figures (5)

Figure 1: InterPreT learns predicates as Python functions and operators in PDDL from human language feedback during embodied interaction. A PDDL planner can then use the learned predicates and operators to plan for unseen tasks involving more objects and novel goals.
Figure 2: The system architecture of InterPreT. (a) We design three GPT-4-enabled modules that operate sequentially to identify planning-oriented predicates and generate the predicate functions based on language feedback. (b) An example predicate function learned. (c) With the learned predicates, we learn PDDL operators with a cluster-then-search algorithm. (d) The learned predicates and operators enable effective task planning, after we translate language goals into symbolic goals with GPT-4.
Figure 3: Simulated and real-world domains used in the experiments. We show example training tasks of all five domains in (a) and demonstrate the design of the 4 test sets in the real-world SetTable domain in (b). In More objects and Combined, an unseen object "spoon" introduces additional generalization challenges.
Figure 4: Visualization of one training run in the simulated StoreObjects domain. The total number of learned predicates increases by 2 for each labeled predicate because we also learn its negation. We provide the predicate function of obj_in_gripper as an in-context example, known at Step 0. We label the learned operators with semantic names based on their interpreted meanings.
Figure 5: Real-robot evaluations in the real-world StoreObjects and SetTable domains. We train InterPreT once on 10 tasks and test on 5 tasks per test set. Note that predicate learning in SetTable is bootstrapped from the predicates learned in StoreObjects.
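
Figure 2(c) mentions learning PDDL operators with a cluster-then-search algorithm. The following is a much-simplified sketch of what the clustering step might look like, not the paper's actual algorithm. It assumes that transitions are grouped by action name and by their lifted add and delete effects. The predicate `on_table`, the tuple layout of a transition, and the function names `lift` and `cluster_by_effects` are hypothetical; only `obj_in_gripper` appears in the source.

```python
from collections import defaultdict

def lift(facts, args):
    """Replace object names with variables ?x0, ?x1, ... by action-argument position."""
    var = {obj: f"?x{i}" for i, obj in enumerate(args)}
    return frozenset((f[0],) + tuple(var.get(o, o) for o in f[1:]) for f in facts)

def cluster_by_effects(transitions):
    """Group (action, args, before, after) transitions by action and lifted effects."""
    clusters = defaultdict(list)
    for action, args, before, after in transitions:
        add_eff = lift(after - before, args)   # facts that became true
        del_eff = lift(before - after, args)   # facts that became false
        clusters[(action, add_eff, del_eff)].append((args, before))
    return clusters

# Two hypothetical pick transitions over symbolic states (sets of ground facts).
transitions = [
    ("pick", ("cup",),
     {("on_table", "cup")},
     {("obj_in_gripper", "cup")}),
    ("pick", ("spoon",),
     {("on_table", "spoon"), ("on_table", "cup")},
     {("obj_in_gripper", "spoon"), ("on_table", "cup")}),
]
clusters = cluster_by_effects(transitions)
for (action, add_eff, del_eff), members in clusters.items():
    print(action, sorted(add_eff), sorted(del_eff), len(members))
```

Each cluster corresponds to one candidate operator with known effects. Under this assumption, a subsequent search step would then look for preconditions consistent with the pre-states stored in each cluster.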
