Symbolic Manipulation Planning with Discovered Object and Relational Predicates

Alper Ahmetoglu, Erhan Oztop, Emre Ugur

TL;DR

The paper tackles learning symbolic representations from unsupervised perception to enable long-horizon planning. It introduces a Relational DeepSym-based architecture that learns unary object predicates and binary relational predicates, aggregates them with actions, and induces symbolic operators that are translated into PDDL for use with standard planners. Empirical results on tabletop object stacking show that explicit relational symbols improve planning performance and generalize to configurations and object counts beyond the training set, including a real-world demonstration. The work demonstrates a practical integration of learned perceptual knowledge with classical planning, highlighting gains in sample efficiency and scalability for multi-object manipulation tasks.

Abstract

Discovering the symbols and rules that can be used in long-horizon planning from a robot's unsupervised exploration of its environment and continuous sensorimotor experience is a challenging task. Previous studies proposed learning symbols from single or paired object interactions and planning with these symbols. In this work, we propose a system that learns rules with discovered object and relational symbols that encode an arbitrary number of objects and the relations between them, converts those rules to Planning Domain Definition Language (PDDL), and generates plans that involve the affordances of an arbitrary number of objects to achieve tasks. We validated our system with box-shaped objects of different sizes and showed that the system can develop symbolic knowledge of pick-up, carry, and place operations, taking into account object compounds in different configurations, such as boxes being carried together with the larger box on which they are placed. We also compared our method with state-of-the-art methods and showed that planning with operators defined over relational symbols gives better planning performance than the baselines.
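
The rule-to-PDDL conversion mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Operator` container, the `to_pddl` helper, and the predicate names `z0` and `rel0` are hypothetical, chosen only to show how an operator over one unary object symbol and one binary relational symbol might be rendered as a PDDL action schema with free variables.

```python
from dataclasses import dataclass
from typing import List, Tuple

# A literal is (predicate name, argument variables, is_positive).
Literal = Tuple[str, Tuple[str, ...], bool]


@dataclass
class Operator:
    """Hypothetical container for one learned symbolic operator."""
    name: str
    params: List[str]
    preconditions: List[Literal]
    effects: List[Literal]


def render_literal(pred: str, args: Tuple[str, ...], positive: bool) -> str:
    """Render one literal, wrapping it in (not ...) when it is negative."""
    atom = f"({pred} {' '.join(args)})"
    return atom if positive else f"(not {atom})"


def to_pddl(op: Operator) -> str:
    """Render an operator as a PDDL :action block."""
    pre = " ".join(render_literal(*lit) for lit in op.preconditions)
    eff = " ".join(render_literal(*lit) for lit in op.effects)
    return (
        f"(:action {op.name}\n"
        f"  :parameters ({' '.join(op.params)})\n"
        f"  :precondition (and {pre})\n"
        f"  :effect (and {eff}))"
    )


# Assumed example: a unary symbol z0 on ?o0 and a relational symbol rel0
# that becomes true between ?o1 and ?o0 after the action.
op = Operator(
    name="pick-place-0",
    params=["?o0", "?o1"],
    preconditions=[("z0", ("?o0",), True), ("rel0", ("?o1", "?o0"), False)],
    effects=[("rel0", ("?o1", "?o0"), True)],
)
schema = to_pddl(op)
print(schema)
```

Keeping literals as plain tuples means the same rendering function serves both unary and relational predicates, so an operator over any number of objects reduces to a list of literals over its free variables.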

Paper Structure (17 sections, 8 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: An overview of the proposed method. (Top) The encoder network $\sigma_p$ learns unary symbols over object features, while the self-attention network $\sigma_r$ learns relational symbols. The aggregation in the middle (see Equation \ref{eq:agg}) fuses the unary and relational symbols with the action into a vector representation for each object, which is given as input to the decoder network $g$ to predict the effect of the executed action. (Bottom) After learning operators defined over the learned symbols, AI planners can find a sequence of actions that reaches the goal state from the initial state.
  • Figure 2: An example generated PDDL action schema. This action encodes pick-place($?o1$, 'center', $?o0$, 'right') where $?o0$ and $?o1$ are free variables. The action schema is the 20th most frequent action in the dataset, with 736 occurrences out of 160K samples.
  • Figure 3: Effect prediction errors for different numbers of actions.
  • Figure 4: The planning performance for different numbers of objects over three runs with 100 random problem pairs.
  • Figure 5: Given an initial environment configuration in the first column, our model can generate an action sequence reaching the goal state.
  • ...and 1 more figure
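
The aggregation step described in the Figure 1 caption can be sketched as follows. The paper's actual aggregation equation is not reproduced in this summary, so the form used here is an assumption: each object's own symbol-and-action vector is concatenated with, for every relation, the sum of the symbol-and-action vectors of the objects it is related to. The function name `aggregate` and the toy inputs are hypothetical.

```python
from typing import List


def aggregate(
    unary: List[List[int]],
    relations: List[List[List[int]]],
    actions: List[List[int]],
) -> List[List[int]]:
    """Fuse unary symbols, relational symbols, and actions per object.

    unary[i]        : binary unary-symbol vector of object i
    relations[k][i][j] : 1 if relation k holds from object i to object j
    actions[i]      : action encoding for object i
    ASSUMPTION: this concat-and-sum form is illustrative only; it is not
    the equation from the paper.
    """
    n = len(unary)
    # Own representation: unary symbols concatenated with the action.
    own = [unary[i] + actions[i] for i in range(n)]
    width = len(own[0])
    fused = []
    for i in range(n):
        vec = list(own[i])
        for rel in relations:
            # Sum the representations of all objects related to i under rel.
            summed = [0] * width
            for j in range(n):
                if rel[i][j]:
                    summed = [s + x for s, x in zip(summed, own[j])]
            vec += summed
        fused.append(vec)
    return fused


# Toy input: two objects, one relation that holds from object 0 to object 1.
unary = [[1, 0], [0, 1]]
actions = [[1], [0]]
relations = [[[0, 1], [0, 0]]]
fused = aggregate(unary, relations, actions)
print(fused)
```

Because the sum runs over however many related objects exist, the output width depends only on the number of symbols and relations, not on the number of objects, which is one way a per-object representation can stay fixed-size as the object count grows.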