NeSIG: A Neuro-Symbolic Method for Learning to Generate Planning Problems

Carlos Núñez-Molina, Pablo Mesejo, Juan Fernández-Olivares

TL;DR

NeSIG introduces a domain-independent method for automatically generating planning problems that are valid, diverse, and difficult by formulating problem generation as a Markov Decision Process and training two neural-logic policies with reinforcement learning. It leverages Neural Logic Machines to encode inductive logic reasoning for both initial-state and goal generation, guided by a semi-declarative consistency language. Key contributions include a formalization of problem properties (validity, diversity, difficulty), an MDP reward structure that balances these properties, and strong empirical results showing substantially higher difficulty than domain-specific generators while maintaining diversity and generalizing well to larger problem sizes. The approach reduces human design effort and works across multiple classical planning domains, with potential applications to curriculum generation and adversarial problem generation in planning.

Abstract

In the field of Automated Planning there is often the need for a set of planning problems from a particular domain, e.g., to be used as training data for Machine Learning or as benchmarks in planning competitions. In most cases, these problems are created either by hand or by a domain-specific generator, putting a burden on the human designers. In this paper we propose NeSIG, to the best of our knowledge the first domain-independent method for automatically generating planning problems that are valid, diverse and difficult to solve. We formulate problem generation as a Markov Decision Process and train two generative policies with Deep Reinforcement Learning to generate problems with the desired properties. We conduct experiments on three classical domains, comparing our approach against handcrafted, domain-specific instance generators and various ablations. Results show NeSIG is able to automatically generate valid and diverse problems of much greater difficulty (15.5 times more on geometric average) than domain-specific generators, while simultaneously reducing human effort when compared to them. Additionally, it can generalize to larger problems than those seen during training.
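The abstract reports the difficulty gain as a geometric average across domains. As a minimal sketch of how such an aggregate can be computed, the snippet below takes per-domain difficulty ratios (generated difficulty divided by baseline difficulty) and returns their geometric mean. The ratios and domain names are hypothetical placeholders chosen for easy arithmetic; they are not the paper's measurements.

```python
import math
from typing import Sequence


def geometric_mean(ratios: Sequence[float]) -> float:
    """Geometric mean of strictly positive ratios, computed in log space."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-domain difficulty ratios (NOT values from the paper).
ratios = {"domain_a": 2.0, "domain_b": 8.0, "domain_c": 32.0}

gm = geometric_mean(list(ratios.values()))    # cube root of 2 * 8 * 32 = 512
am = sum(ratios.values()) / len(ratios)       # arithmetic mean, for contrast
```

With these placeholder values the geometric mean is 8 while the arithmetic mean is 14, which illustrates why a geometric average is the more conservative summary when ratios span orders of magnitude.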

Paper Structure (21 sections, 4 figures, 5 tables)

Figures (4)

  • Figure 1: NeSIG. a) Architecture overview. NeSIG receives as inputs a PDDL domain, several consistency rules and some extra information (maximum problem size and goal types and predicates). It then trains two generative policies with RL (see subfigure b) so that they learn to generate valid, diverse and difficult problems for the domain provided as input. b) Policy training with RL. Dashed lines represent the application of several MDP actions, corresponding to adding an atom to the initial state in the case of the initial state policy (see subfigure c), or executing a domain action in the goal state in the case of the goal policy (see subfigure d). Dotted lines indicate the reward signal, accounting for the consistency $r_c$, diversity $r_v$ and difficulty $r_f$ of the problems generated. c) Initial state policy. It receives an MDP state $(s_{ic}, \_)$ corresponding to a partially-generated initial state and selects the next atom to add to $s_{ic}$. d) Goal policy. It receives an MDP state $(s_i, s_{gc})$ representing a complete initial state but a partially-generated goal state and selects the next domain action to execute in $s_{gc}$.
  • Figure 2: Problem size generalization results. The plots show the mean difficulty (in log scale) obtained by NeSIG across five different seeds, when tested on larger (and smaller) problems than those seen during training. We also plot the problem difficulty of the domain-specific generators (ad hoc models) for comparison purposes. In blocksworld and logistics, problem size is measured as the maximum number of atoms allowed in the initial state $s_i$. In sokoban, it is measured by the map size $N \times M$. The maximum number of initial state and goal actions used by NeSIG for each problem size, along with the parameters of the ad hoc models, are detailed in the Appendix.
  • Figure 3: Problem difficulty for A*+LM-cut. The plots show the mean difficulty (in log scale), measured as the number of expanded nodes, obtained by NeSIG and ad hoc models with the A*+LM-cut optimal planning algorithm.
  • Figure 4: Problem difficulty for FDSS. The plots show the mean difficulty (in log scale), measured as the planning time in seconds, obtained by NeSIG and ad hoc models with the FDSS optimal planning portfolio.
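The two-phase generation process described in the Figure 1 caption can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the light-switch atoms, the toggle actions and the uniformly random policies are all hypothetical stand-ins for a PDDL domain and the trained policies, and the reward terms $r_c$, $r_v$ and $r_f$ are not modeled.

```python
import random
from typing import Callable, FrozenSet, Optional, Sequence, Tuple

Atom = str
State = FrozenSet[Atom]
Action = Callable[[State], State]

# Hypothetical toy domain: three light switches, each atom means "light is on".
ATOMS: Tuple[Atom, ...] = ("on(l1)", "on(l2)", "on(l3)")


def make_toggle(atom: Atom) -> Action:
    """Build a domain action that flips one light."""
    def toggle(state: State) -> State:
        return state - {atom} if atom in state else state | {atom}
    return toggle


ACTIONS: Tuple[Action, ...] = tuple(make_toggle(a) for a in ATOMS)


def random_init_policy(rng: random.Random, s_ic: State,
                       atoms: Sequence[Atom]) -> Optional[Atom]:
    """Placeholder for the initial state policy: pick the next atom to add."""
    remaining = [a for a in atoms if a not in s_ic]
    return rng.choice(remaining) if remaining else None


def random_goal_policy(rng: random.Random, s_i: State, s_gc: State,
                       actions: Sequence[Action]) -> Action:
    """Placeholder for the goal policy: pick the next domain action to run."""
    return rng.choice(actions)


def generate_problem(rng: random.Random, max_atoms: int,
                     max_goal_actions: int) -> Tuple[State, State]:
    # Phase 1: grow the partial initial state s_ic one atom at a time.
    s_ic: State = frozenset()
    for _ in range(max_atoms):
        atom = random_init_policy(rng, s_ic, ATOMS)
        if atom is None:
            break
        s_ic = s_ic | {atom}
    s_i = s_ic

    # Phase 2: start from s_i and apply domain actions to build s_gc.
    s_gc = s_i
    for _ in range(max_goal_actions):
        s_gc = random_goal_policy(rng, s_i, s_gc, ACTIONS)(s_gc)
    return s_i, s_gc


s_i, s_g = generate_problem(random.Random(0), max_atoms=2, max_goal_actions=4)
```

Because the goal state in this sketch is produced only by executing domain actions from `s_i`, it is reachable from the initial state by construction. In a trained system, the random choices would be replaced by learned policies scored with the reward signal.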