NeSIG: A Neuro-Symbolic Method for Learning to Generate Planning Problems
Carlos Núñez-Molina, Pablo Mesejo, Juan Fernández-Olivares
TL;DR
NeSIG is a domain-independent method for automatically generating planning problems that are valid, diverse, and difficult. It formulates problem generation as a Markov Decision Process and trains two neural-logic policies with reinforcement learning, one for initial-state generation and one for goal generation. Both policies use Neural Logic Machines to perform inductive logic reasoning, and generation is guided by a semi-declarative consistency language. Key contributions include a formalization of the three problem properties (validity, diversity, difficulty), an MDP reward structure that balances them, and empirical results on three classical domains showing problems 15.5 times more difficult (geometric mean) than those produced by domain-specific generators, while maintaining diversity and generalizing to problem sizes larger than those seen during training. The approach reduces human design effort and has potential applications in curriculum generation and adversarial problem generation for planning.
Abstract
Automated Planning often requires a set of planning problems from a particular domain, e.g., to serve as training data for Machine Learning or as benchmarks in planning competitions. In most cases, these problems are created either by hand or by a domain-specific generator, which places a burden on human designers. In this paper we propose NeSIG, to the best of our knowledge the first domain-independent method for automatically generating planning problems that are valid, diverse and difficult to solve. We formulate problem generation as a Markov Decision Process and train two generative policies with Deep Reinforcement Learning to generate problems with the desired properties. We conduct experiments on three classical domains, comparing our approach against handcrafted, domain-specific instance generators and various ablations. Results show that NeSIG automatically generates valid and diverse problems of much greater difficulty (15.5 times greater, by geometric mean) than those of domain-specific generators, while requiring less human effort. Additionally, it generalizes to problems larger than those seen during training.
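To make the "geometric mean" aggregation of difficulty ratios concrete, the following is a minimal sketch of how such a summary figure can be computed from per-domain ratios. The function name `geometric_mean` and the ratios in `example_ratios` are hypothetical placeholders chosen for illustration; they are not the paper's measurements, and the paper's exact aggregation procedure may differ.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed in log space."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-domain difficulty ratios (learned generator difficulty
# divided by domain-specific generator difficulty). Placeholder values only,
# NOT results from the paper.
example_ratios = [2.0, 8.0, 4.0]

print(f"geometric mean ratio: {geometric_mean(example_ratios):.2f}")
print(f"arithmetic mean ratio: {sum(example_ratios) / len(example_ratios):.2f}")
```

The geometric mean is a common choice for averaging ratios because it is less dominated by a single domain with a very large ratio than the arithmetic mean is.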
