ILCL: Inverse Logic-Constraint Learning from Temporally Constrained Demonstrations
Minwoo Cho, Jaehwi Jang, Daehyung Park
TL;DR
ILCL reframes temporal-constraint learning as a two-player zero-sum game between GA-TL-Mining and Logic-CRL to recover transferable truncated linear temporal logic (TLTL) constraints from demonstrations. GA-TL-Mining mines TLTL syntax trees in free form, without predefined templates, while Logic-CRL trains policies that maximize rewards under the discovered TLTL constraints, using a PCMDP and a constraint-redistribution scheme to handle non-Markovian, sparse constraint evaluations. Across four simulated benchmarks and a real-world peg-in-shallow-hole transfer, ILCL yields the lowest constraint-violation rates with expert-like rewards and generalizes to unseen environments; ablations confirm that constraint redistribution is necessary. The learned constraints are interpretable and transferable, which supports constrained, high-reward robot behavior across tasks and in real-world settings.
Abstract
We address the problem of learning temporal constraints from demonstrations in order to reproduce demonstration-like, logic-constrained behaviors. Learning logic constraints is challenging due to the combinatorially large space of possible specifications and the ill-posed nature of non-Markovian constraints. To address these challenges, we introduce a novel temporal-constraint learning method, which we call inverse logic-constraint learning (ILCL). Our method frames inverse constraint learning (ICL) as a two-player zero-sum game between 1) genetic algorithm-based temporal-logic mining (GA-TL-Mining) and 2) logic-constrained reinforcement learning (Logic-CRL). GA-TL-Mining efficiently constructs syntax trees for parameterized truncated linear temporal logic (TLTL) without predefined templates. Subsequently, Logic-CRL finds a policy that maximizes task rewards under the constructed TLTL constraints via a novel constraint redistribution scheme. Our evaluations show that ILCL outperforms state-of-the-art baselines in learning and transferring temporal-logic (TL) constraints on four temporally constrained tasks. We also demonstrate successful transfer to real-world peg-in-shallow-hole tasks.
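To make the notion of a TLTL syntax tree concrete, the following is a minimal sketch of how a temporal constraint can be represented as a tree and scored on a finite trajectory. It is not the authors' implementation: the class names (`Pred`, `Not`, `And`, `Eventually`, `Always`), the `robustness` function, and the example formula are all hypothetical, and the scoring uses the commonly used min/max quantitative semantics for temporal logic over finite traces, which is assumed here rather than taken from the paper. The single end-of-trajectory score also illustrates why such constraints are non-Markovian and sparse.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union

State = Sequence[float]

# Hypothetical node types for a small temporal-logic syntax tree.
@dataclass(frozen=True)
class Pred:
    """Atomic predicate f(s) < c, with robustness c - f(s)."""
    f: Callable[[State], float]
    c: float

@dataclass(frozen=True)
class Not:
    child: "Node"

@dataclass(frozen=True)
class And:
    left: "Node"
    right: "Node"

@dataclass(frozen=True)
class Eventually:
    child: "Node"

@dataclass(frozen=True)
class Always:
    child: "Node"

Node = Union[Pred, Not, And, Eventually, Always]

def robustness(node: Node, traj: Sequence[State], t: int = 0) -> float:
    """Score how strongly traj[t:] satisfies the formula (positive = satisfied)."""
    if isinstance(node, Pred):
        return node.c - node.f(traj[t])
    if isinstance(node, Not):
        return -robustness(node.child, traj, t)
    if isinstance(node, And):
        return min(robustness(node.left, traj, t),
                   robustness(node.right, traj, t))
    if isinstance(node, Eventually):
        # Best score over the remaining steps of the finite trace.
        return max(robustness(node.child, traj, k) for k in range(t, len(traj)))
    if isinstance(node, Always):
        # Worst score over the remaining steps of the finite trace.
        return min(robustness(node.child, traj, k) for k in range(t, len(traj)))
    raise TypeError(f"unknown node type: {type(node).__name__}")

# Illustrative constraint on a 1-D state x: "always x < 3, and eventually x >= 1".
x = lambda s: s[0]
phi = And(Always(Pred(x, 3.0)), Eventually(Not(Pred(x, 1.0))))

good = [[0.0], [1.5], [2.5]]   # stays below 3 and passes 1
bad = [[0.0], [3.5]]           # exceeds 3 at the second step

print(robustness(phi, good))   # positive: constraint satisfied
print(robustness(phi, bad))    # negative: constraint violated
```

In this sketch the thresholds `3.0` and `1.0` play the role of formula parameters, and the tree shape plays the role of formula structure; a mining procedure in the spirit of the one described above would search over both, whereas here they are fixed by hand.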
