Continual learning and refinement of causal models through dynamic predicate invention

Enrique Crespo-Fernandez; Oliver Ray; Telmo de Menezes e Silva Filho; Peter Flach

TL;DR

This work proposes a framework for constructing symbolic causal world models entirely online. It integrates continuous model learning and repair into the agent's decision loop and uses Meta-Interpretive Learning with predicate invention to find semantically meaningful, reusable abstractions.

Abstract

Efficiently navigating complex environments requires agents to internalize the underlying logic of their world, yet standard world modelling methods often struggle with sample inefficiency, lack of transparency, and poor scalability. We propose a framework for constructing symbolic causal world models entirely online by integrating continuous model learning and repair into the agent's decision loop. The framework uses Meta-Interpretive Learning and predicate invention to find semantically meaningful and reusable abstractions, allowing an agent to construct a hierarchy of disentangled, high-quality concepts from its observations. We demonstrate that our lifted inference approach scales to domains with complex relational dynamics, where propositional methods suffer from combinatorial explosion, while achieving sample efficiency orders of magnitude higher than the established PPO neural-network-based baseline.

Paper Structure (15 sections, 7 equations, 3 figures, 2 algorithms)

Introduction
Method
Problem Definition
Metarule-Guided Hypothesis construction and refinement
Complexity Analysis
Experiments
Online Model Repair and Convergence
Sample Efficiency and Scale Invariance
Semantic alignment and Interpretability
Related Work
Conclusion
Theoretical Guarantees
Lattice-Based Refinement
Completeness and Expressivity
Algorithms

Figures (3)

Figure 1: Visualization of a selected set of rules from the learnt symbolic causal model on the MiniHack 'Lava Crossing' task. The environment is modelled as a hierarchy of interpretable concepts, transition rules, and physical constraints. (Left) State Interpretation: Colored overlays illustrate how the Learnt Abstractions (Right) ground to specific regions of the state space. The agent ($p4$, purple) senses its neighborhood ($p3$, cyan). The concept of "moving" ($p2$, orange) is reused in two rules: the one modelling movement ($p1$) and the one modelling death ($p5$). (Center) Dynamics & Constraints: The Learnt Dynamics use these high-level abstractions to predict state evolution. For example, the dying rule is triggered only when the abstract condition $p5$ (moving into lava) is met. Learnt Constraints enforce physical consistency, such as mutual exclusion ($\otimes$), ensuring the agent cannot be simultaneously alive and dead or occupy multiple coordinates.
Figure 2: Our system vs PPO on the 10$\times$10 grid version of the MiniHack 'Lava Crossing' task. (a) Our system converges to 43 clauses (28 abstractions, 13 dynamics, 2 constraints) by step 23. (b) Our system first reaches the goal at episode 2 and from then on navigates to it consistently, relying on its learnt model; PPO requires 129 episodes for its first success and does not start to converge until episode 300.
Figure 3: Lattice traversal dynamics. Prediction errors trigger Generalization (moving to $\top$) or Specialization (pruning to $H$).
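
The error-driven refinement described in the Figure 3 caption can be illustrated with a deliberately simplified, propositional sketch. It is not the paper's algorithm, which is lifted and metarule-guided; it only shows the general idea that a missed effect pushes a rule body towards $\top$ (fewer conditions) and a wrongly predicted effect pushes it away from $\top$ (more conditions). All names here (`fires`, `generalize`, `specialize`, `repair`, and literals such as `lava_ahead`) are hypothetical and chosen for illustration only.

```python
from typing import FrozenSet, List

# A state and a rule body are both sets of ground literals (toy assumption).
Literals = FrozenSet[str]


def fires(body: Literals, state: Literals) -> bool:
    """A rule predicts its effect when every body literal holds in the state."""
    return body <= state


def generalize(body: Literals, state: Literals) -> Literals:
    """Move towards the top of the lattice: drop literals false in a missed positive."""
    return body & state


def specialize(body: Literals, state: Literals, positives: List[Literals]) -> Literals:
    """Move away from the top: add one literal that all stored positives share
    but that is false in the state where the rule fired wrongly."""
    common = frozenset.intersection(*positives) if positives else frozenset()
    candidates = sorted(common - state - body)  # sorted for determinism
    return body | {candidates[0]} if candidates else body


def repair(body: Literals, state: Literals, effect_observed: bool,
           positives: List[Literals]) -> Literals:
    """Revise the rule body only when prediction and observation disagree."""
    predicted = fires(body, state)
    if effect_observed and not predicted:
        return generalize(body, state)
    if predicted and not effect_observed:
        return specialize(body, state, positives)
    return body


# Hypothetical "dying" rule that starts out too specific.
too_specific = frozenset({"moving", "lava_ahead", "facing_north"})
death_facing_east = frozenset({"moving", "lava_ahead", "facing_east"})
generalized = repair(too_specific, death_facing_east, True, [])

# Hypothetical "dying" rule that starts out too general.
too_general = frozenset({"moving"})
safe_move = frozenset({"moving", "floor_ahead"})
specialized = repair(too_general, safe_move, False,
                     [too_specific, death_facing_east])
```

In this toy run both repairs end at the same body, `{"moving", "lava_ahead"}`: the first drops the irrelevant `facing_north` condition, and the second adds `lava_ahead` because every stored positive example contains it while the safe move does not.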
