Active Epistemic Control for Query-Efficient Verified Planning
Shuhui Qu
TL;DR
Active Epistemic Control (AEC) adds a grounded-verification layer for planning under partial observability by strictly separating externally grounded facts $w$ from model-based beliefs $\hat{w}$. A verifier that depends only on grounded evidence certifies feasibility, while epistemic actions (querying the environment or simulating with the world model) prune candidate plans without letting predictions become truth. Uncertain predictions are discretized with an ambiguity margin, and plan compatibility is checked with a pullback-based categorical consistency check (SQ-BCP). With this design, AEC plans with fewer replanning rounds on ALFWorld and remains robust to distribution shifts on ScienceWorld. The approach acts as a principled, model-agnostic safety gate on top of LLM planners and world models, reducing the risk of silent infeasibility while preserving the efficiency gains of prediction-based pruning. Across these embodied planning benchmarks, AEC offers a practical path to reliable planning under uncertainty.
Abstract
Planning in interactive environments is challenging under partial observability: task-critical preconditions (e.g., object locations or container states) may be unknown at decision time, yet grounding them through interaction is costly. Learned world models can cheaply predict missing facts, but prediction errors can silently induce infeasible commitments. We present \textbf{Active Epistemic Control (AEC)}, an epistemic-categorical planning layer that integrates model-based belief management with categorical feasibility checks. AEC maintains a strict separation between a \emph{grounded fact store} used for commitment and a \emph{belief store} used only for pruning candidate plans. At each step, it either queries the environment to ground an unresolved predicate when uncertainty is high or predictions are ambiguous, or simulates the predicate to filter hypotheses when confidence is sufficient. Final commitment is gated by grounded precondition coverage and an SQ-BCP pullback-style compatibility check, so simulated beliefs affect efficiency but cannot directly certify feasibility. Experiments on ALFWorld and ScienceWorld show that AEC achieves competitive success with fewer replanning rounds than strong LLM-agent baselines.
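
The query-versus-simulate rule and the grounded-only commitment gate described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `EpistemicState`, `discretize`, `resolve`, and `can_commit`, the 0.5-centered ambiguity band, and the default margin of 0.2 are all assumptions made for illustration. The sketch also collapses "uncertainty is high" and "predictions are ambiguous" into a single margin test, and it omits the SQ-BCP pullback-style compatibility check entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class EpistemicState:
    # Facts observed from the environment; the only store used for commitment.
    grounded: Dict[str, bool] = field(default_factory=dict)
    # World-model predictions; used only to prune candidate plans.
    beliefs: Dict[str, bool] = field(default_factory=dict)


def discretize(p_true: float, margin: float) -> Optional[bool]:
    """Map a predicted probability to True/False, or None inside the ambiguity band.

    The band [0.5 - margin, 0.5 + margin] is an illustrative assumption.
    """
    if p_true >= 0.5 + margin:
        return True
    if p_true <= 0.5 - margin:
        return False
    return None


def resolve(
    state: EpistemicState,
    predicate: str,
    p_true: float,
    query_env: Callable[[str], bool],
    margin: float = 0.2,
) -> str:
    """Ground the predicate by querying if the prediction is ambiguous, else simulate it."""
    if predicate in state.grounded:
        return "grounded"
    label = discretize(p_true, margin)
    if label is None:
        # Ambiguous prediction: pay the interaction cost and record a grounded fact.
        state.grounded[predicate] = query_env(predicate)
        return "query"
    # Confident prediction: record it as a belief only, never as a grounded fact.
    state.beliefs[predicate] = label
    return "simulate"


def can_commit(state: EpistemicState, preconditions: Dict[str, bool]) -> bool:
    """Commitment gate: every precondition must be grounded with the required value.

    Beliefs are deliberately ignored here, so a prediction cannot certify feasibility.
    """
    return all(state.grounded.get(p) == v for p, v in preconditions.items())
```

In this sketch, a confidently predicted predicate lands in `beliefs` and can filter hypotheses, but `can_commit` still returns `False` for it until the same predicate has been grounded through a query.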
