Heuristic Adaptation of Potentially Misspecified Domain Support for Likelihood-Free Inference in Stochastic Dynamical Systems
Georgios Kamaras, Craig Innes, Subramanian Ramamoorthy
TL;DR
This paper addresses the problem that a fixed domain support in likelihood-free inference (LFI) can yield suboptimal and overconfident posteriors in sim-to-real robotics tasks. It introduces three heuristic BayesSim variants (EDGE, MODE, and CENTRE) that adapt the parameter support during iterative LFI, and demonstrates their effects on two stochastic benchmarks (Lotka-Volterra and M/G/1) as well as a high-dimensional deformable linear object (DLO) whipping task within a Real2Sim2Real framework. The EDGE variant, based on posterior mass near domain edges, emerges as the most robust across tasks, improving both posterior fidelity and domain sampling for downstream domain randomisation in RL. Empirically, adaptive support leads to stronger object-centric policy adaptation and improved real-world performance on several DLO instances, while also highlighting the need for careful tuning and a feasible domain definition. The work suggests a path toward relaxing prior assumptions in LFI and integrating adaptive support with BED/BO-inspired design, with future directions including automatic feasibility checks and more efficient inference steps.
Abstract
In robotics, likelihood-free inference (LFI) can provide the domain distribution that adapts a learnt agent in a parametric set of deployment conditions. LFI assumes an arbitrary support for sampling, which remains constant as the initial generic prior is iteratively refined to more descriptive posteriors. However, a potentially misspecified support can lead to suboptimal, yet falsely certain, posteriors. To address this issue, we propose three heuristic LFI variants: EDGE, MODE, and CENTRE. Each interprets the posterior mode shift over inference steps in its own way and, when integrated into an LFI step, adapts the support alongside posterior inference. We first expose the support misspecification issue and evaluate our heuristics using stochastic dynamical benchmarks. We then evaluate the impact of heuristic support adaptation on parameter inference and policy learning for a dynamic deformable linear object (DLO) manipulation task. Inference results in a finer length and stiffness classification for a parametric set of DLOs. When the resulting posteriors are used as domain distributions for sim-based policy learning, they lead to more robust object-centric agent performance.
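To make the idea of adapting the support alongside posterior inference more concrete, the following is a minimal sketch of an edge-mass rule in one dimension. It is not the authors' EDGE algorithm, whose details are not given here: the function name `adapt_support_edge`, the parameters `edge_frac`, `mass_thresh`, and `grow`, and the specific widening rule are all assumptions chosen for illustration. The sketch only shows the general pattern of checking how much posterior mass sits near a support boundary and widening the support on that side.

```python
import numpy as np


def adapt_support_edge(samples, low, high, edge_frac=0.1, mass_thresh=0.3, grow=0.5):
    """Hypothetical edge-mass rule for a 1-D support [low, high].

    If more than `mass_thresh` of the posterior samples fall within
    `edge_frac` of the support width from a boundary, that boundary is
    pushed outwards by `grow` times the current width. All parameter
    names and defaults are illustrative assumptions.
    """
    samples = np.asarray(samples, dtype=float)
    width = high - low
    band = edge_frac * width

    # Fraction of posterior samples inside each boundary band.
    mass_low = np.mean(samples < low + band)
    mass_high = np.mean(samples > high - band)

    # Widen only the side(s) where mass piles up against the edge.
    new_low = low - grow * width if mass_low > mass_thresh else low
    new_high = high + grow * width if mass_high > mass_thresh else high
    return float(new_low), float(new_high)


rng = np.random.default_rng(0)

# Posterior concentrated against the upper edge of a [0, 1] support.
edge_samples = np.clip(rng.normal(0.98, 0.05, size=1000), 0.0, 1.0)
edge_support = adapt_support_edge(edge_samples, 0.0, 1.0)

# Posterior well inside the support: the support is left unchanged.
inner_samples = np.clip(rng.normal(0.5, 0.05, size=1000), 0.0, 1.0)
inner_support = adapt_support_edge(inner_samples, 0.0, 1.0)
```

In an iterative LFI loop, a rule of this kind would be applied after each posterior update, so that the next round of simulations samples from the adapted support. Any widened support would still need to be checked against what the simulator can feasibly represent.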
