Heuristic Adaptation of Potentially Misspecified Domain Support for Likelihood-Free Inference in Stochastic Dynamical Systems

Georgios Kamaras, Craig Innes, Subramanian Ramamoorthy

TL;DR

This paper addresses the problem that a fixed domain support in likelihood-free inference (LFI) can yield suboptimal and overconfident posteriors in sim-to-real robotics tasks. It introduces three heuristic BayesSim variants (EDGE, MODE, and CENTRE) that adapt the parameter support during iterative LFI, and demonstrates their effects on two stochastic benchmarks (Lotka-Volterra and M/G/1) as well as a high-dimensional deformable linear object (DLO) whipping task within a Real2Sim2Real framework. The EDGE variant, which reacts to posterior mass near the domain edges, emerges as the most robust across tasks, improving both posterior fidelity and domain sampling for downstream domain randomisation in RL. Empirically, adaptive support leads to stronger object-centric policy adaptation and improved real-world performance on several DLO instances, while also highlighting the need for careful tuning and a feasible domain definition. The work suggests a path toward relaxing prior assumptions in LFI and integrating adaptive support with BED/BO-inspired design, with future directions including automatic feasibility checks and more efficient inference steps.

Abstract

In robotics, likelihood-free inference (LFI) can provide the domain distribution that adapts a learnt agent in a parametric set of deployment conditions. LFI assumes an arbitrary support for sampling, which remains constant as the initial generic prior is iteratively refined to more descriptive posteriors. However, a potentially misspecified support can lead to suboptimal, yet falsely certain, posteriors. To address this issue, we propose three heuristic LFI variants: EDGE, MODE, and CENTRE. Each interprets the posterior mode shift over inference steps in its own way and, when integrated into an LFI step, adapts the support alongside posterior inference. We first expose the support misspecification issue and evaluate our heuristics using stochastic dynamical benchmarks. We then evaluate the impact of heuristic support adaptation on parameter inference and policy learning for a dynamic deformable linear object (DLO) manipulation task. Inference results in a finer length and stiffness classification for a parametric set of DLOs. When the resulting posteriors are used as domain distributions for sim-based policy learning, they lead to more robust object-centric agent performance.
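
To make the idea of support adaptation concrete, the following is a minimal sketch in the spirit of an edge-based heuristic: if too many posterior samples fall in a thin band next to a bound of a box-shaped support, that bound is pushed outwards. It is not the paper's EDGE algorithm. The function name `adapt_support_edge` and the parameters `margin`, `mass_threshold`, and `stretch` are hypothetical and chosen only for illustration.

```python
def adapt_support_edge(samples, low, high,
                       margin=0.1, mass_threshold=0.2, stretch=0.25):
    """Stretch the bounds of a box support where posterior samples pile up.

    Hypothetical illustration only; all thresholds are assumptions.
    samples: list of parameter tuples drawn from the current posterior.
    low, high: per-dimension lower and upper bounds of the support.
    """
    new_low, new_high = list(low), list(high)
    n = len(samples)
    for d in range(len(low)):
        width = high[d] - low[d]
        band = margin * width
        # Fraction of posterior samples within the band next to each bound.
        near_low = sum(s[d] <= low[d] + band for s in samples) / n
        near_high = sum(s[d] >= high[d] - band for s in samples) / n
        # Stretch a bound only when enough mass accumulates next to it.
        if near_low > mass_threshold:
            new_low[d] = low[d] - stretch * width
        if near_high > mass_threshold:
            new_high[d] = high[d] + stretch * width
    return new_low, new_high


# Dimension 0 crowds the upper bound, dimension 1 sits mid-range.
posterior_samples = [(0.95 + 0.0005 * i, 0.5) for i in range(100)]
adapted_low, adapted_high = adapt_support_edge(
    posterior_samples, low=[0.0, 0.0], high=[1.0, 1.0])
```

In this toy case only the upper bound of the first dimension moves. In an iterative LFI loop, the adapted bounds would then define the sampling region for the next inference step.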

Paper Structure

This paper contains 49 sections, 4 equations, 18 figures, 5 tables, 5 algorithms.

Figures (18)

  • Figure 1: The support misspecification issue for a visuomotor DLO whipping task (left, timelapse). In the centre (red boxes), we see two cases of how LFI struggles to accurately infer the Young's modulus and length posterior on a misspecified support, and the implications for domain samples drawn from the posterior. On the right (green boxes), we see how an adapted support leads to a more accurate inference result and more descriptive sets of domain samples. Orange arrows denote the accumulation of posterior density on domain boundaries, which signals potential misspecification. Blue arrows denote the corresponding adaptations, which stretch the domain.
  • Figure 2: Sequential refinement of Bayesian posterior approximations. Each posterior $\hat{p}_t$ is used to sample new parameters, generate simulations, and retrain the density estimator. This adaptive loop densifies coverage in high-likelihood regions.
  • Figure 3: Population counts over $30$ timesteps of $4$ sample Lotka-Volterra system simulations, recorded with $dt=0.2$, for $(X, Y) = (50, 100)$ and $(\theta_1, \theta_2, \theta_3, \theta_4) = (0.01, 0.5, 1.0, 0.01)$, demonstrating stochastic variability in the system's oscillatory behaviour due to the MJP formulation.
  • Figure 4: Interdeparture time (idt) analysis for $3$ sample M/G/1 system simulations for $50$ jobs, demonstrating stochastic variability in jobs' completion.
  • Figure 5: The misspecified support issue and the difficulty of compensating with a broader support for Lotka-Volterra. We plot the progression of prior samples (top) and their respective MoG posteriors (bottom) along $10$ inference iterations. In the prior-sample scatterplots, the heatmap indicates the iteration in which each sample was drawn: the lighter the colour, the later the iteration. The bounding box of each iteration's samples is plotted in dashed lines of the respective colour. The bigger dots mark the accumulated dataset's mean in each inference iteration, and a black dashed line traces this trajectory. In the MoG heatmaps, the MoG progression is annotated with arrows pointing to the position of the components' weighted mean for each iteration, and colorbars quantify likelihood. For coherence, we plot only the last iteration's posterior.
  • ...and 13 more figures
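
The stochastic variability described in the Figure 3 caption can be illustrated with a small Gillespie-style simulation of a Lotka-Volterra Markov jump process. This is a sketch under assumptions: the mapping of $\theta_1, \ldots, \theta_4$ to reactions (predator birth, predator death, prey birth, prey consumption, with $X$ as predators and $Y$ as prey) follows a formulation commonly used for LFI benchmarks and is not confirmed by the text above. The function name `simulate_lotka_volterra` and the `max_events` safeguard are hypothetical.

```python
import random


def simulate_lotka_volterra(theta, x0=50, y0=100, dt=0.2, n_steps=30,
                            seed=None, max_events=100_000):
    """Gillespie simulation recording (X, Y) counts every dt time units.

    Assumed reactions (not confirmed by the source):
      X+1 at rate theta1*X*Y, X-1 at rate theta2*X,
      Y+1 at rate theta3*Y,   Y-1 at rate theta4*X*Y.
    """
    rng = random.Random(seed)
    t1, t2, t3, t4 = theta
    x, y = x0, y0
    t, next_record = 0.0, 0.0
    counts, events = [], 0
    while len(counts) < n_steps:
        rates = (t1 * x * y, t2 * x, t3 * y, t4 * x * y)
        total = sum(rates)
        if total == 0.0 or events >= max_events:
            # No reaction can fire (or the event cap was hit): hold the state.
            counts.extend([(x, y)] * (n_steps - len(counts)))
            break
        tau = rng.expovariate(total)  # waiting time until the next reaction
        # The current state persists until t + tau, so record it at every
        # recording time that falls inside this interval.
        while next_record <= t + tau and len(counts) < n_steps:
            counts.append((x, y))
            next_record += dt
        t += tau
        event = rng.choices(range(4), weights=rates)[0]
        if event == 0:
            x += 1
        elif event == 1:
            x -= 1
        elif event == 2:
            y += 1
        else:
            y -= 1
        events += 1
    return counts


theta = (0.01, 0.5, 1.0, 0.01)
# Four runs with different seeds, mirroring the four samples in Figure 3.
runs = [simulate_lotka_volterra(theta, seed=s) for s in range(4)]
```

Each run starts from the same initial counts and parameters, so any difference between the trajectories comes from the random reaction times and reaction choices alone.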