Normalizing Flows are Capable Models for Bi-manual Visuomotor Policy
Jialong Li, Simon Kristoffersson Lind, Wenrui Xie, Maj Stenmark, Volker Krüger
TL;DR
We introduce Normalizing Flows Policy (NF-P), a conditional normalizing flow-based visuomotor policy for bi-manual manipulation. NF-P learns a conditional density over action sequences, which enables single-pass generative sampling with tractable likelihood computation.
Abstract
The field of general-purpose robotics has recently embraced powerful probabilistic diffusion-based models for learning complex embodied behaviours. However, these models often come with significant trade-offs, namely high computational cost at inference and a fundamental inability to quantify output uncertainty. We introduce Normalizing Flows Policy (NF-P), a conditional normalizing flow-based visuomotor policy for bi-manual manipulation. NF-P learns a conditional density over action sequences and enables single-pass generative sampling with tractable likelihood computation. Exploiting the tractable likelihood, we propose two inference-time optimization strategies: Stochastic Batch Selection, which selects the highest-likelihood trajectory among sampled candidates, and Gradient Refinement, which directly ascends the log-likelihood to improve action quality. In both simulation and real-robot experiments, NF-P achieves promising success rates relative to the baseline. In addition to improved task performance, NF-P demonstrates faster training and lower inference latency. These results establish normalizing flows as a competitive and computationally efficient class of visuomotor policies, particularly for real-time, uncertainty-aware robotic control.
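
The two inference-time strategies named in the abstract can be illustrated with a minimal sketch. It is not the NF-P architecture: the class `ToyConditionalAffineFlow`, its linear conditioning weights, and the helper names `stochastic_batch_selection` and `gradient_refinement` are hypothetical, and the flow is a single conditional affine layer so that the log-likelihood and its gradient can be written by hand. A real policy would presumably stack many invertible layers, condition on visual features, and obtain gradients by automatic differentiation.

```python
import numpy as np


class ToyConditionalAffineFlow:
    """Hypothetical one-layer conditional flow: a = mu(o) + exp(s(o)) * z, z ~ N(0, I)."""

    def __init__(self, obs_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Assumption: shift and log-scale are linear in the observation.
        self.W_mu = rng.normal(scale=0.1, size=(act_dim, obs_dim))
        self.W_s = rng.normal(scale=0.1, size=(act_dim, obs_dim))
        self.act_dim = act_dim

    def _params(self, obs):
        return self.W_mu @ obs, self.W_s @ obs  # shift mu, log-scale s

    def sample(self, obs, n, rng):
        """Single-pass sampling: draw latent noise and push it through the flow once."""
        mu, s = self._params(obs)
        z = rng.standard_normal((n, self.act_dim))
        return mu + np.exp(s) * z

    def log_prob(self, actions, obs):
        """Exact log-density via change of variables: log p_z(z) - sum(s)."""
        mu, s = self._params(obs)
        z = (actions - mu) * np.exp(-s)
        log_base = -0.5 * np.sum(z**2, axis=-1) - 0.5 * self.act_dim * np.log(2 * np.pi)
        return log_base - np.sum(s)

    def grad_log_prob(self, actions, obs):
        """Analytic gradient of log_prob with respect to the actions."""
        mu, s = self._params(obs)
        return -(actions - mu) * np.exp(-2 * s)


def stochastic_batch_selection(flow, obs, n_candidates, rng):
    """Sample a batch of candidates and keep the one with the highest log-likelihood."""
    candidates = flow.sample(obs, n_candidates, rng)
    log_probs = flow.log_prob(candidates, obs)
    best = int(np.argmax(log_probs))
    return candidates[best], log_probs[best]


def gradient_refinement(flow, action, obs, steps=10, lr=0.1):
    """Take a few gradient-ascent steps on the log-likelihood, starting from `action`."""
    for _ in range(steps):
        action = action + lr * flow.grad_log_prob(action, obs)
    return action


# Usage with made-up dimensions (act_dim=14 is an illustrative choice, not from the paper).
rng = np.random.default_rng(1)
flow = ToyConditionalAffineFlow(obs_dim=4, act_dim=14)
obs = rng.standard_normal(4)
best_action, best_log_prob = stochastic_batch_selection(flow, obs, n_candidates=32, rng=rng)
refined_action = gradient_refinement(flow, best_action, obs)
```

In this toy setting the log-density is a concave quadratic in the action, so small ascent steps cannot lower the likelihood; for a deep flow the step size and number of steps would need tuning, and the sketch makes no claim about the settings used in the paper.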
