DOSE3 : Diffusion-based Out-of-distribution detection on SE(3) trajectories
Hongzhe Cheng, Tianyou Zheng, Tianyi Zhang, Matthew Johnson-Roberson, Weiming Zhi
TL;DR
This paper tackles out-of-distribution (OOD) detection for sequences of rigid-body poses by extending diffusion models to the non-Euclidean manifold $\mathbb{SE}(3)$. It introduces DOSE3, a unified diffusion-based framework that diffuses both translations in $\mathbb{R}^3$ and rotations in $\mathbb{SO}(3)$, using manifold-aware forward and reverse processes and a diffusion estimator $\boldsymbol{\epsilon}_\theta$ to derive OOD statistics without retraining. A dedicated SE(3) diffusion UNet with 1D temporal convolutions, attention, and residual connections models the pose trajectories. OOD detection uses a 24-dimensional diffusion-based statistic: a density is estimated on the inlier statistics, and the 5th percentile of the inlier densities serves as the decision threshold. Experiments on Oxford, KITTI, and IROS20 report near-perfect AUROC across ID/OOD combinations; rotation information proves particularly discriminative, and results are robust to variations in sequence length and number of diffusion steps, which makes the method relevant to robotics and autonomous systems.
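The decision rule in the TL;DR (density estimation on inlier statistics, then a 5th-percentile threshold) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the summary does not say which density estimator is used, so a single multivariate Gaussian is assumed here, and the 24-dimensional statistics are replaced by synthetic random vectors. All function names are hypothetical.

```python
import numpy as np

def fit_gaussian(stats):
    """Fit a mean and a lightly regularised covariance to statistics of shape (n, d)."""
    mean = stats.mean(axis=0)
    cov = np.cov(stats, rowvar=False) + 1e-6 * np.eye(stats.shape[1])
    return mean, cov

def log_density(stats, mean, cov):
    """Gaussian log-density of each row of `stats` (assumed estimator, see lead-in)."""
    d = stats.shape[1]
    diff = stats - mean
    solved = np.linalg.solve(cov, diff.T).T           # cov^{-1} (x - mean), row-wise
    maha = np.einsum("ij,ij->i", diff, solved)        # squared Mahalanobis distance
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2.0 * np.pi))

def fit_threshold(inlier_stats, percentile=5.0):
    """Fit the density on inliers and take a low percentile of their log-densities."""
    mean, cov = fit_gaussian(inlier_stats)
    threshold = np.percentile(log_density(inlier_stats, mean, cov), percentile)
    return mean, cov, threshold

def is_ood(stats, mean, cov, threshold):
    """Flag samples whose log-density falls below the inlier threshold."""
    return log_density(stats, mean, cov) < threshold

# Synthetic stand-ins for the 24-dimensional diffusion statistics.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, 1.0, size=(2000, 24))
shifted = rng.normal(3.0, 1.0, size=(200, 24))

mean, cov, threshold = fit_threshold(inliers)
inlier_flag_rate = is_ood(inliers, mean, cov, threshold).mean()
shifted_flag_rate = is_ood(shifted, mean, cov, threshold).mean()
```

By construction, roughly 5% of the inliers fall below the threshold, so the percentile directly sets the false-positive rate on the inlier distribution.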
Abstract
Out-of-distribution (OOD) detection, a fundamental machine learning task aimed at identifying abnormal samples, traditionally requires model retraining for different inlier distributions. While recent research demonstrates the applicability of diffusion models to OOD detection, existing approaches are limited to Euclidean or latent image spaces. Our work extends OOD detection to trajectories in the Special Euclidean Group in 3D ($\mathbb{SE}(3)$), addressing a critical need in computer vision, robotics, and engineering applications that process object pose sequences in $\mathbb{SE}(3)$. We present $\textbf{D}$iffusion-based $\textbf{O}$ut-of-distribution detection on $\mathbb{SE}(3)$ ($\mathbf{DOSE3}$), a novel OOD framework that extends diffusion to a unified sample space of $\mathbb{SE}(3)$ pose sequences. Through extensive validation on multiple benchmark datasets, we demonstrate $\mathbf{DOSE3}$'s superior performance compared to state-of-the-art OOD detection frameworks.
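To illustrate what noising a pose sequence in $\mathbb{SE}(3)$ involves, the sketch below perturbs translations with additive Gaussian noise in $\mathbb{R}^3$ and rotations by composing each one with the exponential map of a small random tangent vector, so the result stays on $\mathbb{SO}(3)$. This is an assumed, simplified perturbation for illustration only: the abstract does not specify the forward process or noise schedule that DOSE3 uses, and the names `so3_exp` and `perturb_trajectory` are hypothetical.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def so3_exp(w):
    """Rodrigues' formula: exponential map from a tangent vector to a rotation matrix."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        return np.eye(3) + K                          # first-order approximation near zero
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def perturb_trajectory(R, t, sigma_rot, sigma_trans, rng):
    """Noise a pose sequence; R has shape (T, 3, 3) and t has shape (T, 3)."""
    noisy_t = t + sigma_trans * rng.normal(size=t.shape)          # Euclidean noise
    noisy_R = np.stack([Ri @ so3_exp(sigma_rot * rng.normal(size=3))
                        for Ri in R])                             # stays on SO(3)
    return noisy_R, noisy_t

# Example: an 8-step trajectory of identity poses.
rng = np.random.default_rng(0)
R = np.tile(np.eye(3), (8, 1, 1))
t = np.zeros((8, 3))
noisy_R, noisy_t = perturb_trajectory(R, t, sigma_rot=0.3, sigma_trans=0.1, rng=rng)
```

Adding Gaussian noise directly to the entries of a rotation matrix would produce matrices that are no longer rotations, which is why the rotation part is perturbed through the exponential map instead.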
