Markerless Tracking-Based Registration for Medical Image Motion Correction
Luisa Neubig, Deirdre Larsen, Takeshi Ikuma, Markus Kopp, Melda Kunduk, Andreas M. Kist
TL;DR
This paper tackles head-motion interference in videofluoroscopic swallow studies (VFSS) by introducing a motion-correction pipeline based on markerless tracking. The pipeline derives velocity fields from tracked point trajectories to suppress unwanted motion while preserving swallow dynamics. The authors systematically compare markerless trackers against classical and deep optical-flow methods, and against established image-registration methods. They find that a CoTracker-based approach with a dense-grid velocity field yields superior structural similarity and competitive error metrics. The study shows that small, trajectory-based velocity fields achieve effective motion correction and generalize across hospitals, outperforming LDDMM, ANTs, and VoxelMorph on motion-dominated data while preserving anatomical fidelity. The work suggests practical benefits for quantitative swallowing analysis and points to possible enhancements, such as pre-processing (e.g., denoising) or auxiliary segmentation, to further improve tracking quality and registration robustness.
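To make the idea of turning a handful of tracked points into a dense correction field concrete, here is a minimal sketch. It is not the authors' pipeline: it assumes the tracker has already returned matching point positions in a reference frame and a current frame, it uses thin-plate-spline interpolation from SciPy as a stand-in for whatever field construction the paper uses, and the function names `dense_displacement` and `warp_to_reference` are hypothetical.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.ndimage import map_coordinates


def dense_displacement(ref_pts, cur_pts, shape):
    """Interpolate sparse point displacements (cur - ref) onto every pixel.

    ref_pts, cur_pts: (N, 2) arrays of (row, col) positions of the same
    tracked points in the reference frame and in the current frame.
    Returns an array of shape (2, H, W) holding (d_row, d_col) per pixel.
    """
    disp = cur_pts - ref_pts
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    grid = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    # Assumption: thin-plate splines as the interpolant between tracked points.
    interp = RBFInterpolator(ref_pts, disp, kernel="thin_plate_spline")
    field = interp(grid)  # shape (H*W, 2)
    return field.T.reshape(2, *shape)


def warp_to_reference(cur_img, field):
    """Backward-warp the current frame so it aligns with the reference frame."""
    rows, cols = np.mgrid[0:cur_img.shape[0], 0:cur_img.shape[1]]
    # Each reference pixel p is sampled from the current frame at p + d(p).
    coords = np.array([rows + field[0], cols + field[1]])
    return map_coordinates(cur_img, coords, order=1, mode="nearest")


# Toy example: a Gaussian blob shifted rigidly by (2, 3) pixels,
# with five hypothetical tracked points that follow the same shift.
yy, xx = np.mgrid[0:32, 0:32]
ref = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / 32.0)
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))
ref_pts = np.array([[8.0, 8.0], [8.0, 24.0], [24.0, 8.0],
                    [24.0, 24.0], [16.0, 16.0]])
cur_pts = ref_pts + np.array([2.0, 3.0])

field = dense_displacement(ref_pts, cur_pts, ref.shape)
corrected = warp_to_reference(cur, field)
```

In this toy case the motion is a pure translation, so the interpolated field is constant and the warped frame matches the reference away from the image border. In real recordings, the choice of which points to track decides which motion is removed: points on structures that move only with the head would yield a field that cancels head motion while leaving swallow-related motion in place.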
Abstract
Our study focuses on isolating swallowing dynamics from interfering patient motion in videofluoroscopy, an X-ray technique that records patients swallowing a radiopaque bolus. These recordings capture multiple motion sources, including head movement, anatomical displacements, and bolus transit. To enable precise analysis of swallowing physiology, we aim to eliminate distracting motion, particularly head movement, while preserving essential swallowing-related dynamics. Optical flow methods fail on these recordings because of artifacts such as flickering and instability, which makes them unreliable for distinguishing different motion groups. We evaluated markerless tracking approaches (CoTracker, PIPs++, TAP-Net) and quantified their tracking accuracy in key medical regions of interest. Our findings show that even sparse tracking points generate morphing displacement fields that outperform leading registration methods such as ANTs, LDDMM, and VoxelMorph. To compare all approaches, we assessed post-registration performance using mean squared error (MSE) and the structural similarity index (SSIM). We introduce a novel motion correction pipeline that effectively removes disruptive motion while preserving swallowing dynamics, and that surpasses competing registration techniques.
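As an illustration of the two evaluation metrics named above, the following sketch computes MSE and a simplified SSIM between a reference frame and a registered frame. It assumes grayscale images scaled to [0, 1] and uses a uniform 7x7 window with the standard constants K1 = 0.01 and K2 = 0.03. The exact SSIM settings used in the paper are not stated here, so values from this sketch are not directly comparable with the reported results.

```python
import numpy as np
from scipy.ndimage import uniform_filter


def mse(a, b):
    """Mean squared error between two images of equal shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def ssim(a, b, data_range=1.0, win=7):
    """Mean SSIM using a uniform local window (a simplified variant)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    # Local means, variances, and covariance via box filtering.
    mu_a = uniform_filter(a, win)
    mu_b = uniform_filter(b, win)
    var_a = uniform_filter(a * a, win) - mu_a ** 2
    var_b = uniform_filter(b * b, win) - mu_b ** 2
    cov = uniform_filter(a * b, win) - mu_a * mu_b
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(np.mean(num / den))


# Toy example: compare a random frame with itself and with a shifted copy,
# where the shift stands in for residual, uncorrected motion.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
shifted = np.roll(frame, 3, axis=1)

mse_same, ssim_same = mse(frame, frame), ssim(frame, frame)
mse_moved, ssim_moved = mse(frame, shifted), ssim(frame, shifted)
```

Lower MSE and higher SSIM against the reference frame indicate better alignment. MSE is sensitive to intensity differences at every pixel, while SSIM compares local structure, which is why the two metrics can rank methods differently.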
