Markerless Tracking-Based Registration for Medical Image Motion Correction

Luisa Neubig, Deirdre Larsen, Takeshi Ikuma, Markus Kopp, Melda Kunduk, Andreas M. Kist

TL;DR

This paper tackles head-motion interference in videofluoroscopic swallow studies (VFSS) by introducing a markerless-tracking–based motion-correction pipeline that derives velocity fields to suppress unwanted motion while preserving swallow dynamics. It systematically compares markerless trackers with classic and deep optical-flow methods and against established image-registration methods, finding that a CoTracker-based, dense-grid velocity field approach yields superior structural similarity and competitive error metrics. The study demonstrates that small, trajectory-based velocity fields can achieve effective motion correction and generalize across hospitals, outperforming LDDMM, ANTs, and VoxelMorph on motion-dominated data and preserving anatomical fidelity. The work suggests practical benefits for quantitative swallowing analysis and points to enhancements via pre-processing (e.g., denoising) or auxiliary segmentation to further boost tracking quality and registration robustness.

Abstract

Our study focuses on isolating swallowing dynamics from interfering patient motion in videofluoroscopy, an X-ray technique that records patients swallowing a radiopaque bolus. These recordings capture multiple motion sources, including head movement, anatomical displacements, and bolus transit. To enable precise analysis of swallowing physiology, we aim to eliminate distracting motion, particularly head movement, while preserving essential swallowing-related dynamics. Optical flow methods fail due to artifacts like flickering and instability, making them unreliable for distinguishing different motion groups. We evaluated markerless tracking approaches (CoTracker, PIPs++, TAP-Net) and quantified tracking accuracy in key medical regions of interest. Our findings show that even sparse tracking points generate morphing displacement fields that outperform leading registration methods such as ANTs, LDDMM, and VoxelMorph. To compare all approaches, we assessed performance using MSE and SSIM metrics post-registration. We introduce a novel motion correction pipeline that effectively removes disruptive motion while preserving swallowing dynamics and surpassing competitive registration techniques.
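
The two post-registration metrics named in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code. It assumes grayscale frames flattened to equal-length sequences of pixel values, and it computes SSIM over a single global window instead of the sliding local windows that common SSIM implementations use. The function names `mse` and `global_ssim` and the `data_range` parameter are illustrative; the constants 0.01 and 0.03 are the usual defaults from the standard SSIM definition.

```python
from statistics import fmean


def mse(reference, registered):
    """Mean squared error between two equally sized flat pixel sequences."""
    if len(reference) != len(registered):
        raise ValueError("frames must contain the same number of pixels")
    return fmean((r - m) ** 2 for r, m in zip(reference, registered))


def global_ssim(reference, registered, data_range=255.0):
    """SSIM over one global window (a simplification; no sliding window)."""
    if len(reference) != len(registered):
        raise ValueError("frames must contain the same number of pixels")
    # Stabilizing constants from the standard SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_r, mu_m = fmean(reference), fmean(registered)
    var_r = fmean((r - mu_r) ** 2 for r in reference)
    var_m = fmean((m - mu_m) ** 2 for m in registered)
    cov = fmean((r - mu_r) * (m - mu_m) for r, m in zip(reference, registered))
    numerator = (2 * mu_r * mu_m + c1) * (2 * cov + c2)
    denominator = (mu_r ** 2 + mu_m ** 2 + c1) * (var_r + var_m + c2)
    return numerator / denominator


# Hypothetical 2x2 frames, flattened row by row.
first_frame = [10, 20, 30, 40]
morphed_last_frame = [12, 19, 33, 38]
print(mse(first_frame, morphed_last_frame))
print(global_ssim(first_frame, morphed_last_frame))
```

Lower MSE and higher SSIM between the reference frame and the morphed frame indicate that more of the unwanted motion has been removed. A windowed SSIM would additionally capture local structural differences that this global variant averages away.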

Paper Structure

This paper contains 16 sections, 6 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: VFSS recording motion sources and their estimation. a) A VFSS frame in which blue circles mark the static background, yellow arrows indicate the moving head, and the bolus with its direction of movement is highlighted in purple. b) Normalized summed flow fields over the same recording as in a) for deep learning-based optical flow methods (UnFlow, SPyNet, Flowformer, RAFT, and PWC-Net), the Farneback and Lucas-Kanade algorithms, and one tracking-based algorithm (CoTracker).
  • Figure 2: Markerless tracking reliably recovers moving parts. a) Mean absolute percentage error (MAPE) of the tracking algorithms for three landmarks. b) Characteristics of the point tracks, shown as individual trajectories. The magenta-blue color gradient encodes the transition from $t=0$ to $t=T$. The upper and lower panels in a) and b) show the results for images with and without unwanted motion, respectively.
  • Figure 3: Analysis of the influence of the grid size on the MSE and SSIM per video for the three evaluated tracking algorithms CoTracker, PIPs++, and TAP-Net.
  • Figure 4: Visual comparison of the last image of the morphed sequence (top row) and the difference between the last and first images (bottom row) for CoTracker, ANTs, LDDMM, and VoxelMorph. Difference-image color scale: white (positive) > gray (zero) > black (negative).