Toward Accurate and Accessible Markerless Neuronavigation

Ziye Xie, Oded Schlesinger, Raj Kundu, Jessica Y. Choi, Pablo Iturralde, Dennis A. Turner, Stefan M. Goetz, Guillermo Sapiro, Angel V. Peterchev, J. Matias Di Martino

Abstract

Neuronavigation is widely used in biomedical research and interventions to guide the precise placement of instruments around the head to support procedures such as transcranial magnetic stimulation. Traditional systems, however, rely on subject-mounted markers that require manual registration, may shift during procedures, and can cause discomfort. We introduce and evaluate markerless approaches that replace expensive hardware and physical markers with low-cost visible and infrared light cameras incorporating stereo and depth sensing combined with algorithmic modeling of the facial geometry. Validation with $50$ human subjects yielded a median tracking discrepancy of only $2.32$ mm and $2.01°$ for the best markerless algorithms compared to a conventional marker-based system, which indicates sufficient accuracy for transcranial magnetic stimulation and a substantial improvement over prior markerless results. The results suggest that integration of the data from the various camera sensors can improve the overall accuracy further. The proposed markerless neuronavigation methods can reduce setup cost and complexity, improve patient comfort, and expand access to neuronavigation in clinical and research settings.

Paper Structure

This paper contains 35 sections, 10 equations, 9 figures, and 7 tables.

Figures (9)

  • Figure 1: Illustration of conventional marker-based and proposed markerless neuronavigation setups. (a) A subject wearing a tracker with retroreflective markers used by commercial neuronavigation systems. (b) Experimental apparatus developed in this work, consisting of two Azure Kinect DK cameras that capture color and depth data for markerless tracking (bottom) and, for comparison, a standard NDI Polaris Vicra stereotaxy camera tracking the retroreflective markers (top).
  • Figure 2: The markerless tracking system comprises a pair of Azure Kinect DK devices. The devices are synchronized and spatially calibrated (thick horizontal black line). Each device contains an RGB and a depth camera (labeled as $S_{RGB}$ and $S_{depth}$, respectively), operating simultaneously (thin horizontal black lines). We implemented and compared three markerless neuronavigation tracking strategies: Monocular RGB (Section \ref{subsubsec:rgb_mono_method}), Stereo RGB (Section \ref{subsubsec:rgb_stereo_method}), and Depth (Section \ref{subsubsec:depth_tracking}). We compared these techniques with and without the use of statistical head priors (Section \ref{subsubsec:head_modeling}), leading to a final set of six tracking alternatives. The illustration of the statistical head models is adapted from Ploumpis et al. \cite{ploumpis2020towards}.
  • Figure 3: Notation: we adopt a pinhole camera model to represent the geometry of the captured data in the camera and reference coordinate frames, denoted as $S_{cam}$ and $S_{ref}$, respectively (see (a)). 3D transformations between frames are represented as ${}_{destination}T_{source}:=\ S_{source}\rightarrow S_{destination}$. Each Azure device has two sensors and associated coordinate frames (RGB and depth) (see (b)). In addition, the head keypoint will be represented in a coordinate frame fixed to the head ($S_{head}$). External data collected for validation, e.g., with a commercial standard NDI system (see Section \ref{subsec:experimental_setup} for details), has its own reference frame ($S_{NDI}$). Subindices will be used to denote the reference frame in which points, sets of points, and point clouds are expressed. See Section \ref{subsec:notation} for details.
  • Figure 4: (a) Sample RGB input frame from the reference Azure camera (left camera), (b) set of MediaPipe facial landmarks detected in the frame, and (c) subset of landmarks used for tracking (for details on landmark selection and ablation studies, see \ref{supp:subsec:rgb_ablation_study}).
  • Figure 5: Creation of the 3D facial template for point cloud tracking. (a) Sample RGB input frame. (b) MediaPipe landmarks detected in the RGB image. (c) Convex hull of the landmarks used to define the facial region of interest. (d) Color information is mapped into the 3D reference frame to select the point cloud region of interest and to locate 3D landmarks. (e) Several point clouds obtained during a scanning session are aligned and merged to define the facial 3D template (shape). (f) Reference facial 3D template.
  • ...and 4 more figures
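
The notation in the Figure 3 caption can be made concrete with a small numerical sketch. The Python snippet below is illustrative only and is not taken from the paper: it composes two rigid transforms in the ${}_{destination}T_{source}$ convention, projects a point with a pinhole model, and computes a translation and rotation discrepancy between two poses. All function names, intrinsics (`fx`, `fy`, `cx`, `cy`), and pose values are hypothetical, and the discrepancy definition used here (Euclidean distance between origins, angle of the relative rotation) is one common choice that may differ from the metric used by the authors.

```python
import numpy as np

def make_transform(R, t):
    """Build a 4x4 homogeneous transform from a 3x3 rotation and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def rot_z(deg):
    """Rotation matrix for a rotation of `deg` degrees about the z axis."""
    a = np.deg2rad(deg)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def apply_transform(T, points):
    """Map an (N, 3) array of points through a 4x4 homogeneous transform."""
    return points @ T[:3, :3].T + T[:3, 3]

def project_pinhole(points_cam, fx, fy, cx, cy):
    """Project (N, 3) camera-frame points with z > 0 to pixel coordinates."""
    x, y, z = points_cam.T
    return np.stack([fx * x / z + cx, fy * y / z + cy], axis=1)

def pose_discrepancy(T_a, T_b):
    """Return (translation distance, rotation angle in degrees) between two poses."""
    dist = np.linalg.norm(T_a[:3, 3] - T_b[:3, 3])
    R_rel = T_a[:3, :3].T @ T_b[:3, :3]
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return dist, np.degrees(np.arccos(cos_angle))

# Hypothetical poses, in millimetres: head in camera frame, camera in reference frame.
cam_T_head = make_transform(rot_z(0.0), [0.0, 0.0, 500.0])
ref_T_cam = make_transform(rot_z(90.0), [100.0, 0.0, 0.0])

# Chaining: S_head -> S_cam -> S_ref (the inner frame labels cancel).
ref_T_head = ref_T_cam @ cam_T_head

p_head = np.array([[10.0, 0.0, 0.0]])          # a keypoint expressed in S_head
p_cam = apply_transform(cam_T_head, p_head)    # same keypoint in S_cam
p_ref = apply_transform(ref_T_head, p_head)    # same keypoint in S_ref

# Pinhole projection with made-up intrinsics.
pixels = project_pinhole(p_cam, fx=600.0, fy=600.0, cx=320.0, cy=240.0)

# Compare the pose against a second estimate offset by 2 mm and 2 degrees.
ref_T_head_other = make_transform(rot_z(92.0), [102.0, 0.0, 500.0])
d_mm, d_deg = pose_discrepancy(ref_T_head, ref_T_head_other)
```

Writing each transform as destination-from-source makes chaining easy to check by eye: in `ref_T_cam @ cam_T_head` the adjacent `cam` labels cancel, leaving a transform from $S_{head}$ to $S_{ref}$.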