Robotic Eye-in-hand Visual Servo Axially Aligning Nasopharyngeal Swabs with the Nasal Cavity

Peter Q. Lee, John S. Zelek, Katja Mombaur

TL;DR

This work tackles pre-contact alignment of nasopharyngeal (NP) swabs for autonomous robotic sampling of unrestrained patients. It presents an eye-in-hand visual servo pipeline that combines a joint lookup table for rapid IK initialization, 3D face pose estimation via 3DDFA_V2, state estimation with an unscented Kalman filter on manifolds (UKF-M), and pose-based visual servoing (PBVS) to guide the swab tip to the nostril. In trials with 25 participants, the system reached the nostril for 84% of them and showed no significant demographic biases within the sample; some failures were linked to nostril detection accuracy and bounding box initialization. The study highlights the importance of UKF-M for noise attenuation and discusses practical improvements, such as a dedicated nostril detector and a possible multi-camera setup to support the contact phase.

Abstract

The nasopharyngeal (NP) swab test is a method for collecting cultures to diagnose different types of respiratory illnesses, including COVID-19. Delegating this task to robots would be beneficial in terms of reducing infection risks and bolstering the healthcare system, but a critical component of the NP swab test is aligning the swab properly with the nasal cavity so that it does not cause excessive discomfort or injury by traveling down the wrong passage. Existing research on robotic NP swabbing typically assumes the patient's head is held within a fixture. This simplifies the alignment problem, but differs from clinical scenarios, where patients are typically free-standing. Consequently, our work creates a vision-guided pipeline that allows an instrumented robot arm to properly position and orient NP swabs with respect to the nostrils of free-standing patients. The first component of the pipeline is a precomputed joint lookup table that allows the arm to meet the patient's arbitrary position in the designated workspace while avoiding joint limits. Our pipeline leverages semantic face models from computer vision to estimate the Euclidean pose of the face with respect to a monocular RGB-D camera placed on the end-effector. These estimates are passed into an unscented Kalman filter on manifolds (UKF-M) state estimator and a pose-based visual servo (PBVS) control loop that moves the swab to the designated pose in front of the nostril. Our pipeline was validated with human trials featuring a cohort of 25 participants. The system is effective, reaching the nostril for 84% of participants, and our statistical analysis did not find significant demographic biases within the cohort.
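
To make the pose-based visual servo step concrete, the following is a minimal sketch of a textbook PBVS velocity law, not the authors' implementation. It assumes the filtered pose error is available as a rotation `R_err` and translation `t_err` of the current tool frame expressed in the desired tool frame; the function names, the gain value, and this error convention are all assumptions made for illustration.

```python
import numpy as np


def rotation_to_axis_angle(R):
    """Return the axis-angle vector theta*u of rotation matrix R.

    Illustrative helper (assumption): not robust when theta is close to pi.
    """
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-9:
        return np.zeros(3)
    axis = np.array([
        R[2, 1] - R[1, 2],
        R[0, 2] - R[2, 0],
        R[1, 0] - R[0, 1],
    ]) / (2.0 * np.sin(theta))
    return theta * axis


def pbvs_twist(R_err, t_err, gain=0.5):
    """Textbook PBVS law (assumed form, gain chosen arbitrarily).

    R_err, t_err: pose of the current tool frame in the desired tool frame.
    Returns linear and angular velocity commands in the current tool frame;
    both decay to zero as the pose error vanishes.
    """
    v = -gain * R_err.T @ np.asarray(t_err, dtype=float)
    w = -gain * rotation_to_axis_angle(R_err)
    return v, w


if __name__ == "__main__":
    # Tool is 10 cm off along x and rotated 0.2 rad about z from the goal.
    c, s = np.cos(0.2), np.sin(0.2)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    v, w = pbvs_twist(R, [0.1, 0.0, 0.0])
    print("linear velocity:", v)
    print("angular velocity:", w)
```

In a loop of this kind, the pose fed to `pbvs_twist` would come from the state estimator rather than from raw per-frame face pose estimates, which is where the noise attenuation attributed to the UKF-M matters.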

Paper Structure (18 sections, 14 equations, 13 figures, 1 table)

This paper contains 18 sections, 14 equations, 13 figures, 1 table.

Figures (13)

  • Figure 1: Robotic arm and attached end-effector, equipped with a camera and an electromagnetic mechanism for grasping an NP swab.
  • Figure 2: Overview of the stages of motion during the pre-contact phase of the NP swab test.
  • Figure 3: Flowchart of the components used throughout the three stages of the pre-contact phase.
  • Figure 4: Joint lookup table that spans the arm's workspace cone $\mathcal{C}_{\text{start}}$ (\ref{eq:cone_start}) that represents different possible positions a patient's face can be located at. The color represents the achieved quality of the joint configuration stored in the lookup table. The joint positions for green points can reach a high number of points in the destination $\mathcal{C}_{\text{end}}$, while red points reach a low number of points or are infeasible to reach.
  • Figure 5: Illustration of the cone $\mathcal{C}_{\text{end}}$ (\ref{eq:cone_end}) in green extended from a starting point $\bm{\epsilon}_{\text{start}}$ (red crosses). During generation of the lookup table, candidate joint configurations are moved to points discretely sampled within this cone (blue dots) to determine how well the arm could move around the face from the starting configuration.
  • ...and 8 more figures
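
The discrete sampling inside a cone described in the Figure 5 caption can be sketched roughly as follows. The paper's actual cone definitions (`eq:cone_start`, `eq:cone_end`) are not reproduced in this summary, so the parametrization here (apex, unit axis, half-angle, maximum range) and the function name `sample_cone` are assumptions for illustration only.

```python
import numpy as np


def sample_cone(apex, axis, half_angle, max_range,
                n_range=3, n_polar=3, n_azimuth=8):
    """Return points on a regular grid inside a right circular cone.

    Hypothetical parametrization: the cone starts at `apex`, opens along
    `axis` with the given half-angle (radians), and extends to `max_range`.
    """
    apex = np.asarray(apex, dtype=float)
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)

    # Build two unit vectors perpendicular to the axis.
    helper = np.array([1.0, 0.0, 0.0]) if abs(axis[0]) < 0.9 \
        else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(axis, helper)
    e1 /= np.linalg.norm(e1)
    e2 = np.cross(axis, e1)

    points = []
    for r in np.linspace(max_range / n_range, max_range, n_range):
        points.append(apex + r * axis)  # sample on the cone axis
        for phi in np.linspace(half_angle / n_polar, half_angle, n_polar):
            for psi in np.linspace(0.0, 2.0 * np.pi, n_azimuth,
                                   endpoint=False):
                direction = (np.cos(phi) * axis
                             + np.sin(phi) * (np.cos(psi) * e1
                                              + np.sin(psi) * e2))
                points.append(apex + r * direction)
    return np.array(points)


if __name__ == "__main__":
    pts = sample_cone(apex=[0.0, 0.0, 0.0], axis=[0.0, 0.0, 1.0],
                      half_angle=np.deg2rad(30.0), max_range=0.15)
    print(pts.shape)
```

When building a lookup table along these lines, each candidate joint configuration could be scored by how many of the sampled points it can reach, which matches the green-to-red quality coloring described for Figure 4; the scoring rule itself is not specified in this summary.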