Self-supervised 3D Patient Modeling with Multi-modal Attentive Fusion

Meng Zheng, Benjamin Planche, Xuan Gong, Fan Yang, Terrence Chen, Ziyan Wu

TL;DR

A generic modularized 3D patient modeling method which achieves superior patient positioning performance across various imaging modalities in real clinical scenarios, and a self-supervised 3D mesh regression module which does not require expensive 3D mesh parameter annotations to train, bringing immediate cost benefits for clinical deployment.

Abstract

3D patient body modeling is critical to the success of automated patient positioning for smart medical scanning and operating rooms. Existing CNN-based end-to-end patient modeling solutions typically require (a) customized network designs demanding large amounts of relevant training data covering extensive realistic clinical scenarios (e.g., patients covered by sheets), which leads to suboptimal generalizability in practical deployment, and (b) expensive 3D human model annotations requiring a huge amount of manual effort, resulting in systems that scale poorly. To address these issues, we propose a generic modularized 3D patient modeling method consisting of (a) a multi-modal keypoint detection module with attentive fusion for 2D patient joint localization, which learns complementary cross-modality patient body information, leading to improved keypoint localization robustness and generalizability in a wide variety of imaging (e.g., CT, MRI, etc.) and clinical scenarios (e.g., heavy occlusions); and (b) a self-supervised 3D mesh regression module which does not require expensive 3D mesh parameter annotations to train, bringing immediate cost benefits for clinical deployment. We demonstrate the efficacy of the proposed method through extensive patient positioning experiments on both public and clinical data. Our evaluation results show superior patient positioning performance across various imaging modalities in real clinical scenarios.
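The abstract does not specify how the attentive fusion is computed, so the following is only a generic illustration of the idea of attention-weighted fusion across modalities, not the authors' module. It assumes each modality (e.g., RGB and depth) yields a feature vector of the same length plus one scalar attention logit; the names `softmax` and `attentive_fuse` and the choice of one scalar logit per modality are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fuse(features, scores):
    """Fuse per-modality feature vectors with softmax attention weights.

    features: list of equal-length lists of floats, one per modality
              (hypothetical stand-ins for RGB / depth features).
    scores:   one attention logit per modality (assumed to come from
              some learned scoring function, not modeled here).
    Returns (fused_vector, attention_weights).
    """
    if len(features) != len(scores):
        raise ValueError("need exactly one score per modality")
    weights = softmax(scores)
    dim = len(features[0])
    fused = [0.0] * dim
    for w, feat in zip(weights, features):
        if len(feat) != dim:
            raise ValueError("all modalities must share the feature size")
        for i, v in enumerate(feat):
            fused[i] += w * v
    return fused, weights


if __name__ == "__main__":
    rgb = [1.0, 0.0]
    depth = [0.0, 1.0]
    # Equal logits: both modalities contribute equally.
    print(attentive_fuse([rgb, depth], [0.0, 0.0]))
    # A much larger logit for depth (e.g., RGB occluded by a sheet)
    # shifts the fused feature toward the depth modality.
    print(attentive_fuse([rgb, depth], [-5.0, 5.0]))
```

In this sketch the weights always sum to one, so a modality that is judged unreliable in a given scene is down-weighted rather than discarded outright.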

Paper Structure (7 sections, 5 figures, 6 tables)


Figures (5)

  • Figure 1: (a) Mesh representation of a patient in an MRI scanning room. (b) Failure cases of a state-of-the-art mesh regressor (SPIN [kolotouros2019spin]) in challenging clinical scenarios, e.g., a simulated hospital environment [SLP_2019] and MRI and CT scanning rooms.
  • Figure 2: Proposed framework to localize 2D keypoints and infer the 3D mesh.
  • Figure 3: Proposed RGBD keypoint detection framework with attention fusion.
  • Figure 4: Performance comparison between proposed RGB and RGBD model.
  • Figure 5: Visualization of reconstructed mesh results on CT, MI, MRI and SLP.