Shap-MeD

Nicolás Laverde, Melissa Robles, Johan Rodríguez

TL;DR

Shap-MeD addresses rapid text- or image-conditioned 3D modeling of biomedical objects by fine-tuning OpenAI’s Shap-e on biomedical meshes from MedShapeNet, reducing latent-space MSE on the evaluation set to $0.089$, compared with Shap-e’s $0.147$. The approach uses implicit-function representations (NeRF/STF) and diffusion-based conditioning to produce structurally accurate anatomical models, with a quantitative gain over Shap-e and qualitative comparisons in which it outperforms several baselines. The model is deployed as a Streamlit app, giving an accessible tool for surgical planning, education, and prosthetic design that could speed up research prototyping and personalized treatment. The authors acknowledge data quality issues and suggest exploring larger models (LRM/LGM) and improved biomedical datasets to further raise performance and reliability.

Abstract

We present Shap-MeD, a text-to-3D object generative model specialized in the biomedical domain. The objective of this study is to develop an assistant that facilitates the 3D modeling of medical objects, thereby reducing development time. 3D modeling in medicine has various applications, including surgical procedure simulation and planning, the design of personalized prosthetic implants, medical education, the creation of anatomical models, and the development of research prototypes. To achieve this, we leverage Shap-e, an open-source text-to-3D generative model developed by OpenAI, and fine-tune it using a dataset of biomedical objects. Our model achieved a mean squared error (MSE) of 0.089 in latent generation on the evaluation set, compared to Shap-e's MSE of 0.147. Additionally, we conducted a qualitative evaluation, comparing our model with others in the generation of biomedical objects. Our results indicate that Shap-MeD demonstrates higher structural accuracy in biomedical object generation.
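To make the reported metric concrete, the following is a minimal sketch of how a latent-space MSE and the relative improvement between the two reported values could be computed. It assumes the error is averaged over all elements of a flattened latent vector, which this page does not specify; the helper names `latent_mse` and `relative_reduction` and the toy vectors are hypothetical and are not taken from the paper or from the Shap-e codebase.

```python
def latent_mse(predicted, target):
    """Mean squared error between two equal-length latent vectors.

    Assumption: latents are flattened and the error is averaged over
    every element. The paper's exact averaging scheme is not given here.
    """
    if len(predicted) != len(target):
        raise ValueError("latent vectors must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline


# Toy latents, purely illustrative (not data from the paper).
toy_target = [0.0, 1.0, 2.0, 3.0]
toy_prediction = [0.5, 1.0, 1.5, 3.0]
toy_error = latent_mse(toy_prediction, toy_target)  # (0.25 + 0 + 0.25 + 0) / 4

# Evaluation-set MSE values reported in the abstract.
reduction = relative_reduction(0.147, 0.089)

print(f"toy MSE: {toy_error:.3f}")
print(f"relative MSE reduction: {reduction:.1%}")
```

Under these assumptions, moving from an MSE of 0.147 to 0.089 corresponds to a relative reduction of roughly 39%.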

Paper Structure

This paper contains 24 sections, 10 equations, 12 figures, 1 table.

Figures (12)

  • Figure 1: Current applications of 3D printing in medicine and healthcare.
  • Figure 2: Transformer-based diffusion architecture for generating a semantically coherent point cloud from an image and a normally distributed noisy point cloud. Obtained from Point-E.
  • Figure 3: Encoder representation of Shap-e. Obtained from Shap-E.
  • Figure 4: Comparison of Shap-e vs. Point-e results. Image obtained from Shap-E.
  • Figure 5: Comparison of results: LGM vs. Shap-e
  • ...and 7 more figures