
Deep Learning for Cancer Prognosis Prediction Using Portrait Photos by StyleGAN Embedding

Amr Hagag, Ahmed Gomaa, Dominik Kornek, Andreas Maier, Rainer Fietkau, Christoph Bert, Florian Putz, Yixing Huang

TL;DR

The study asks whether prognostic information for cancer patients can be extracted from observable facial features in 2D portrait photos. A fine-tuned StyleGAN2 embeds each portrait into its latent space, and CoxPH/DeepSurv models predict survival from these embeddings, optionally combined with clinical data through late fusion. In pan-cancer analyses, the approach achieves a mean $C$-index of $0.680$ and a mean Brier score of $0.200$, outperforming end-to-end CNN baselines and approaching clinical models. Late fusion with clinical data raises the $C$-index to $0.787$, indicating that the photos carry complementary information. The latent space also supports health-attribute manipulations that yield interpretable, prognosis-related facial changes, which makes the model more transparent and potentially useful for patient care and communication.

Abstract

Survival prediction for cancer patients is critical for optimal treatment selection and patient management. Current survival prediction methods typically extract survival information from patients' clinical records or from biological and imaging data. In practice, experienced clinicians can form a preliminary assessment of a patient's health status from observable physical appearance, mainly facial features. However, such an assessment is highly subjective. This work investigates, for the first time, whether deep learning can objectively capture the prognostic information contained in conventional portrait photographs and use it for survival prediction. A pre-trained StyleGAN2 model is fine-tuned on a custom dataset of our cancer patients' photos so that its generator is suited to patients' photos. The StyleGAN2 is then used to embed the photographs into its highly expressive latent space. Using state-of-the-art survival analysis models on StyleGAN's latent space photo embeddings, this approach achieved a C-index of 0.677, which is notably higher than chance and is evidence of the prognostic value embedded in simple 2D facial images. In addition, thanks to StyleGAN's interpretable latent space, our survival prediction model can be validated as relying on essential facial features, which rules out biases from extraneous information such as clothing or background. Moreover, a health attribute is obtained from the regression coefficients, which has important potential value for patient care.
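
For readers unfamiliar with the C-index quoted above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only: the paper does not state which C-index variant or library was used, the function name `concordance_index` is hypothetical, the toy data are invented, and tied survival times are simply treated as not comparable. A value of 0.5 corresponds to chance and 1.0 to perfect ranking.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index (sketch; tied times are treated as not comparable).

    times  -- observed follow-up time per patient
    events -- 1 if death was observed, 0 if the patient was censored
    risks  -- predicted risk score (higher means shorter expected survival)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only usable if the earlier time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier death received the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented for illustration, not from the paper).
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

In this toy example there are five comparable pairs, four of which are ranked correctly, so `toy_c` is 0.8.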

Paper Structure

This paper contains 5 sections, 4 equations, 5 figures, and 1 table.

Figures (5)

  • Figure 1: Exemplary anonymized face images and their corresponding synthetic images generated by StyleGAN without (w/o) and with fine-tuning (FT).
  • Figure 2: The workflow of our proposed facial prognosis method: (a) Optimization of $\boldsymbol{z}$ using StyleGAN generator fine-tuned on custom cancer patient data; (b) CoxPH regression to get $\boldsymbol{w}$ for health attribute construction and latent space manipulation; (c) DeepSurv for survival prediction; (d) Generation of new appearance.
  • Figure 3: A subgroup analysis on 16 cancer types with corresponding single C-indices. The dashed line represents the average C-index of the pan-cancer analysis.
  • Figure 4: Manipulating the health attribute with different $\beta$ values for regenerated patients' images. The synthesized image without manipulation is at $\beta=0$ (third column).
  • Figure 5: Exemplary patients' images generated by manipulating the age attribute.
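
The captions of Figures 2 and 4 describe shifting a latent code along a health attribute derived from the CoxPH coefficients $\boldsymbol{w}$, scaled by $\beta$. The sketch below shows one plausible form of that step in plain Python, under the assumption that the edit is $\boldsymbol{z}' = \boldsymbol{z} + \beta\,\boldsymbol{w}/\lVert\boldsymbol{w}\rVert$. The normalization and the helper names (`unit_direction`, `manipulate_latent`, `log_risk`) are assumptions made for illustration, not the authors' stated implementation.

```python
import math
from typing import List, Sequence


def unit_direction(w: Sequence[float]) -> List[float]:
    """Normalize the CoxPH coefficient vector to unit length (assumed step)."""
    norm = math.sqrt(sum(c * c for c in w))
    if norm == 0.0:
        raise ValueError("coefficient vector must be non-zero")
    return [c / norm for c in w]


def manipulate_latent(z: Sequence[float], w: Sequence[float], beta: float) -> List[float]:
    """Shift latent code z by beta along the (assumed) health direction w/||w||."""
    direction = unit_direction(w)
    return [zi + beta * di for zi, di in zip(z, direction)]


def log_risk(z: Sequence[float], w: Sequence[float]) -> float:
    """Linear CoxPH log-risk score, i.e. the dot product of w and z."""
    return sum(zi * wi for zi, wi in zip(z, w))


# Toy 2-D example (invented values; real StyleGAN latents are much larger).
w_toy = [3.0, 4.0]
z_toy = [1.0, 2.0]
z_same = manipulate_latent(z_toy, w_toy, beta=0.0)     # beta = 0 leaves z unchanged
z_shifted = manipulate_latent(z_toy, w_toy, beta=2.0)  # step of length 2 along w
risk_change = log_risk(z_shifted, w_toy) - log_risk(z_toy, w_toy)
```

Under this assumption, a step of size $\beta$ changes the linear log-risk by $\beta\lVert\boldsymbol{w}\rVert$, which is why $\beta = 0$ reproduces the unmanipulated image and the sign of $\beta$ determines whether the predicted risk rises or falls. Feeding the shifted code to the generator would then produce the manipulated appearance.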