Uncertainty-Aware Remaining Lifespan Prediction from Images

Tristan Kenneweg, Philip Kenneweg, Barbara Hammer

TL;DR

Addresses remaining lifespan prediction from facial and full-body images and assesses how much information such images carry for person-centric health screening. Proposes an uncertainty-aware regression framework that uses pretrained vision transformers with a Gaussian mean–variance head, trained with the Gaussian negative log-likelihood loss $\mathcal{L} = \frac{1}{2N} \sum_i (\log(\sigma_i^2) + \frac{(y_i - \mu_i)^2}{\sigma_i^2})$, so that the model estimates both the mean remaining lifespan $\mu$ and its uncertainty $\sigma$. Demonstrates state-of-the-art MAEs of $4.91$ years on Faces, $4.99$ years on Whole Images, and $7.41$ years on the Legacy dataset, with a bucketed expected calibration error of $0.82$ years on the Faces data, and releases code and curated datasets to enable replication and further research. The work highlights the potential to extract medically relevant signals from images and emphasizes uncertainty calibration as a key component of responsible, scalable health screening research.
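
To make the loss above concrete, the following is a minimal sketch of the Gaussian negative log-likelihood in plain Python. It is not the authors' released code: the function name `gaussian_nll`, the list-based inputs, and the `eps` variance clamp are assumptions made for illustration only.

```python
import math


def gaussian_nll(y, mu, var, eps=1e-6):
    """Gaussian NLL as in the TL;DR formula (constant term dropped):
    1 / (2N) * sum_i( log(var_i) + (y_i - mu_i)^2 / var_i ).

    y   : true remaining lifespans
    mu  : predicted means
    var : predicted variances (sigma_i^2)
    eps : hypothetical lower bound on the variance for numerical stability
    """
    if not y or not (len(y) == len(mu) == len(var)):
        raise ValueError("y, mu and var must be non-empty and equally long")
    total = 0.0
    for y_i, mu_i, var_i in zip(y, mu, var):
        var_i = max(var_i, eps)  # assumption: clamp to avoid log(0) and division by zero
        total += math.log(var_i) + (y_i - mu_i) ** 2 / var_i
    return total / (2 * len(y))


# For a fixed error of 3 years, the loss is smallest when the predicted
# variance matches the squared error (9), and grows if the model is
# overconfident (variance 1) or underconfident (variance 81).
overconfident = gaussian_nll([3.0], [0.0], [1.0])
matched = gaussian_nll([3.0], [0.0], [9.0])
underconfident = gaussian_nll([3.0], [0.0], [81.0])
```

The comparison at the end illustrates why this loss encourages the predicted variance to track the actual squared error: both overconfident and underconfident variances are penalized relative to the matched one.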

Abstract

Predicting mortality-related outcomes from images offers the prospect of accessible, noninvasive, and scalable health screening. We present a method that leverages pretrained vision transformer foundation models to estimate remaining lifespan from facial and whole-body images, alongside robust uncertainty quantification. We show that predictive uncertainty varies systematically with the true remaining lifespan, and that this uncertainty can be effectively modeled by learning a Gaussian distribution for each sample. Our approach achieves state-of-the-art mean absolute error (MAE) of 7.41 years on an established dataset, and further achieves 4.91 and 4.99 years MAE on two new, higher-quality datasets curated and published in this work. Importantly, our models provide calibrated uncertainty estimates, as demonstrated by a bucketed expected calibration error of 0.82 years on the Faces Dataset. While not intended for clinical deployment, these results highlight the potential of extracting medically relevant signals from images. We make all code and datasets available to facilitate further research.

Paper Structure

This paper contains 12 sections, 3 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: Dataset Images
  • Figure 2: Histogram of target values in the newly created datasets.
  • Figure 3: Loss curves from an L1-loss training run during which gradients were passed through all layers of the transformer backbone. Note that while the network is overfitting on the training data, this does not result in degraded performance on the test set.
  • Figure 4: Comparison of binned true and predicted errors on the Faces Dataset. (a) shows the error distribution over the whole remaining lifespan range, with a few extreme outliers cut off. (b) zooms in on the remaining lifespan range of 0 to 20 years, where most data points lie. The numbers above the bars indicate the number of test-set elements in each bucket.
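
The comparison of binned true and predicted errors in Figure 4 suggests one way a bucketed calibration error could be computed. The sketch below is an assumption-laden illustration, not the paper's definition: it assumes buckets over the true remaining lifespan, a hypothetical `bucket_width` of 5 years, a predicted error per sample equal to the expected absolute deviation of a Gaussian ($\sigma \sqrt{2/\pi}$), and a count-weighted average of the per-bucket gaps. The function name `bucketed_calibration_error` is likewise hypothetical.

```python
import math
from collections import defaultdict


def bucketed_calibration_error(y, mu, sigma, bucket_width=5.0):
    """Count-weighted mean gap between true and predicted absolute error per bucket.

    y            : true remaining lifespans (years)
    mu           : predicted means (years)
    sigma        : predicted standard deviations (years)
    bucket_width : hypothetical bucket size in years (assumption)
    """
    if not y or not (len(y) == len(mu) == len(sigma)):
        raise ValueError("y, mu and sigma must be non-empty and equally long")

    buckets = defaultdict(list)
    for y_i, mu_i, sigma_i in zip(y, mu, sigma):
        true_err = abs(y_i - mu_i)
        # Expected |error| of a Gaussian with standard deviation sigma_i.
        pred_err = sigma_i * math.sqrt(2.0 / math.pi)
        # Assumption: samples are bucketed by their true remaining lifespan.
        buckets[math.floor(y_i / bucket_width)].append((true_err, pred_err))

    n = len(y)
    total = 0.0
    for items in buckets.values():
        mean_true = sum(t for t, _ in items) / len(items)
        mean_pred = sum(p for _, p in items) / len(items)
        total += len(items) / n * abs(mean_true - mean_pred)
    return total


# Toy data: two samples fall in the 0-5 year bucket, one in the 10-15 year bucket.
y_true = [2.0, 3.0, 12.0]
y_pred = [4.0, 3.0, 10.0]
scale = math.sqrt(math.pi / 2.0)          # makes the predicted error equal sigma / scale
calibrated = bucketed_calibration_error(y_true, y_pred, [1.0 * scale, 1.0 * scale, 2.0 * scale])
overconfident = bucketed_calibration_error(y_true, y_pred, [0.0, 0.0, 0.0])
```

In the toy data the first set of standard deviations matches the average true error in each bucket, so the metric is essentially zero; predicting zero uncertainty everywhere leaves the full per-bucket error as the calibration gap.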