LucidAtlas: Learning Uncertainty-Aware, Covariate-Disentangled, Individualized Atlas Representations

Yining Jiao, Sreekalyani Bhamidi, Huaizhi Qu, Carlton Zdanski, Julia Kimbell, Andrew Prince, Cameron Worden, Samuel Kirse, Christopher Rutter, Benjamin Shields, William Dunn, Jisan Mahmud, Tianlong Chen, Marc Niethammer

TL;DR

LucidAtlas introduces an uncertainty-aware, covariate-disentangled atlas representation that couples spatial dependencies with neural additive models to capture population trends and variability in medical data. It extends NAMs via a marginalized covariate interpretation framework, enabling robust covariate marginalization and monotonic priors to improve interpretability. The method is validated on pediatric airway geometry and the OASIS brain volumes, demonstrating superior population-trend accuracy, distribution modeling, and individualized prediction capabilities while addressing risks in dependent covariates. The work offers a principled, interpretable, and trustworthy atlas framework with potential for broader clinical impact and future extensions to non-Gaussian and non-continuous covariates.

Abstract

The goal of this work is to develop principled techniques to extract information from high-dimensional data sets with complex dependencies, in areas such as medicine, that can provide insight into individual as well as population-level variation. We develop $\texttt{LucidAtlas}$, an approach that can represent spatially varying information and can capture the influence of covariates as well as population uncertainty. As a versatile atlas representation, $\texttt{LucidAtlas}$ offers robust capabilities for covariate interpretation, individualized prediction, population trend analysis, and uncertainty estimation, with the flexibility to incorporate prior knowledge. Additionally, we discuss the trustworthiness and potential risks of neural additive models for analyzing dependent covariates and then introduce a marginalization approach to explain the dependence of an individual predictor on the model's response (the atlas). To validate our method, we demonstrate its generalizability on two medical datasets. Our findings underscore the critical role of by-construction interpretable models in advancing scientific discovery. Our code will be publicly available upon acceptance.

Paper Structure

This paper contains 33 sections, 14 equations, 3 figures, 5 tables.

Figures (3)

  • Figure 1: LucidAtlas: Learning an Uncertainty-Aware, Covariate-Disentangled, Individualized Atlas Representation. ① As an example use case, we depict an airway with its anatomical landmarks at different depths (i.e., anatomical locations) along its centerline atlas \cite{hong2013pediatric}. ② During training, each subnetwork $f_i(c_i, x)$ receives the location $x$ and covariate $c_i$ as input and predicts the covariate-specific distributional parameters $f^m_i$ and $f^v_i$. These are summed to obtain the overall distributional parameters, $f^m=\sum_i f^m_i$ and $f^v=\sum_i f^v_i$, which capture the population trend and variation respectively. ③ The goal of marginalization is to obtain $p(y|c_i, x)$ by integrating out the potentially dependent covariates $\{c_k\}_{k \neq i}$. Each subnetwork $g_i(c_i)$ receives covariate $c_i$ and parameterizes a multivariate Gaussian distribution $p(\boldsymbol{c}|c_i)$ over all $N$ covariates, from which we obtain $p(c_k|c_i)$ and $p(c_{K_1}, c_{K_2}|c_i)$. The marginalization requires that the outputs of $\{f_i\}$ and $\{g_i\}$ are as described in Sec. \ref{subsubsec.how_marg}. ④ LucidAtlas provides three interpretations: 1) a covariate disentanglement corresponding to each covariate's additive effect from $\{f_i\}$; 2) the dependence between covariates, modeled by $\{g_i\}$ as $p(c_k|c_i)$; and 3) the overall impact of each covariate on the predicted response (here the cross-sectional area $y$ at a specific location $x$), obtained via marginalization. ⑤ Neural networks that are monotonic by construction are used if the influence of a covariate on the response is assumed to be monotonic based on prior domain knowledge; otherwise, a multi-layer perceptron (MLP) is used to parameterize the subnetworks.
  • Figure 2: Visualizations of Covariate Interpretations from LucidAtlas for the CSA Distribution at the Subglottis Landmark (Pediatric Airway Dataset). (1) $f_i(c_i)$ represents the disentangled covariate effect taken directly from a NAM, as illustrated in Sec. \ref{sec.dist_cov_effects}; (2) marginalized covariate interpretation without accounting for covariate dependence; (3) marginalized covariate interpretation incorporating covariate dependence. Green and purple dots indicate training and testing samples respectively. The red lines represent the learned population trend, and the gray shading spans $\pm 2$ standard deviations. Considering covariate dependence is essential for accurately capturing how each covariate influences the population trend and associated uncertainties.
  • Figure 3: Visualizations of the Effect of Prior Knowledge in LucidAtlas at the Subglottis Landmark (Pediatric Airway Dataset). The $\times$ symbol indicates that the covariate interpretation contradicts prior knowledge, such as the NAM incorrectly interpreting airway CSA as decreasing with a child's weight. Without incorporating prior knowledge, the model may deviate from our prior assumptions. Without marginalization to account for covariate dependencies, the data may not be fit well.
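
The additive structure and the marginalization described in the Figure 1 caption can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the subnetworks use random rather than trained weights, a softplus keeps each variance term positive, a fixed joint Gaussian with a closed-form conditional stands in for the learned $\{g_i\}$, and the integral over $\{c_k\}_{k \neq i}$ is approximated by Monte Carlo sampling. The names `make_subnet`, `atlas_params`, `conditional_gaussian`, and `marginalize` are hypothetical.

```python
import numpy as np

def make_subnet(rng, hidden=8):
    """Stand-in for f_i(c_i, x): a tiny MLP with random (untrained) weights."""
    W1 = rng.normal(size=(2, hidden))
    b1 = rng.normal(size=hidden)
    W2 = rng.normal(size=(hidden, 2))

    def f(c, x):
        c, x = np.broadcast_arrays(np.asarray(c, float), np.asarray(x, float))
        h = np.tanh(np.stack([c, x], axis=-1) @ W1 + b1)
        out = h @ W2
        mean_i = out[..., 0]
        # Softplus keeps each variance contribution positive (an assumption).
        var_i = np.logaddexp(0.0, out[..., 1])
        return mean_i, var_i

    return f

def atlas_params(subnets, c, x):
    """Additive parameters: f^m = sum_i f^m_i and f^v = sum_i f^v_i."""
    parts = [f(c[..., i], x) for i, f in enumerate(subnets)]
    f_m = sum(m for m, _ in parts)
    f_v = sum(v for _, v in parts)
    return f_m, f_v

def conditional_gaussian(mu, cov, i, c_i):
    """Closed-form p(c_rest | c_i) for a joint Gaussian (stand-in for g_i)."""
    rest = [k for k in range(len(mu)) if k != i]
    gain = cov[rest, i] / cov[i, i]
    mu_c = mu[rest] + gain * (c_i - mu[i])
    cov_c = cov[np.ix_(rest, rest)] - np.outer(gain, cov[i, rest])
    return rest, mu_c, cov_c

def marginalize(subnets, mu, cov, i, c_i, x, n_samples=4000, seed=0):
    """Monte Carlo estimate of the mean and variance of p(y | c_i, x)."""
    rng = np.random.default_rng(seed)
    rest, mu_c, cov_c = conditional_gaussian(mu, cov, i, c_i)
    c = np.empty((n_samples, len(mu)))
    c[:, i] = c_i
    c[:, rest] = rng.multivariate_normal(mu_c, cov_c, size=n_samples)
    f_m, f_v = atlas_params(subnets, c, x)
    # Law of total variance: expected variance plus variance of the mean.
    return f_m.mean(), f_v.mean() + f_m.var()

# Usage: three correlated, standardized covariates (illustrative values only).
rng = np.random.default_rng(0)
subnets = [make_subnet(rng) for _ in range(3)]
mu = np.zeros(3)
cov = np.array([[1.0, 0.8, 0.7],
                [0.8, 1.0, 0.75],
                [0.7, 0.75, 1.0]])
mean, var = marginalize(subnets, mu, cov, i=0, c_i=0.5, x=0.3)
print(f"marginal mean {mean:.3f}, marginal variance {var:.3f}")
```

Replacing `cov` with a diagonal matrix gives the "without covariate dependence" variant contrasted in Figure 2, since the other covariates are then sampled independently of $c_i$.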