Modality-Agnostic Style Transfer for Holistic Feature Imputation
Seunghun Baek, Jaeyoon Sim, Mustafa Dere, Minjeong Kim, Guorong Wu, Won Hwa Kim
TL;DR
This work tackles missing data in multi-modality neuroimaging for Alzheimer's disease by learning a modality-agnostic content embedding and modality-specific style generators to impute unobserved measures across $S$ modalities. It introduces a two-phase framework: Phase 1 uses domain adversarial training to extract content invariant to modality, while Phase 2 trains per-modality generators to inject target modality style without altering content, balancing realism and content preservation. On ADNI data, Cohen's $d$ averaged across ROIs was $0.188$ for the proposed method, substantially lower than $0.407$ (cGAN) and $0.261$ (WGAN), and the approach reduces the generator count from $S^2$ to $S$, enabling robust training on limited data. Imputed data improved downstream MCI classification performance across 2–4-layer MLPs, demonstrating practical utility and suggesting applicability to other neuroimaging datasets with missing modalities.
Abstract
Characterizing the preclinical stage of Alzheimer's Disease (AD) from a single imaging modality is difficult because its early symptoms are subtle. Many neuroimaging studies therefore collect several imaging modalities, e.g., MRI and PET. However, acquiring every modality from every subject is often challenging, so missing data become inevitable. In this paper, we propose a framework that generates unobserved imaging measures for a subject from that subject's existing measures, thereby reducing the need for additional examinations. Our framework transfers modality-specific style while preserving AD-specific content. Domain adversarial training extracts information that is modality-agnostic but AD-specific, while a generative adversarial network adds a modality-specific style that is indistinguishable from that of real data. We evaluate the framework on the Alzheimer's Disease Neuroimaging Initiative (ADNI) study and compare it with other imputation methods in terms of generated data quality. A small average Cohen's $d$ $< 0.19$ between our generated measures and real ones suggests that the synthetic data are practically usable regardless of their modality type.
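
To make the evaluation metric concrete, the following is a minimal sketch of how a per-ROI Cohen's $d$ between real and generated measures could be computed and then averaged. It assumes the common pooled-standard-deviation form of Cohen's $d$ and assumes that absolute values are averaged across ROIs; the text above does not state which variant was used. The function names `cohens_d` and `mean_abs_cohens_d` are hypothetical and chosen only for illustration.

```python
import numpy as np


def cohens_d(real, generated):
    """Per-ROI Cohen's d between two groups of measures.

    real, generated: arrays of shape (n_subjects, n_rois).
    Assumption: pooled standard deviation with unbiased (ddof=1) variances.
    """
    real = np.asarray(real, dtype=float)
    generated = np.asarray(generated, dtype=float)
    n1, n2 = real.shape[0], generated.shape[0]

    # Unbiased sample variance of each ROI (column) in each group.
    var1 = real.var(axis=0, ddof=1)
    var2 = generated.var(axis=0, ddof=1)

    # Pooled standard deviation across the two groups.
    pooled_sd = np.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))

    # Standardized mean difference, one value per ROI.
    return (real.mean(axis=0) - generated.mean(axis=0)) / pooled_sd


def mean_abs_cohens_d(real, generated):
    """Average of |d| over ROIs (assumed aggregation, not confirmed by the source)."""
    return float(np.mean(np.abs(cohens_d(real, generated))))
```

Under this definition, a value near zero for an ROI means that the generated and real distributions have nearly the same mean relative to their spread, which is the sense in which a small average $d$ indicates similarity between synthetic and real measures.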
