
MAESIL: Masked Autoencoder for Enhanced Self-supervised Medical Image Learning

Kyeonghun Kim, Hyeonseok Jung, Youngung Han, Junsu Lim, YeonJu Jean, Seongbin Park, Eunseob Choi, Hyunsu Go, SeoYoung Ju, Seohyoung Park, Gyeongmin Kim, MinJu Kwon, KyungSeok Yuh, Soo Yong Kim, Ken Ying-Kai Liao, Nam-Joon Kim, Hyuk-Jae Lee

Abstract

Training deep learning models for three-dimensional (3D) medical imaging, such as Computed Tomography (CT), is fundamentally challenged by the scarcity of labeled data. While pre-training on natural images is common, it results in a significant domain shift, limiting performance. Self-Supervised Learning (SSL) on unlabeled medical data has emerged as a powerful solution, but prominent frameworks often fail to exploit the inherent 3D nature of CT scans. These methods typically process 3D scans as a collection of independent 2D slices, an approach that fundamentally discards critical axial coherence and 3D structural context. To address this limitation, we propose the Masked Autoencoder for Enhanced Self-supervised medical Image Learning (MAESIL), a novel self-supervised learning framework designed to capture 3D structural information efficiently. The core innovation is the 'superpatch', a 3D chunk-based input unit that balances 3D context preservation with computational efficiency. Our framework partitions the volume into superpatches and employs a 3D masked autoencoder with a dual-masking strategy to learn comprehensive spatial representations. We validated our approach on three diverse large-scale public CT datasets. Our experimental results show that MAESIL achieves significant improvements over existing methods such as AE, VAE, and VQ-VAE on key reconstruction metrics, including PSNR and SSIM. This establishes MAESIL as a robust and practical pre-training solution for 3D medical imaging tasks.


Paper Structure

This paper contains 11 sections, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Overview of the proposed MAESIL framework, which masks a high ratio of an input 3D superpatch, feeds only the visible portions to an encoder, and uses a decoder to reconstruct the original superpatch from the encoded information and learnable mask tokens.
  • Figure 2: The 3D input processing pipeline. A full 3D CT scan is first partitioned into 'superpatches'. Each superpatch is then densely tokenized into smaller 'standardized patches', which serve as the input tokens for the Transformer.
  • Figure 3: Qualitative reconstruction results of MAESIL across different anatomical views, comparing the original input with the reconstructed output. From left to right: Axial, Coronal, and Sagittal views.
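The pipeline described in Figures 1 and 2 (partition a CT volume into superpatches, densely tokenize each superpatch into standardized patches, then mask a high ratio of tokens before encoding) can be sketched as below. This is a minimal NumPy illustration, not the authors' implementation: the 64³ volume, 32³ superpatch, 8³ patch, and 0.75 mask ratio are illustrative assumptions, and the dual-masking strategy is simplified to a single random mask.

```python
import numpy as np

def partition_3d(volume, chunk):
    """Split a 3D array into non-overlapping chunks of shape `chunk`.

    Used both for volume -> superpatches and superpatch -> patches.
    """
    D, H, W = volume.shape
    d, h, w = chunk
    assert D % d == 0 and H % h == 0 and W % w == 0, "chunk must tile the volume"
    x = volume.reshape(D // d, d, H // h, h, W // w, w)
    x = x.transpose(0, 2, 4, 1, 3, 5)            # (nd, nh, nw, d, h, w)
    return x.reshape(-1, d, h, w)                # one row per chunk

def tokenize_superpatch(superpatch, patch):
    """Densely tokenize a superpatch into flattened 'standardized patches'."""
    patches = partition_3d(superpatch, patch)
    return patches.reshape(patches.shape[0], -1)  # (num_tokens, patch_voxels)

def random_mask(tokens, mask_ratio=0.75, seed=None):
    """Keep a random subset of tokens (MAE-style high-ratio masking).

    Only the visible tokens would be fed to the encoder; the decoder
    reconstructs the rest from learnable mask tokens.
    """
    rng = np.random.default_rng(seed)
    n_keep = int(tokens.shape[0] * (1 - mask_ratio))
    keep_idx = np.sort(rng.permutation(tokens.shape[0])[:n_keep])
    return tokens[keep_idx], keep_idx

# Example: a 64^3 CT chunk -> eight 32^3 superpatches -> 8^3 patch tokens.
vol = np.random.rand(64, 64, 64).astype(np.float32)
superpatches = partition_3d(vol, (32, 32, 32))            # (8, 32, 32, 32)
tokens = tokenize_superpatch(superpatches[0], (8, 8, 8))  # (64, 512)
visible, keep_idx = random_mask(tokens, mask_ratio=0.75)  # (16, 512)
```

With a 0.75 mask ratio, only 16 of the 64 patch tokens per superpatch reach the encoder, which is what makes high-ratio masked pre-training computationally cheap relative to encoding the full volume.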