RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images

Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

TL;DR

RGMIM can mask more valid regions, facilitating the learning of discriminative representations and the subsequent high-accuracy lung disease detection.

Abstract

In this study, we propose a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representations from X-ray images. Our method adopts a new masking strategy that uses organ mask information to identify valid regions, so that more meaningful representations are learned. We conduct quantitative evaluations on an open lung X-ray image dataset, as well as studies of the masking ratio hyperparameter. When using the entire training set, RGMIM outperformed other comparable methods, achieving a lung disease detection accuracy of 0.962. RGMIM also significantly improved performance over other methods at small data volumes, such as 5% and 10% of the training set. RGMIM can mask more valid regions, which facilitates the learning of discriminative representations and the subsequent high-accuracy lung disease detection. In these experiments, RGMIM outperforms other state-of-the-art self-supervised learning methods, particularly when limited training data is used.
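
The masking idea described in the abstract can be illustrated with a minimal sketch. It assumes a binary lung mask is already available for each image and that a patch counts as "valid" when enough of its pixels fall inside the lung region. The function names, the `valid_threshold` parameter, the default values, and the fallback to non-lung patches are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import numpy as np


def patch_lung_fraction(lung_mask, patch_size):
    """Fraction of lung pixels in each non-overlapping patch, in row-major order."""
    h, w = lung_mask.shape
    gh, gw = h // patch_size, w // patch_size
    # Crop to a whole number of patches, then split into a (gh, gw) grid of blocks.
    blocks = lung_mask[: gh * patch_size, : gw * patch_size].reshape(
        gh, patch_size, gw, patch_size
    )
    return blocks.mean(axis=(1, 3)).reshape(-1)


def region_guided_mask(lung_mask, patch_size=16, mask_ratio=0.5,
                       valid_threshold=0.5, rng=None):
    """Return a boolean vector marking which patches to mask.

    Patches that overlap the lung region ("valid" patches) are masked first.
    Assumption: if there are too few valid patches to reach the requested
    ratio, the remainder is drawn at random from the other patches.
    """
    rng = np.random.default_rng() if rng is None else rng
    frac = patch_lung_fraction(np.asarray(lung_mask, dtype=np.float32), patch_size)

    valid = np.flatnonzero(frac >= valid_threshold)
    others = np.flatnonzero(frac < valid_threshold)
    n_mask = int(round(mask_ratio * frac.size))

    # Randomly pick among valid patches, up to the masking budget.
    chosen = rng.permutation(valid)[:n_mask]
    if chosen.size < n_mask:
        extra = rng.permutation(others)[: n_mask - chosen.size]
        chosen = np.concatenate([chosen, extra])

    mask = np.zeros(frac.size, dtype=bool)
    mask[chosen] = True
    return mask


# Toy example: a 64x64 image whose "lung" occupies the central 32x32 square.
toy_lung = np.zeros((64, 64))
toy_lung[16:48, 16:48] = 1
toy_mask = region_guided_mask(toy_lung, patch_size=16, mask_ratio=0.25,
                              rng=np.random.default_rng(0))
print(toy_mask.reshape(4, 4).astype(int))
```

In this toy case the masking budget is four patches, and exactly four patches lie inside the lung region, so all masked patches are lung patches. Purely random masking at the same ratio would typically spend part of that budget on background patches.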

Paper Structure

This paper contains 10 sections, 5 equations, 3 figures, and 3 tables.

Figures (3)

  • Figure 1: Comparison of the random masking and the proposed region-guided masking.
  • Figure 2: Overview of RGMIM. The left shows the pipeline and the right shows the structure of the ViT encoder.
  • Figure 3: Lung disease classification accuracy as the number of fine-tuning epochs increases. All methods use the ViT-Base model.