Learning Low-Rank Feature for Thorax Disease Classification

Rajeev Goel, Utkarsh Nath, Yancheng Wang, Alvin C. Silva, Teresa Wu, Yingzhen Yang

TL;DR

The paper tackles thorax disease classification from chest X-rays by introducing Low-Rank Feature Learning (LRFL), a regularization framework that promotes low-rank, disease-focused features via a truncated nuclear-norm term appended to a linear classifier trained on MAE-pretrained backbones. It provides a theoretical sharp generalization bound for LRFL and a separable SGD-friendly approximation to optimize the regularizer, enabling practical use with CNNs and ViTs. Empirically, LRFL yields state-of-the-art or competitive results on NIH ChestX-ray14, COVIDx, and CheXpert, with notable gains in low-data regimes and improved localization of disease regions as shown by Grad-CAM. The approach is efficient, broadly applicable to various network architectures, and has practical impact for robust radiographic disease classification under noise and background variation.
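As a rough illustration of the regularizer named above, the sketch below computes a truncated nuclear norm of a feature matrix, taken here to be the sum of its singular values beyond the top `rank`. That definition is the usual one and is an assumption, since this summary does not spell out the paper's exact form. The function name and the toy matrices are hypothetical, and a full SVD is used for clarity rather than the separable SGD-friendly approximation the paper describes.

```python
import numpy as np


def truncated_nuclear_norm(features: np.ndarray, rank: int) -> float:
    """Sum of the singular values of `features` beyond the top `rank` (assumed definition)."""
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(features, compute_uv=False)
    return float(singular_values[rank:].sum())


rng = np.random.default_rng(0)
# Hypothetical feature matrix: 64 samples, 16 dimensions, exact rank 4.
low_rank = rng.normal(size=(64, 4)) @ rng.normal(size=(4, 16))
# The same features with small dense noise added, which raises the rank.
noisy = low_rank + 0.1 * rng.normal(size=(64, 16))

penalty_clean = truncated_nuclear_norm(low_rank, rank=4)
penalty_noisy = truncated_nuclear_norm(noisy, rank=4)
```

Under this definition the penalty is numerically zero for features that are already rank 4 and grows once noise spreads energy into the trailing singular values, which is the behavior a low-rank regularizer is meant to discourage.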

Abstract

Deep neural networks, including Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), have achieved striking success in the medical imaging domain. We study thorax disease classification in this paper. Effective extraction of features for the disease areas is crucial for disease classification on radiographic images. While various neural architectures and training techniques, such as self-supervised learning with contrastive/restorative learning, have been employed for disease classification on radiographic images, there are no principled methods that effectively reduce the adverse effect of noise and background, or non-disease areas, on the radiographic images for disease classification. To address this challenge, we propose a novel Low-Rank Feature Learning (LRFL) method, which is universally applicable to the training of all neural networks. The LRFL method is both empirically motivated by the low frequency property observed on all the medical datasets in this paper, and theoretically motivated by our sharp generalization bound for neural networks with low-rank features. In the empirical study, a neural network such as a ViT or a CNN is first pre-trained on unlabeled chest X-rays by Masked Autoencoders (MAE); our LRFL method is then applied to the pre-trained network and demonstrates better classification results in terms of both multiclass area under the receiver operating curve (mAUC) and classification accuracy.

Paper Structure (24 sections, 1 theorem, 3 equations, 6 figures, 8 tables, 1 algorithm)

Key Result

Theorem 3.1

For every $x > 0$, with probability at least $1-\exp(-x)$, after the $t$-th iteration of gradient descent for all $t \ge 1$, we have

$$L_{\mathcal{D}}(\mathrm{NN}_{\mathbf{W}}) \le \left\|\mathbf{Y} - \bar{\mathbf{Y}}\right\|_F + c_1 \left(1-\eta \lambda_r\right)^{2t} \left\|\mathbf{Y}\right\|_F^2 + c_2 \min_{h \in [0,r]} \left( \frac{h}{n} + \sqrt{\frac{1}{n} \sum_{i=h+1}^{r} \lambda_i} \right) + \frac{c_3 x}{n},$$

where $c_1, c_2, c_3$ ...
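To get a feel for the eigenvalue-dependent term of the bound, the sketch below evaluates $\min_{h \in [0,r]} \left( h/n + \sqrt{\tfrac{1}{n}\sum_{i=h+1}^{r} \lambda_i} \right)$ for a given spectrum. Reading the operator as a minimum and taking the square root over the whole averaged tail sum are assumptions drawn from the extracted formula. The function name and the toy eigenvalues are hypothetical.

```python
import numpy as np


def complexity_term(eigenvalues, n: int, r: int) -> float:
    """Evaluate min over h in {0, ..., r} of h/n + sqrt((1/n) * sum_{i=h+1}^{r} lambda_i)."""
    # Keep the top-r eigenvalues, assumed sorted in descending order.
    lam = np.asarray(eigenvalues, dtype=float)[:r]
    # tail[h] = lambda_{h+1} + ... + lambda_r, with tail[r] = 0.
    tail = np.concatenate([np.cumsum(lam[::-1])[::-1], [0.0]])
    h = np.arange(len(tail))
    return float(np.min(h / n + np.sqrt(tail / n)))


# Hypothetical spectrum with fast decay, n = 100 samples, rank r = 3.
value = complexity_term([4.0, 1.0, 0.25], n=100, r=3)
```

With this toy spectrum the minimum is attained at $h = 3$, where the tail sum vanishes and the term reduces to $r/n$, so a smaller rank $r$ directly tightens this part of the bound.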

Figures (6)

  • Figure 1: Training Pipeline for Thorax Disease Classification.
  • Figure 2: Eigen-projection (first column) and signal concentration ratio (second column) of ViT-Base/16 on NIH ChestX-ray14, COVIDx, and CheXpert. To compute the eigen-projection, we first calculate the eigenvectors $\mathbf{U}$ of the kernel Gram matrix $\mathbf{K} \in \mathbb{R}^{n \times n}$ computed from a feature matrix $\mathbf{F} \in \mathbb{R}^{n \times d}$; the projection value is then computed as $\mathbf{p} = \frac{1}{C}\sum_{c=1}^{C} {\left\|{\mathbf{U}}^{\top} \mathbf{Y}^{(c)}\right\|}_{2}^2/ {\left\| \mathbf{Y}^{(c)}\right\|}_{2}^2 \in \mathbb{R}^n$, where $C$ is the number of classes, $\mathbf{Y}\in\{0,1\}^{n \times C}$ contains the one-hot labels of all the training data, and $\mathbf{Y}^{(c)}$ is the $c$-th column of $\mathbf{Y}$. The eigen-projection $\mathbf{p}_{r}$ for $r \in [\min(n,d)]$ reflects the amount of the signal projected onto the $r$-th eigenvector of $\mathbf{K}$, and the signal concentration ratio at rank $r$ reflects the proportion of the signal projected onto the top $r$ eigenvectors of $\mathbf{K}$. The signal concentration ratio at rank $r$ is computed as ${\left\|\mathbf{p}^{(1:r)}\right\|}_{2}$, where $\mathbf{p}^{(1:r)}$ contains the first $r$ elements of $\mathbf{p}$. For example, at rank $r=38$, the signal concentration ratios of $\mathbf{Y}$ on NIH ChestX-ray14, COVIDx, and CheXpert are $0.959$, $0.964$, and $0.962$, respectively.
  • Figure 3: Grad-CAM visualization results on NIH ChestX-ray14. The figures in the first row are the visualization results of ViT-Base, and the figures in the second row are the visualization results of Low-Rank ViT-Base. The ground-truth bounding box for each disease is shown in green. Although both the base model and its corresponding low-rank model predict the correct disease label, the low-rank model pays more attention to the disease location than the base model. More Grad-CAM visualization results are deferred to the supplementary.
  • Figure 4: Eigenvalues comparison between ViT-B-LR and ViT-B on ChestX-ray14, COVIDx, and CheXpert.
  • Figure 5: The relationship between mAUC and rank $T$ of ViT-B-LR on ChestX-ray14.
  • ...and 1 more figure
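The eigen-projection and signal concentration ratio described in the Figure 2 caption can be sketched as follows. This is a minimal illustration under stated assumptions: the Gram matrix is taken to be the linear kernel $\mathbf{K} = \mathbf{F}\mathbf{F}^{\top}$, the squared norm in the caption is read elementwise so that $\mathbf{p} \in \mathbb{R}^n$, and the ratio is the $\ell_2$ norm of the first $r$ entries exactly as the caption writes it. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def eigen_projection(features: np.ndarray, labels_one_hot: np.ndarray) -> np.ndarray:
    """Class-averaged squared projection of the labels onto each eigenvector of K."""
    gram = features @ features.T  # assumed linear kernel, K = F F^T, shape (n, n)
    eigenvalues, eigenvectors = np.linalg.eigh(gram)  # eigh returns ascending order
    order = np.argsort(eigenvalues)[::-1]  # reorder so the largest eigenvalue is first
    u = eigenvectors[:, order]
    coeffs = u.T @ labels_one_hot  # shape (n, C): projection of each label column
    col_norms_sq = np.sum(labels_one_hot ** 2, axis=0)  # ||Y^(c)||_2^2 for each class
    # Normalize each class column, then average over the C classes.
    return np.mean(coeffs ** 2 / col_norms_sq, axis=1)


def signal_concentration_ratio(p: np.ndarray, r: int) -> float:
    """l2 norm of the first r entries of p, following the caption's formula."""
    return float(np.linalg.norm(p[:r]))


rng = np.random.default_rng(0)
n, d, num_classes = 60, 8, 3
labels = np.arange(n) % num_classes  # every class appears, so no column of Y is zero
y = np.eye(num_classes)[labels]
# Synthetic features: a class-dependent signal plus small noise.
f = y @ rng.normal(size=(num_classes, d)) + 0.05 * rng.normal(size=(n, d))

p = eigen_projection(f, y)
ratio_top3 = signal_concentration_ratio(p, 3)
ratio_all = signal_concentration_ratio(p, n)
```

Because the eigenvectors form an orthonormal basis, the entries of `p` are nonnegative and sum to one under these assumptions, and the ratio can only grow as `r` increases.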

Theorems & Definitions (3)

  • Theorem 3.1
  • Remark 3.2
  • Proof of Theorem 3.1