Self-Paced Learning for Images of Antinuclear Antibodies

Yiyang Jiang, Guangwu Qian, Jiaxin Wu, Qi Huang, Qing Li, Yongkang Wu, Xiao-Yong Wei

TL;DR

The paper tackles ANA detection in real-world clinical settings, reframing it as a multi-instance multi-label (MIML) problem on unaltered microscope images. It introduces a three-component framework (Instance Sampler, Probabilistic Pseudo-label Dispatcher, and Self-Paced Loss) built around per-instance weights to identify informative sub-regions and to supervise instance-level learning without instance-level ground truth. Empirically, the method achieves state-of-the-art results on a real ANA dataset (up to +7.0% F1-Macro and +12.6% mAP over the best prior method) and ranks top-2 on all key metrics across three public MIML medical benchmarks, reducing Hamming loss and one-error by up to 18.2% and 26.9%, respectively. The approach generalizes across these medical imaging tasks and is released as open source, which supports its practical use for clinical automation and AI-assisted diagnosis.
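
For readers unfamiliar with self-paced learning, the sketch below shows the classic hard-threshold formulation, in which a sample contributes to the loss only while its current loss is below a pace parameter. This is a generic illustration and not the paper's Self-Paced Loss, which uses learnable per-instance weights and soft pseudo-labels. The function names and the `lam` parameter are assumptions made for this example.

```python
from typing import List


def self_paced_weights(losses: List[float], lam: float) -> List[float]:
    """Classic hard self-paced weights: keep a sample only if its loss < lam.

    `lam` is the pace parameter; raising it over training admits harder samples.
    (Generic textbook form, assumed here for illustration.)
    """
    return [1.0 if loss < lam else 0.0 for loss in losses]


def weighted_mean_loss(losses: List[float], weights: List[float]) -> float:
    """Average the losses of the samples that are currently selected."""
    kept = sum(weights)
    if kept == 0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / kept


if __name__ == "__main__":
    per_sample_losses = [0.2, 1.5, 0.7]
    weights = self_paced_weights(per_sample_losses, lam=1.0)
    print(weights, weighted_mean_loss(per_sample_losses, weights))
```

With a small `lam` only easy samples are used. Gradually increasing it lets harder samples enter training, which is the "easy first" curriculum that self-paced methods build on.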

Abstract

Antinuclear antibody (ANA) testing is a crucial method for diagnosing autoimmune disorders, including lupus, Sjögren's syndrome, and scleroderma. Despite its importance, manual ANA detection is slow, labor-intensive, and demands years of training. ANA detection is complicated by over 100 coexisting antibody types, resulting in vast fluorescent pattern combinations. Although machine learning and deep learning have enabled automation, ANA detection in real-world clinical settings presents unique challenges as it involves multi-instance, multi-label (MIML) learning. In this paper, a novel framework for ANA detection is proposed that handles the complexities of MIML tasks using unaltered microscope images without manual preprocessing. Inspired by human labeling logic, it identifies consistent ANA sub-regions and assigns aggregated labels accordingly. These steps are implemented using three task-specific components: an instance sampler, a probabilistic pseudo-label dispatcher, and self-paced weight learning rate coefficients. The instance sampler suppresses low-confidence instances by modeling pattern confidence, while the dispatcher adaptively assigns labels based on instance distinguishability. Self-paced learning adjusts training according to empirical label observations. Our framework overcomes limitations of traditional MIML methods and supports end-to-end optimization. Extensive experiments on one ANA dataset and three public medical MIML benchmarks demonstrate the superiority of our framework. On the ANA dataset, our model achieves up to +7.0% F1-Macro and +12.6% mAP gains over the best prior method, setting new state-of-the-art results. It also ranks top-2 across all key metrics on public datasets, reducing Hamming loss and one-error by up to 18.2% and 26.9%, respectively. The source code can be accessed at https://github.com/fletcherjiang/ANA-SelfPacedLearning.
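
The abstract reports gains in Hamming loss and one-error. As a reminder of what these two multi-label metrics measure, here is a minimal pure-Python sketch using their standard definitions. The function names and toy data are assumptions for illustration and are not taken from the paper's code.

```python
from typing import List


def hamming_loss(y_true: List[List[int]], y_pred: List[List[int]]) -> float:
    """Fraction of (sample, label) slots where the binary prediction is wrong."""
    wrong = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            wrong += int(t != p)
            total += 1
    return wrong / total


def one_error(y_true: List[List[int]], scores: List[List[float]]) -> float:
    """Fraction of samples whose highest-scored label is not a true label."""
    misses = 0
    for true_row, score_row in zip(y_true, scores):
        top = max(range(len(score_row)), key=lambda j: score_row[j])
        misses += int(true_row[top] == 0)
    return misses / len(y_true)


if __name__ == "__main__":
    y_true = [[1, 0, 1], [0, 1, 0]]           # toy ground-truth label sets
    y_pred = [[1, 0, 0], [0, 1, 1]]           # toy thresholded predictions
    scores = [[0.9, 0.2, 0.4], [0.7, 0.6, 0.1]]  # toy per-label scores
    print(hamming_loss(y_true, y_pred), one_error(y_true, scores))
```

Both metrics are "lower is better": Hamming loss penalizes every wrong label slot, while one-error checks only whether the top-ranked label is correct.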

Paper Structure

This paper contains 25 sections, 12 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Examples of ANA images and visualization of the multi-label multi-instance challenge.
  • Figure 2: Overview of the proposed architecture. The framework begins by extracting instances from an input ANA image; the symbols (▲, ●, ✯, ◆) represent different ANA patterns detected in sub-regions of the image, and each cell in the grid is a local patch containing one or more ANA patterns. These instances are then evaluated by an Instance Sampler, which uses learnable confidence weights to select the most representative instances for training. A CNN model provides initial predictions on the chosen instances, and a Pseudo-Label Dispatcher generates soft, continuous pseudo-labels from both the image-level labels and the learnable confidence weights. The Dynamic Self-Paced Loss uses these pseudo-labels to adaptively emphasize more reliable instances and guide the network's learning. Finally, for a new image, all sub-region predictions are aggregated to produce the final multi-label ANA classification result.
  • Figure 3: Four different learning frameworks.
  • Figure 4: (a) Training loss curves comparing the model with and without the instance sampler. (b) Performance results under different sampling strategies: Random, Uniform, and OHEM.
  • Figure 5: Grad-CAM visualization for different patch sizes. Patch size of $448\times448$ demonstrates the most focused activation on foreground ANA patterns.
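
The first and last steps described in the Figure 2 caption (cutting an image into grid patches, then aggregating per-patch predictions into image-level labels) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the non-overlapping grid, the max-pooling aggregation rule, the 0.5 threshold, and all function names are hypothetical. Only the 448×448 patch size comes from the Figure 5 caption.

```python
from typing import List, Tuple


def grid_patches(height: int, width: int,
                 patch: int = 448, stride: int = 448) -> List[Tuple[int, int]]:
    """Return (top, left) corners of all full patches that fit in the image.

    Assumption: a regular grid; partial patches at the border are dropped.
    """
    return [(top, left)
            for top in range(0, height - patch + 1, stride)
            for left in range(0, width - patch + 1, stride)]


def aggregate_max(instance_probs: List[List[float]],
                  threshold: float = 0.5) -> Tuple[List[float], List[int]]:
    """Pool per-patch label probabilities into image-level labels.

    Assumption: max-pooling over patches, i.e. a label is predicted for the
    image if at least one patch predicts it with probability >= threshold.
    """
    num_labels = len(instance_probs[0])
    pooled = [max(p[j] for p in instance_probs) for j in range(num_labels)]
    labels = [int(v >= threshold) for v in pooled]
    return pooled, labels


if __name__ == "__main__":
    corners = grid_patches(896, 1344)  # toy image size
    print(len(corners), corners[0], corners[-1])
    # Toy per-patch probabilities for two labels.
    print(aggregate_max([[0.1, 0.8], [0.3, 0.2]]))
```

Max-pooling is only one common MIML aggregation choice; mean or confidence-weighted pooling would fit the same interface.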