Automatic Labels are as Effective as Manual Labels in Biomedical Images Classification with Deep Learning

Niccolò Marini; Stefano Marchesin; Lluis Borras Ferris; Simon Püttmann; Marek Wodzinski; Riccardo Fratti; Damian Podareanu; Alessandro Caputo; Svetla Boytcheva; Simona Vatrano; Filippo Fraggetta; Iris Nagtegaal; Gianmaria Silvello; Manfredo Atzori; Henning Müller

TL;DR

This study investigates whether automatic (weak) labels can match manual labels for deep learning-based whole-slide image (WSI) classification in histopathology. The authors evaluate CLAM, transMIL, and Vision Transformer architectures on over 10,000 WSIs covering celiac disease, lung cancer, and colon cancer, and identify the noise levels up to which automatic labeling remains effective. Labels generated by SKET, which extracts concepts from pathology reports, contain only 2–5% noise and yield performance comparable to manual labels. Noise thresholds of 10% (binary) or 20% (multiclass/multilabel) can guide adoption in real-world settings. Overall, automatic labeling, particularly via SKET, supports scalable and robust WSI classification while substantially reducing labeling time.

Abstract

The increasing availability of biomedical data is helping to design more robust deep learning (DL) algorithms to analyze biomedical samples. Currently, one of the main limitations in training DL algorithms for a specific task is the need for medical experts to label data. Automatic methods to label data exist; however, automatic labels can be noisy, and it is not completely clear when they can be adopted to train DL models. This paper investigates under which circumstances automatic labels can be adopted to train a DL model for the classification of Whole Slide Images (WSIs). The analysis involves multiple architectures, such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs), and over 10,000 WSIs collected from three use cases: celiac disease, lung cancer, and colon cancer, which include binary, multiclass, and multilabel data, respectively. The results identify 10% as the percentage of noisy labels that still allows training competitive models for WSI classification. Therefore, an algorithm generating automatic labels needs to meet this criterion to be adopted. Applying the Semantic Knowledge Extractor Tool (SKET) to generate automatic labels leads to performance comparable to that obtained with manual labels, since SKET produces between 2% and 5% noisy labels. Automatic labels are therefore as effective as manual ones, yielding performance comparable to that of models trained with manual labels.
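The noise criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `noise_rate` and `flip_binary_labels`, the toy labels, and the choice to treat 10% as an upper bound are assumptions made here for illustration. The sketch measures how often automatic labels disagree with manual ones, and simulates noisy binary labels at a controlled rate.

```python
import random


def noise_rate(manual, automatic):
    """Fraction of samples whose automatic label differs from the manual one."""
    if len(manual) != len(automatic):
        raise ValueError("label lists must have the same length")
    if not manual:
        raise ValueError("label lists must not be empty")
    mismatches = sum(m != a for m, a in zip(manual, automatic))
    return mismatches / len(manual)


def flip_binary_labels(labels, rate, seed=0):
    """Return a copy of 0/1 labels with round(rate * n) distinct entries flipped."""
    rng = random.Random(seed)
    n_flip = round(rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


# Assumption: 10% is read as the highest tolerable share of noisy labels.
NOISE_THRESHOLD = 0.10

manual = [0, 1] * 50                             # 100 toy binary labels
noisy = flip_binary_labels(manual, rate=0.05)    # simulate 5% label noise
rate = noise_rate(manual, noisy)
print(f"noise rate: {rate:.2%}, within threshold: {rate <= NOISE_THRESHOLD}")
```

In practice, `manual` would be a small expert-annotated subset and `automatic` the labels produced by the report-mining tool for the same slides, so the measured rate is only an estimate of the noise in the full dataset.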

Paper Structure (28 sections, 2 figures, 10 tables)

Introduction
Background
Contribution
Materials and Methods
Dataset composition
Data analysis pipeline
Computer vision architectures
CLAM
transMIL
Vision Transformer
Semantic Knowledge Extractor Tool (SKET)
Experimental Setup
Image pre-processing
Report pre-processing
Architecture pre-training
...and 13 more sections

Figures (2)

Figure 1: Overview of the tissue use cases analyzed in the paper. The top row shows examples of duodenal tissue samples, related to celiac disease. The middle row shows examples of lung tissue samples. The bottom row shows examples of colon tissue samples.
Figure 2: Overview of the data analysis pipeline proposed in the paper. It includes two steps. The first step (A) analyzes the textual reports to extract meaningful concepts that can be used as weak (automatic) labels for WSIs. The second step (B) analyzes the images with computer vision algorithms, which are transparent to the user and can be exchanged, to predict the content of the images.
