Learning image representations for anomaly detection: application to discovery of histological alterations in drug development

Igor Zingman, Birgit Stierstorfer, Charlotte Lempp, Fabian Heinemann

TL;DR

This paper tackles anomaly detection in histopathology by learning domain-adapted image representations through an auxiliary tissue-classification task across species, organs, and stains, paired with a center-loss to produce compact normal representations. A tile-based system combines a CNN encoder with a one-class SVM to detect anomalies in histology tiles and aggregates tile decisions to a whole-slide score. The approach outperforms state-of-the-art AD methods and SSL baselines on NAFLD-related liver anomalies and demonstrates potential for early toxicity screening in drug development, with ablation analyses highlighting the value of the auxiliary task, center-loss, and color-mix augmentation. The work also shows that the learned representations can match or exceed specialized NAFLD quantification methods, suggesting broad applicability to preclinical safety assessment and reduction of late-stage attrition. It further provides a public dataset of healthy-tissue tiles to support reproducibility and benchmarking in histopathology anomaly detection.

Abstract

We present a system for anomaly detection in histopathological images. In histology, normal samples are usually abundant, whereas anomalous (pathological) cases are scarce or not available. Under such settings, one-class classifiers trained on healthy data can detect out-of-distribution anomalous samples. Such approaches, combined with pre-trained Convolutional Neural Network (CNN) representations of images, were previously employed for anomaly detection (AD). However, pre-trained off-the-shelf CNN representations may not be sensitive to abnormal conditions in tissues, while natural variations of healthy tissue may result in distant representations. To adapt representations to relevant details in healthy tissue, we propose training a CNN on an auxiliary task that discriminates healthy tissue of different species, organs, and staining reagents. Almost no additional labeling workload is required, since healthy samples come automatically with the aforementioned labels. During training, we enforce compact image representations with a center-loss term, which further improves representations for AD. The proposed system outperforms established AD methods on a published dataset of liver anomalies. Moreover, it provides results comparable to those of conventional methods specifically tailored for quantification of liver anomalies. We show that our approach can be used for toxicity assessment of candidate drugs at early development stages and thereby may reduce expensive late-stage drug attrition.
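
The training objective described in the abstract, an auxiliary classification loss with an added center-loss term, can be illustrated with a minimal sketch. It assumes the commonly used form of center loss (half the squared Euclidean distance between a feature vector and the center of its class) and a weighted sum with softmax cross-entropy. The weight `lam`, the function names, and the fixed `centers` dictionary are hypothetical and chosen for illustration only; the exact formulation and hyperparameters in the paper may differ.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def combined_loss(features, logits, labels, centers, lam=0.1):
    """Mean cross-entropy plus lam times mean center loss over a batch.

    `lam` is an assumed weighting factor, not a value taken from the paper.
    `centers` maps each class label to its (here fixed) center vector.
    """
    n = len(labels)
    ce = sum(cross_entropy(z, y) for z, y in zip(logits, labels)) / n
    cl = sum(center_loss(f, centers[y]) for f, y in zip(features, labels)) / n
    return ce + lam * cl


# Toy batch: one 2-D feature belonging to class 0 of two tissue categories.
loss = combined_loss(
    features=[[1.0, 2.0]],
    logits=[[0.0, 0.0]],
    labels=[0],
    centers={0: [1.0, 0.0], 1: [0.0, 0.0]},
)
```

In a real training loop the centers would be learned or updated alongside the network weights; they are held fixed here to keep the sketch short.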

Paper Structure (23 sections, 9 equations, 8 figures, 6 tables)

Figures (8)

  • Figure 1: Anomaly detection approach. A: Learning image representations with an auxiliary supervised classification task on a set of tissue categories. The panel on the right shows a t-SNE plot of image features after training. Clusters marked with arrows correspond to the target classes for the anomaly detection task in our experiments. Each color corresponds to a particular category (a combination of species, organ, and staining). B: Training a one-class classifier on tissue of a category of interest. C: Anomaly detection in tissue of the chosen (target) category. The anomaly score $\alpha$ can be thresholded to output a binary decision.
  • Figure 2: t-SNE visualization of feature representations of images before (A) and after (B) the CNN was trained on the auxiliary task (see the section on the auxiliary task). The first two markers in the legend correspond to the test data used in the experiments section, while all others correspond to the training data (healthy tissue from different organs, species, and stainings) used for the auxiliary task.
  • Figure 3: Examples of Masson's Trichrome stained tissue images transformed to match color patterns of different tissue classes.
  • Figure 4: Distribution of anomaly scores generated by the AD system (output of the one-class classifier) for the test dataset of tissue stained with H&E. The AD system uses 320-dimensional image representations generated by Left: an EfficientNet-B0 pre-trained on ImageNet, and Right: an EfficientNet-B0 trained with the proposed techniques on our auxiliary task (BIHN model).
  • Figure 5: ROC curves for detection of tiles with NAFLD using BIHN models trained for Masson's Trichrome or H&E tissue staining. The circle on each curve corresponds to the working point of the one-class classifier.
  • ...and 3 more figures
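
The caption of Figure 1 notes that the anomaly score $\alpha$ can be thresholded to give a binary decision per tile, and the TL;DR mentions aggregating tile decisions into a whole-slide score. A minimal sketch of both steps follows. It assumes that a higher score means "more anomalous", that the threshold is 0, and that the slide score is the fraction of tiles flagged as anomalous. These conventions and the function names are illustrative assumptions, not details confirmed by the text above.

```python
def tile_decisions(scores, threshold=0.0):
    """Binary decision per tile: True if the anomaly score exceeds the threshold.

    Assumes higher scores indicate more anomalous tiles (an assumed convention).
    """
    return [s > threshold for s in scores]


def slide_score(scores, threshold=0.0):
    """Fraction of tiles on a slide flagged as anomalous (assumed aggregation rule)."""
    decisions = tile_decisions(scores, threshold)
    if not decisions:
        raise ValueError("slide has no tiles")
    return sum(decisions) / len(decisions)


# Four hypothetical tile scores from one slide; two of them exceed the threshold.
example_scores = [-1.0, 0.5, 2.0, -0.2]
example_fraction = slide_score(example_scores)
```

Moving the threshold shifts the working point along the ROC curve, trading sensitivity against the false-positive rate.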