
MIL-PF: Multiple Instance Learning on Precomputed Features for Mammography Classification

Nikola Jovišić, Milica Škipina, Nicola Dall'Asen, Dubravko Ćulibrk

TL;DR

Multiple Instance Learning on Precomputed Features (MIL-PF), a scalable framework that combines frozen foundation encoders with a lightweight MIL head for mammography classification, achieves state-of-the-art classification performance at clinical scale while substantially reducing training complexity.

Abstract

Modern foundation models provide highly expressive visual representations, yet adapting them to high-resolution medical imaging remains challenging due to limited annotations and weak supervision. Mammography, in particular, is characterized by large images, variable multi-view studies and predominantly breast-level labels, making end-to-end fine-tuning computationally expensive and often impractical. We propose Multiple Instance Learning on Precomputed Features (MIL-PF), a scalable framework that combines frozen foundation encoders with a lightweight MIL head for mammography classification. By precomputing the semantic representations and training only a small task-specific aggregation module (40k parameters), the method enables efficient experimentation and adaptation without retraining large backbones. The architecture explicitly models the global tissue context and the sparse local lesion signals through attention-based aggregation. MIL-PF achieves state-of-the-art classification performance at clinical scale while substantially reducing training complexity. We release the code for full reproducibility.
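To make the idea of attention-based aggregation over precomputed features concrete, here is a minimal sketch of a generic attention MIL pooling head in NumPy. It is not the authors' architecture: the abstract describes separate handling of global tissue context and local lesion signals, which this sketch does not attempt to reproduce. The function names, the dimensions `D` and `H`, the random weights, and the random "embeddings" are all assumptions chosen for illustration; in practice the bag would hold features produced once by a frozen encoder, and only the head's parameters would be trained.

```python
import numpy as np

def attention_mil_pool(instances, V, w):
    """Pool a bag of instance embeddings (K, D) into a single vector (D,).

    Each instance gets a scalar score w^T tanh(V^T h_k); a softmax over the
    bag turns the scores into weights, and the bag embedding is the weighted
    sum of the instances.
    """
    scores = np.tanh(instances @ V) @ w      # shape (K,)
    scores = scores - scores.max()           # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ instances, weights

def bag_probability(instances, V, w, u, b):
    """Map a bag of embeddings to one probability with a linear classifier."""
    pooled, weights = attention_mil_pool(instances, V, w)
    logit = pooled @ u + b
    return 1.0 / (1.0 + np.exp(-logit)), weights

# Hypothetical sizes: D-dimensional embeddings, H hidden attention units.
D, H = 64, 16
rng = np.random.default_rng(0)
V = rng.normal(scale=0.1, size=(D, H))   # attention projection
w = rng.normal(scale=0.1, size=H)        # attention scoring vector
u = rng.normal(scale=0.1, size=D)        # classifier weights
b = 0.0                                  # classifier bias

# Stand-in for precomputed features: a bag of 5 instance embeddings.
bag = rng.normal(size=(5, D))
prob, weights = bag_probability(bag, V, w, u, b)

# Only the head is trainable; its size is independent of the encoder.
n_head_params = V.size + w.size + u.size + 1
```

Because the pooling is a softmax-weighted sum, the head accepts bags of any size and is invariant to instance order, which suits studies with a variable number of views. The attention weights also give a per-instance importance score that can be inspected after prediction.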

Paper Structure

This paper contains 18 sections, 5 equations, 2 figures, and 4 tables.

Figures (2)

  • Figure 1: The MIL-PF pipeline, divided into two stages: Feature Precomputing, which produces the embeddings dataset $\mathcal{E}$, and Head Training, which uses it.
  • Figure 2: Attention maps for masses and calcifications, obtained by overlapping MIL-PF's local stream window. Ground truth is shown in green and predictions in red.