PipeMFL-240K: A Large-scale Dataset and Benchmark for Object Detection in Pipeline Magnetic Flux Leakage Imaging

Tianyi Qu; Songxiao Yang; Haolin Wang; Huadong Song; Xiaoting Guo; Wenguang Hu; Guanlin Liu; Honghe Chen; Yafei Ou

TL;DR

PipeMFL-240K provides the first large-scale public dataset and benchmark for object detection in pipeline MFL imaging, addressing the lack of public data and the need to evaluate models under real-world inspection conditions. The dataset comprises 240,320 pseudo-color MFL images with 191,530 bounding boxes across 12 categories, reflecting extreme long-tail distributions, tiny targets, and strong contextual cues tied to pipe geometry and scene type. A comprehensive set of experiments across CNN, YOLO, and transformer-based detectors reveals that current methods struggle with rare and small defects, confirming substantial headroom for improvement and highlighting the importance of incorporating domain priors and multi-context modeling. The work provides a critical, reproducible foundation for robust industrial-grade MFL interpretation and maintenance planning, with data and code accessible to the community through public repositories and a DOI-backed dataset hub.

Abstract

Pipeline integrity is critical to industrial safety and environmental protection, with Magnetic Flux Leakage (MFL) detection being a primary non-destructive testing technology. Despite the promise of deep learning for automating MFL interpretation, progress toward reliable models has been constrained by the absence of a large-scale public dataset and benchmark, making fair comparison and reproducible evaluation difficult. We introduce PipeMFL-240K, a large-scale, meticulously annotated dataset and benchmark for complex object detection in pipeline MFL pseudo-color images. PipeMFL-240K reflects real-world inspection complexity and poses several unique challenges: (i) an extremely long-tailed distribution over 12 categories, (ii) a high prevalence of tiny objects that often comprise only a handful of pixels, and (iii) substantial intra-class variability. The dataset contains 240,320 images and 191,530 high-quality bounding-box annotations, collected from 11 pipelines spanning approximately 1,480 km. Extensive experiments are conducted with state-of-the-art object detectors to establish baselines. Results show that modern detectors still struggle with the intrinsic properties of MFL data, highlighting considerable headroom for improvement, while PipeMFL-240K provides a reliable and challenging testbed to drive future research. As the first public dataset and the first benchmark of this scale and scope for pipeline MFL inspection, it provides a critical foundation for efficient pipeline diagnostics and maintenance planning, and is expected to accelerate algorithmic innovation and reproducible research in MFL-based pipeline integrity assessment.
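The two data properties named in the abstract, a long-tailed category distribution and many tiny objects, can be quantified with a few lines of Python. The following is a minimal sketch only: the annotation layout (a list of `(category, width_px, height_px)` tuples), the function name `long_tail_stats`, and the 32 x 32 pixel "tiny" threshold are assumptions made for illustration and are not taken from the dataset's actual format or the paper's definition. The toy values are invented and are not dataset statistics.

```python
from collections import Counter


def long_tail_stats(annotations, tiny_area=32 * 32):
    """Summarise class imbalance and tiny-object prevalence.

    annotations: list of (category, width_px, height_px) tuples.
    This layout is hypothetical, not the dataset's real schema.
    tiny_area: area threshold in pixels^2 below which a box counts
    as "tiny" (an assumed value, borrowed from common practice).
    """
    counts = Counter(cat for cat, _, _ in annotations)
    most = max(counts.values())
    least = min(counts.values())
    tiny = sum(1 for _, w, h in annotations if w * h < tiny_area)
    return {
        "counts": dict(counts),
        # Ratio of the most frequent to the least frequent category.
        "imbalance_ratio": most / least,
        # Share of boxes whose area falls below the threshold.
        "tiny_fraction": tiny / len(annotations),
    }


# Toy annotations for illustration only (not drawn from PipeMFL-240K).
toy = [("MTL", 6, 5)] * 8 + [("GWA", 40, 300)] * 3 + [("VAL", 120, 150)]
stats = long_tail_stats(toy)
```

A large `imbalance_ratio` together with a high `tiny_fraction` is the kind of profile the abstract describes as difficult for current detectors.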

Paper Structure (38 sections, 18 figures, 14 tables)

Introduction
Related Works
Scenario Challenges
Our Contributions
Overview of Dataset
Image and Annotation
Statistics of PipeMFL-240K
Experiments and Benchmarks
Experimental Setup
Implementation Setup
Benchmark Results
Data Scaling Study
Discussion
Challenge Analysis
Baseline Comparison
...and 23 more sections

Figures (18)

Figure 1: Feature taxonomy and annotation characteristics of the PipeMFL-240K dataset. The figure illustrates the pipeline topology and the complete label space used in the dataset. The central schematic provides the spatial context of in-line inspection, including launching, intermediate and terminal stations (LCS, ITS, and TMS) and main line (MLN), as well as different pipe types such as spiral welded pipe (SWP) and longitudinal seam welded pipe (LSWP). Surrounding panels present the annotated feature taxonomy. For each category, representative optical images and MFL signal maps are shown together with schematic circumferential signal patterns and occurrence distributions. The colored tags indicate the inspection contexts in which each category appears, highlighting the multi-context nature of the labels.
Figure 2: (A) Overall object counts for each annotated category, showing a highly long-tailed distribution across damage-type features (MTL, GWA, SWA, CRC) and component-type features (BND, SLE, BRN, TEE, CAS, VAL, ESP, FLA). (B) Distribution of per-image object counts for each category, illustrating strong density imbalance: a small subset of images contains a disproportionately large number of objects, particularly for metal loss and weld-related anomalies. (C) Relative density ratios of objects observed on LSWP versus SWP, revealing category-specific structural bias in pipe type. (D) Relative density ratios of objects appearing on main lines versus stations, highlighting pronounced contextual imbalance across inspection locations. Together, these statistics characterize the dataset as highly imbalanced in terms of category frequency, object density, and spatial context, posing significant challenges for learning robust and generalizable MFL inspection models.
Figure 3: Qualitative benchmark results on representative MFL samples. Predicted bounding boxes from different detectors are compared under the same evaluation setting.
Figure 4: Dataset scale study results on YOLOv8-m, YOLO26-m and RF-DETR-Base, illustrating performance variations in mAP50, mAP50:95, precision, recall and F1-score as the training data decreases.
Figure 5: Overview of data collection and acquisition
...and 13 more figures
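
The metrics named in the Figure 4 caption (mAP50, precision, recall, F1-score) all build on box overlap and match counts. As a rough sketch, and not the benchmark's actual evaluation code, the snippet below shows the standard intersection-over-union (IoU) computation, where the "50" in mAP50 conventionally refers to an IoU threshold of 0.5, and the usual precision, recall and F1 formulas. The function names and the `(x1, y1, x2, y2)` box layout are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes yield no intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# A prediction shifted by half a box width overlaps its target with IoU 1/3,
# so it would not count as a match at the 0.5 threshold.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
is_match_at_50 = overlap >= 0.5
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
```

For objects only a few pixels wide, a shift of one or two pixels can push the IoU below the threshold, which is one common reason tiny targets score poorly under IoU-based metrics.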
