Explainable Artifacts for Synthetic Western Blot Source Attribution

João Phillipe Cardenuto, Sara Mandelli, Daniel Moreira, Paolo Bestagini, Edward Delp, Anderson Rocha

TL;DR

The paper addresses the detection of AI-generated Western blot images and the attribution of those images to their source models, with the goal of countering paper mills. It introduces explainable artifacts derived from residual noise, patch-based Fourier analysis (PATCH-FFT-PEAKS), and Fourier-based texture features (FFT-GLCM), combined with selective residual-noise extraction methods, to detect synthetic content and identify generator models. In closed-set, open-set, and one-vs-rest experiments on authentic and synthetic Western blots, the hand-crafted features perform strongly for attribution, and they remain robust in the open-set setting where deep-learning features lag. The work provides practical forensic tools for provenance analysis in biomedical imagery and suggests directions for extending attribution to additional models and image types, supporting scientific integrity and efforts against paper mills.

Abstract

Recent advancements in artificial intelligence have enabled generative models to produce synthetic scientific images that are indistinguishable from pristine ones, posing a challenge even for expert scientists habituated to working with such content. When exploited by organizations known as paper mills, which systematically generate fraudulent articles, these technologies can significantly contribute to the spread of misinformation about ungrounded science, potentially undermining trust in scientific research. While previous studies have explored black-box solutions, such as Convolutional Neural Networks, for identifying synthetic content, only some have addressed the challenge of generalizing across different models and providing insight into the artifacts in synthetic images that inform the detection process. This study aims to identify explainable artifacts generated by state-of-the-art generative models (e.g., Generative Adversarial Networks and Diffusion Models) and leverage them for open-set identification and source attribution (i.e., pointing to the model that created the image).

Paper Structure

This paper contains 18 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Comparison between a CycleGAN (a) and a pristine (b) Western blot image. The CycleGAN image contains checkerboard artifacts visible when zooming into the image. The highlighted Fourier spectrum peaks (see the yellow arrows) also indicate the presence of those artifacts.
  • Figure 2: Solution workflow. Given a questioned Western blot, we leverage residual noise extraction, periodic artifact analysis, and texture feature analysis to perform synthetic image detection and AI-generation model source attribution.
  • Figure 3: Comparison between (a) the Fourier spectrum calculated over the entire noise residual image (FFT-PEAKS strategy) and (b) the average-patch Fourier spectrum (PATCH-FFT-PEAKS strategy). All spectra are centered at spatial frequency $(0, 0)$ and are computed over zero-mean signals.
  • Figure 4: Different features extracted to expose AI generation artifacts. Each visualization is an average over $100$ images. All spectra are centered at spatial frequency $(0, 0)$ and are computed over zero-mean signals. The Fourier spectra on the same row are depicted on the same scale to aid visual comparison. Rows show the explored telltale artifacts; columns show different generative AI models and a pristine source.
  • Figure 5: One-vs-rest source attribution balanced accuracy evaluated over different residual noise extraction methods. Each bar color denotes a different residual noise extraction method, as indicated in the figure legend.
  • ...and 1 more figures
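
The contrast drawn in the Figure 3 caption, between a Fourier spectrum computed over a whole noise residual and one averaged over patches, can be illustrated with a small NumPy sketch. This is not the authors' implementation. The patch size of 64, the Gaussian noise standing in for a residual, the period-2 checkerboard standing in for a generator artifact, and the helper names `log_spectrum`, `patch_average_spectrum`, and `peak_z_score` are all assumptions made for illustration. The only details taken from the captions are that spectra are centered and computed over zero-mean signals.

```python
import numpy as np


def log_spectrum(signal):
    """Centered log-magnitude Fourier spectrum of a zero-mean 2-D signal."""
    zero_mean = signal - signal.mean()
    return np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(zero_mean))))


def patch_average_spectrum(residual, patch=64):
    """Average the centered spectra of non-overlapping patch x patch tiles.

    The patch size is an illustrative assumption, not a value from the paper.
    """
    height, width = residual.shape
    spectra = [
        log_spectrum(residual[r:r + patch, c:c + patch])
        for r in range(0, height - patch + 1, patch)
        for c in range(0, width - patch + 1, patch)
    ]
    return np.mean(spectra, axis=0)


def peak_z_score(spectrum):
    """How far the strongest bin stands above the spectrum's background."""
    return float((spectrum.max() - spectrum.mean()) / spectrum.std())


# Toy data (assumption): Gaussian noise plays the role of a pristine residual.
rng = np.random.default_rng(0)
size = 256
noise = rng.normal(0.0, 1.0, (size, size))

# A weak period-2 checkerboard mimics a periodic upsampling artifact.
rows, cols = np.indices((size, size))
checker = 0.2 * (((rows + cols) % 2) * 2 - 1)
synthetic_like = noise + checker

full_synth = log_spectrum(synthetic_like)             # whole-image spectrum
patch_synth = patch_average_spectrum(synthetic_like)  # patch-averaged spectrum
patch_noise = patch_average_spectrum(noise)           # no artifact present

# The period-2 pattern sits at the Nyquist frequency, which fftshift places
# at index (0, 0) for even-sized inputs.
peak_index = np.unravel_index(np.argmax(patch_synth), patch_synth.shape)

print("peak location in patch-averaged spectrum:", peak_index)
print("peak z-score, whole-image spectrum:", peak_z_score(full_synth))
print("peak z-score, patch-averaged spectrum:", peak_z_score(patch_synth))
print("peak z-score, patch-averaged noise only:", peak_z_score(patch_noise))
```

In this toy setting, averaging the per-patch spectra smooths the noise floor, so the periodic peak stands out more clearly than in the single whole-image spectrum. Real residuals and real generator artifacts will behave differently, so the numbers printed here should not be read as results from the paper.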