PhotoHolmes: a Python library for forgery detection in digital images

Julián O'Flaherty, Rodrigo Paganini, Juan Pablo Sotelo, Julieta Umpiérrez, Marina Gardella, Matías Tailanian, Pablo Musé

TL;DR

PhotoHolmes is an open-source Python library that unifies forgery detection research by providing modular components for datasets, preprocessing, detection methods, postprocessing, metrics, and benchmarking, along with a command-line interface. Its core contributions include a BaseMethod framework with a registry of state-of-the-art detectors, a dataset registry covering multiple benchmarks, and custom weighted metrics for evaluating both localization and detection tasks. The benchmarking engine and CLI enable reproducible comparisons across methods and datasets, while the extensible design invites community contributions. By lowering the barrier to benchmarking methods and to testing suspicious images, the library aims to accelerate methodological development and transparent reporting in image forensics.

Abstract

In this paper, we introduce PhotoHolmes, an open-source Python library designed to make it easy to run and benchmark forgery detection methods on digital images. The library includes implementations of popular and state-of-the-art methods, dataset integration tools, and evaluation metrics. With the Benchmark tool in PhotoHolmes, users can compare methods directly, which facilitates accurate and reproducible comparisons between their own methods and those in the existing literature. Furthermore, PhotoHolmes includes a command-line interface (CLI) for running the methods implemented in the library on any suspicious image, making image forgery detection methods more accessible to the community. The library has been built with extensibility and modularity in mind, which makes adding new methods, datasets and metrics a straightforward process. The source code is available at https://github.com/photoholmes/photoholmes.

Paper Structure

This paper contains 44 sections, 10 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Results of all the methods implemented in the first version of PhotoHolmes on a satirical image, spread on social networks, of Paul McCartney drinking fernet. For Splicebuster [splicebuster_cozzolino], we include the results with both the Gaussian-Uniform and the Gaussian-Gaussian models for the EM. For EXIF [zheng2023exif], we include both the result using mean shift and the one using normalized cuts as the clustering method.
  • Figure 2: Flow diagram of the Benchmark class. The user first chooses a dataset and a method; the dataset is then preprocessed with the preprocessing that corresponds to the chosen method. Outputs can then be visualized, and the chosen metrics are computed and stored as benchmark reports.
  • Figure 3: Output of running photoholmes run catnet <image_path> --overlay using the CLI. The forged image is the one presented in Figure 1.
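
The benchmark flow described in the Figure 2 caption (choose a dataset and a method, apply the method's own preprocessing, predict, compute metrics, store a report) can be illustrated with a small, self-contained sketch. This is not the PhotoHolmes API: ToyMethod, pixel_accuracy and run_benchmark are hypothetical names invented for illustration, the "detector" is a plain threshold, and images are nested Python lists rather than tensors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Image = List[List[float]]  # toy grayscale image as nested lists
Mask = List[List[int]]     # binary mask: 1 = forged pixel, 0 = pristine


@dataclass
class ToyMethod:
    """Hypothetical detector that carries its own preprocessing step."""
    name: str
    preprocess: Callable[[Image], Image]
    predict: Callable[[Image], Mask]


def pixel_accuracy(pred: Mask, truth: Mask) -> float:
    """Fraction of pixels where the predicted mask equals the ground truth."""
    total = sum(len(row) for row in truth)
    hits = sum(p == t for pr, tr in zip(pred, truth) for p, t in zip(pr, tr))
    return hits / total


def run_benchmark(
    dataset: Sequence[Tuple[Image, Mask]],
    method: ToyMethod,
    metrics: Dict[str, Callable[[Mask, Mask], float]],
) -> Dict[str, float]:
    """Run one method over a dataset and return the mean of each metric."""
    sums = {name: 0.0 for name in metrics}
    for image, truth in dataset:
        # The preprocessing applied is the one attached to the chosen method.
        pred = method.predict(method.preprocess(image))
        for name, metric in metrics.items():
            sums[name] += metric(pred, truth)
    return {name: total / len(dataset) for name, total in sums.items()}


# Toy stand-in for a detector: scale to [0, 1], then threshold at 0.5.
threshold = ToyMethod(
    name="threshold",
    preprocess=lambda img: [[px / 255 for px in row] for row in img],
    predict=lambda img: [[int(px > 0.5) for px in row] for row in img],
)

# Two 2x2 "images" with their ground-truth forgery masks.
dataset = [
    ([[0, 255], [255, 0]], [[0, 1], [1, 0]]),
    ([[255, 255], [0, 0]], [[1, 0], [0, 0]]),
]

report = run_benchmark(dataset, threshold, {"pixel_accuracy": pixel_accuracy})
print(report)
```

Attaching the preprocessing to the method, rather than to the dataset, mirrors the caption's point that the preprocessing applied depends on the chosen method; the returned dictionary plays the role of a stored benchmark report.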