Development of a Dual-Input Neural Model for Detecting AI-Generated Imagery

Jonathan Gallagher; William Pugsley

TL;DR

A dual-branch neural network that takes both an image and its Fourier frequency decomposition as inputs achieves 94% accuracy on the CIFAKE dataset. It significantly outperforms classic ML methods and CNNs, and performs comparably to some state-of-the-art architectures.

Abstract

In recent years, images generated by artificial intelligence have become more prevalent and more realistic. Their rise raises ethical questions relating to misinformation, artistic expression, and identity theft, among others. At the crux of many of these questions is the difficulty of distinguishing real images from fake ones. It is therefore important to develop tools that can detect AI-generated images, especially when those images are too realistic for the human eye to identify as fake. This paper proposes a dual-branch neural network architecture that takes both images and their Fourier frequency decomposition as inputs. Both branches use standard CNN-based methods, as described in Stuchi et al. [7], followed by fully-connected layers. The proposed model achieves an accuracy of 94% on the CIFAKE dataset, which significantly outperforms classic ML methods and CNNs and is comparable to some state-of-the-art architectures, such as ResNet.
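To make the dual-input idea concrete, the following is a minimal sketch of how the two inputs could be prepared from a single image. It is not the authors' preprocessing: the function names `fourier_input` and `make_dual_inputs`, the per-channel transform, the centering shift, and the log-magnitude scaling are all illustrative assumptions.

```python
import numpy as np


def fourier_input(image: np.ndarray) -> np.ndarray:
    """Return a centered log-magnitude spectrum for each color channel.

    image: float array of shape (H, W, C), e.g. (32, 32, 3) for CIFAKE.
    Using the log-magnitude (rather than raw complex values) is an
    assumption made for this sketch.
    """
    # 2D DFT over the two spatial axes, computed independently per channel.
    spectrum = np.fft.fft2(image, axes=(0, 1))
    # Move the zero-frequency component to the center of the spectrum.
    spectrum = np.fft.fftshift(spectrum, axes=(0, 1))
    # log1p compresses the large dynamic range of the magnitudes.
    return np.log1p(np.abs(spectrum))


def make_dual_inputs(image: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Build the (spatial, frequency) input pair for a two-branch model."""
    spatial = image.astype(np.float32)
    frequency = fourier_input(image).astype(np.float32)
    return spatial, frequency


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((32, 32, 3))  # stand-in for one CIFAKE image
    spatial, frequency = make_dual_inputs(image)
    print(spatial.shape, frequency.shape)
```

Each array would then feed its own branch, with the branch outputs concatenated before the fully-connected layers.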

Paper Structure (9 sections, 4 equations, 6 figures, 1 table)

Introduction
Model
Frequency Domain Branch
Spatial Domain Branch
Merged Layers
Experiment
Training
Results
Conclusion

Figures (6)

Figure 1: Real and artificially-generated images from the CIFAKE dataset.
Figure 2: An example image of a bird from CIFAR-10 and its corresponding DFT. For demonstration purposes, we take the DFT of the grayscale image.
Figure 3: A $32\times32$ image of a bird split into four $16\times16$ sub-blocks.
Figure 4: The ReLU, LReLU, and PReLU activation functions, with parameter $p=0.05$.
Figure 5: Schematic of the branched network architecture depicting a) the frequency branch and b) the image branch. Layers that serve the same purpose are color-coded.
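The operations named in the captions for Figures 3 and 4 can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the paper's code: the helper names are hypothetical, the leaky-ReLU slope of 0.01 is a common default rather than a value given here, and PReLU's parameter is fixed at the plotted value $p=0.05$ even though it would normally be learned during training.

```python
import numpy as np


def split_into_blocks(image: np.ndarray, block: int = 16) -> list[np.ndarray]:
    """Split an (H, W, ...) image into non-overlapping block x block tiles.

    Tiles are returned in row-major order; a 32x32 image yields four 16x16 tiles.
    """
    height, width = image.shape[:2]
    if height % block or width % block:
        raise ValueError("image sides must be multiples of the block size")
    return [
        image[row:row + block, col:col + block]
        for row in range(0, height, block)
        for col in range(0, width, block)
    ]


def relu(x: np.ndarray) -> np.ndarray:
    """Zero out negative inputs."""
    return np.maximum(x, 0.0)


def leaky_relu(x: np.ndarray, slope: float = 0.01) -> np.ndarray:
    """Scale negative inputs by a small fixed slope (0.01 is an assumption)."""
    return np.where(x > 0, x, slope * x)


def prelu(x: np.ndarray, p: float = 0.05) -> np.ndarray:
    """Same form as leaky ReLU, but p is a trainable parameter in a real network."""
    return np.where(x > 0, x, p * x)


if __name__ == "__main__":
    blocks = split_into_blocks(np.zeros((32, 32, 3)))
    print(len(blocks), blocks[0].shape)
    print(prelu(np.array([-2.0, 3.0])))
```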
