Finding Incompatible Blocks for Reliable JPEG Steganalysis

Etienne Levecque, Jan Butora, Patrick Bas

TL;DR

A refined notion of incompatible JPEG images at quality factor 100 makes it possible to detect steganographic schemes that embed in DCT coefficients, and yields a Likelihood Ratio Test for steganalysis based on the number of compatible blocks per image.

Abstract

This article presents a refined notion of incompatible JPEG images for a quality factor of 100. It can be used to detect the presence of steganographic schemes embedding in DCT coefficients. We show that, within the JPEG pipeline, the combination of the DCT transform with the quantization function can map several distinct blocks in the pixel domain to the same block in the DCT domain. However, not every DCT block can be obtained: we call the unobtainable blocks incompatible. In particular, incompatibility can arise when DCT coefficients are manually modified to embed a message. We show that distinguishing compatible blocks from incompatible ones is an inverse problem that may or may not have a solution, and we propose two different methods to solve it. The first is heuristic-based and quickly finds a solution when one exists. The second is formulated as an Integer Linear Programming problem and can detect incompatible blocks in a reasonable amount of time, but only for a specific DCT transform. We show that the probability for a block to become incompatible depends only on the number of modifications. Finally, using the heuristic algorithm, we derive a Likelihood Ratio Test that depends on the number of compatible blocks per image to perform steganalysis. We simulate the result of this test and show that it outperforms the deep learning detector e-SRNet for every payload between 0.001 and 0.01 bpp while using only 10% of the blocks from 256x256 images. A Selection-Channel-Aware version of the test is even more powerful and outperforms e-SRNet while using only 1% of the blocks.
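To make the idea of a test on the number of compatible blocks concrete, here is a minimal sketch of a binomial log-likelihood ratio. It is not the test derived in the paper: it assumes that blocks are tested independently and that each tested block is found compatible with a fixed probability under each hypothesis. The function names and the probabilities `p_cover` and `p_stego` are hypothetical placeholders.

```python
import math


def log_likelihood_ratio(n_blocks, n_compatible, p_cover, p_stego):
    """Binomial log-likelihood ratio of H1 (stego) against H0 (cover).

    p_cover and p_stego are ASSUMED probabilities, strictly between 0 and 1,
    that a single tested block is found compatible under each hypothesis.
    The binomial coefficient is identical under both hypotheses and cancels.
    """
    n_incompatible = n_blocks - n_compatible
    return (n_compatible * math.log(p_stego / p_cover)
            + n_incompatible * math.log((1.0 - p_stego) / (1.0 - p_cover)))


def decide_stego(n_blocks, n_compatible, p_cover, p_stego, threshold=0.0):
    """Return True when the log-likelihood ratio exceeds the threshold."""
    llr = log_likelihood_ratio(n_blocks, n_compatible, p_cover, p_stego)
    return llr > threshold


# Illustrative call with made-up probabilities: 90 compatible blocks out of 100.
is_stego = decide_stego(100, 90, p_cover=0.99, p_stego=0.9)
```

In practice the threshold would be chosen to meet a target false-alarm rate. In this sketch, the fewer compatible blocks an image contains, the larger the ratio becomes whenever `p_stego` is below `p_cover`.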

Paper Structure

This paper contains 21 sections, 18 equations, 13 figures, 1 table, and 1 algorithm.

Figures (13)

  • Figure 1: Toy 2D illustration of JPEG compression and of the incompatibility attack. Every block of $1\times 2$ pixels (left figure) is mapped to a point in the DCT space (right figure) using the 2-point DCT \ref{eq:dct}. The compression is not surjective, meaning that some blocks in the DCT space (represented by the holes) have no preimage in the pixel domain. Conversely, some distinct pixel blocks are mapped to the same DCT block during compression. Standard JPEG blocks with 64 dimensions exhibit the same property. The attack exploits the fact that embedding creates some incompatible blocks.
  • Figure 2: Framework of our method for finding incompatible blocks of a JPEG compressor. Red contours highlight the materials available to Eve to perform the steganalysis study.
  • Figure 3: JPEG pipeline illustrating the notations. The unknown/known separation indicates what the steganalyzer does and does not know.
  • Figure 4: Toy illustration in 2D. Starting from a random block of $1\times 2$ pixels compressed with the 2-point DCT \ref{eq:dct}, we extract $\mathbf{e}$, the rounding error in the spatial domain, and $\mathbf{e'}$, the rounding error after embedding a message in the DCT coefficients. The dotted contours represent areas where an integer coordinate can be a solution to problem \ref{eq:inverse_prob}. If no integer coordinate falls into this rectangle (for example, the orange rectangle on the left plot), there is no solution to the inverse problem and the block is therefore incompatible. On the left, the quantization values are $[1,1]$ (2D equivalent of QF100), and on the right $[1,2]$ (2D equivalent of a QF lower than 100). The red dot is the compression error $\mathbf{k}$.
  • Figure 5: Empirical probability for a block to become incompatible after a given number of $\pm1$ modifications on DCT coefficients. The dashed cyan curve corresponds to blocks proven incompatible by the ILP approach after 3B iterations; it is a lower bound (LB) of the real value. The plain cyan curve counts both incompatible and unsolved blocks for the same ILP approach; it is an upper bound (UB) of the real value. Finally, the red and yellow curves show the blocks left unsolved after 50k iterations of the heuristic approach for two different compressors, which gives another upper bound. The cyan curves use 100 blocks per point; the red and yellow curves use 1000 blocks per point.
  • ...and 8 more figures
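
The toy setting of Figure 1 can be reproduced by brute force. The sketch below assumes an orthonormal 2-point DCT (the paper's own definition may differ), quantization steps $[1,1]$ and 8-bit pixel values. The names `dct2`, `compress`, `preimages` and `is_compatible` are illustrative, not taken from the paper.

```python
import math
from collections import defaultdict


def dct2(x0, x1):
    """Orthonormal 2-point DCT of a 1x2 pixel block (assumed definition)."""
    return (x0 + x1) / math.sqrt(2), (x0 - x1) / math.sqrt(2)


def compress(x0, x1, q=(1, 1)):
    """Toy JPEG compression: DCT, division by the quantization steps, rounding."""
    c0, c1 = dct2(x0, x1)
    return round(c0 / q[0]), round(c1 / q[1])


# Enumerate every 8-bit pixel pair and record which DCT block it lands on.
preimages = defaultdict(list)
for x0 in range(256):
    for x1 in range(256):
        preimages[compress(x0, x1)].append((x0, x1))


def is_compatible(dct_block):
    """A DCT block is compatible if at least one pixel pair compresses to it."""
    return dct_block in preimages


# Several pixel blocks can share one DCT block (the map is not injective).
collisions = preimages[(1, 1)]

# Embedding example: change the second coefficient of a compressed block by +1.
cover = compress(5, 2)
stego = (cover[0], cover[1] + 1)
stego_is_compatible = is_compatible(stego)
```

Under these assumptions, the pixel blocks `(1, 0)` and `(2, 0)` both compress to `(1, 1)`, while the pixel block `(5, 2)` compresses to `(5, 2)` and a single $+1$ change on its second coefficient gives `(5, 3)`, which no pixel pair can produce. Exhaustive enumeration only works in this 2D toy case; with 64-dimensional blocks the search space is far too large, which is why the paper turns to a heuristic and an Integer Linear Programming formulation.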