SMILENet: Unleashing Extra-Large Capacity Image Steganography via a Synergistic Mosaic InvertibLE Hiding Network

Jun-Jie Huang, Zihan Chen, Tianrui Liu, Wentao Zhao, Xin Deng, Xinwang Liu, Meng Wang, Pier Luigi Dragotti

TL;DR

SMILENet introduces a synergistic mosaic invertible hiding network to push image steganography beyond traditional capacity limits. By combining a non-reversible Secret Information Selection (SIS) module with the reversible ICDM and IMSE modules, it forms a Mosaic Secret Representation that minimizes interference among many secret images while enabling accurate recovery. A new Capacity-Distortion Trade-off metric provides a principled evaluation of capacity versus distortion across varying secret-image counts. The approach hides up to 25 images with high fidelity and demonstrates strong anti-steganalysis performance, marking a significant advance in practical, high-capacity steganography with manageable complexity.

Abstract

Existing image steganography methods face fundamental limitations in hiding capacity (typically $1\sim7$ images) due to severe information interference and an uncoordinated capacity-distortion trade-off. We propose SMILENet, a novel synergistic framework that hides up to 25 images through three key innovations: (i) A synergistic network architecture that coordinates reversible and non-reversible operations to efficiently exploit information redundancy in both secret and cover images. The reversible Invertible Cover-Driven Mosaic (ICDM) module and Invertible Mosaic Secret Embedding (IMSE) module establish cover-guided mosaic transformations and representation embedding with mathematically guaranteed invertibility for distortion-free embedding. The non-reversible Secret Information Selection (SIS) module and Secret Detail Enhancement (SDE) module implement learnable feature modulation for critical information selection and enhancement. (ii) A unified training strategy that coordinates the complementary modules to achieve 3.0x higher capacity than existing methods with superior visual quality. (iii) A new metric that models the Capacity-Distortion Trade-off for evaluating image steganography algorithms; it jointly considers hiding capacity and distortion and provides a unified approach for assessing results with different numbers of secret images. Extensive experiments on DIV2K, Paris StreetView and ImageNet1K show that SMILENet outperforms state-of-the-art methods in terms of hiding capacity, recovery quality, and security against steganalysis methods.
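
As a rough illustration of what a capacity-distortion evaluation involves, the sketch below computes one point on such a curve: distortion as the RMSE between cover and stego images, and capacity as the sum of mutual information between each input secret image and its recovered version, following the description in the Figure 1 caption. The paper's exact formulation (its Eqn. eq:CD_fun) is not reproduced here. The histogram-based mutual information estimator, the bin count, the 8-bit value range, and all function names are assumptions made for this example only.

```python
import numpy as np


def rmse(a, b):
    """Root-mean-square error between two arrays of the same shape."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.sqrt(np.mean((a - b) ** 2)))


def mutual_information(x, y, bins=32, value_range=(0, 255)):
    """Histogram estimate of I(X; Y) in bits (an assumed estimator,
    not necessarily the one used in the paper)."""
    joint, _, _ = np.histogram2d(
        np.ravel(x), np.ravel(y), bins=bins, range=[value_range, value_range]
    )
    p_xy = joint / joint.sum()              # joint distribution
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = p_xy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))


def capacity_distortion_point(cover, stego, secrets, recovered):
    """Return (distortion, capacity) for one hiding experiment.

    distortion: RMSE of the cover/stego pair.
    capacity:   sum of mutual information over secret/recovered pairs.
    """
    distortion = rmse(cover, stego)
    capacity = sum(mutual_information(s, r) for s, r in zip(secrets, recovered))
    return distortion, capacity
```

Evaluating this function for several secret-image counts and plotting capacity against distortion would give a curve of the kind the metric is meant to summarize; a histogram estimator is biased for small images, so absolute values from this sketch should not be compared with the paper's reported numbers.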

Paper Structure

This paper contains 29 sections, 11 equations, 12 figures, 11 tables.

Figures (12)

  • Figure 1: The Capacity-Distortion Trade-off of hiding different numbers of secret images using different steganography methods, including ISN [Lu2021LargecapacityIS], InvMIHNet [chen2024InvMIHNet], and SMILENet. The horizontal axis of the Capacity-Distortion curve is the RMSE of cover/stego image pairs; the vertical axis indicates the hiding capacity of secret images, evaluated as the sum of mutual information between input and recovered secret images (see Eqn. \ref{eq:CD_fun} for more details). The results are evaluated on the DIV2K [agustsson2017ntire] dataset.
  • Figure 2: The hiding capability of the representative image steganography methods in terms of the number of secret images.
  • Figure 3: Framework overview of the proposed SMILENet. During the hiding process, the secret images $\{ \bm{x}_{si} \}_{i=1}^N$ are first processed by an SIS module to select the essential information $\{ \bm{\tilde{x}}_{si} \}_{i=1}^N$ to be concealed. Next, an ICDM module, guided by the cover image, reversibly transforms $\{ \bm{\tilde{x}}_{si} \}_{i=1}^N$ into secret representations and then spatially splices them into a Mosaic Secret Representation (MSR) $\bm{x}_{ms}$. Subsequently, an IMSE module conceals the MSR in the cover image $\bm{x}_{c}$. Its outputs are a stego image $\bm{x}_{\text{stego}}$ and an image-agnostic component $\bm{r}_{\mathcal{H}}$, which is discarded thereafter. During the recovery process, the reverse passes of the IMSE module and the ICDM module recover the MSR $\bm{\hat{x}}_{ms}$ and reconstruct the pre-processed secret images $\{ \bm{\bar{x}}_{si} \}_{i=1}^N$. Finally, a Secret Detail Enhancement (SDE) module refines the recovered images by enhancing their fine details, thereby generating the final secret images $\{ \bm{\hat{x}}_{si} \}_{i=1}^N$.
  • Figure 4: The process of splicing and splitting $N=m \times n$ secret images.
  • Figure 5: Visualization results of SMILENet on hiding and recovering 25 secret images, evaluated on the DIV2K, Paris StreetView and ImageNet datasets. All residual images are magnified 10 times for better visibility.
  • ...and 7 more figures
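
The splicing and splitting of $N = m \times n$ secret images mentioned in the Figure 4 caption can be pictured as tiling equally sized arrays into one grid and cutting the grid back into tiles. The NumPy sketch below shows only this spatial step. In SMILENet the tiles are secret representations produced by the ICDM module rather than raw images, and the row-major tile order and the function names `splice_mosaic` and `split_mosaic` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def splice_mosaic(tiles, m, n):
    """Tile N = m * n arrays of shape (H, W, C) into one (m*H, n*W, C) mosaic.

    Tiles are placed in row-major order: tile k goes to grid cell (k // n, k % n).
    """
    tiles = np.asarray(tiles)
    if tiles.ndim != 4 or tiles.shape[0] != m * n:
        raise ValueError("expected m * n tiles, each of shape (H, W, C)")
    _, h, w, c = tiles.shape
    grid = tiles.reshape(m, n, h, w, c)
    # (m, n, h, w, c) -> (m, h, n, w, c) -> (m*h, n*w, c)
    return grid.transpose(0, 2, 1, 3, 4).reshape(m * h, n * w, c)


def split_mosaic(mosaic, m, n):
    """Inverse of splice_mosaic: cut an (m*H, n*W, C) mosaic into m * n tiles."""
    mosaic = np.asarray(mosaic)
    height, width, c = mosaic.shape
    if height % m or width % n:
        raise ValueError("mosaic size must be divisible by the grid size")
    h, w = height // m, width // n
    # (m*h, n*w, c) -> (m, h, n, w, c) -> (m, n, h, w, c) -> (m*n, h, w, c)
    return mosaic.reshape(m, h, n, w, c).transpose(0, 2, 1, 3, 4).reshape(m * n, h, w, c)
```

For the 25-image setting this corresponds to `m = n = 5`; because splicing is a pure rearrangement, `split_mosaic(splice_mosaic(tiles, 5, 5), 5, 5)` returns the original tiles exactly.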