CLPSTNet: A Progressive Multi-Scale Convolutional Steganography Model Integrating Curriculum Learning
Fengchun Liu, Tong Zhang, Chunying Zhang
TL;DR
CLPSTNet addresses the challenge of invisibility and security in CNN-based image steganography by introducing a curriculum learning–inspired, progressive multi-scale convolutional framework. The architecture combines an Encoder, a Decoder, and a XuNet-based Critic, with a Progressive Multi-scale Convolution Block (PMCB) that fuses Inception-style multi-branch paths and dilated convolutions to capture features across scales. The loss combines embedding, recovery, and steganalysis terms to optimize both image quality and resistance to detection. Experiments on ALASKA2, VOC2012, and ImageNet demonstrate high PSNR, MS-SSIM, and decoding performance at 1–6 bpp, with strong steganalysis resistance. The work shows that progressive scale growth and dense connectivity, guided by curriculum principles, yield superior steganographic quality, suggesting practical improvements for secure data hiding on real-world datasets. Its limitations in non-binary data embedding and recovery accuracy warrant future decoding-focused enhancements.
Abstract
In recent years, a large number of works have introduced Convolutional Neural Networks (CNNs) into image steganography, replacing traditional methods built on hand-crafted features and prior knowledge with methods in which neural networks learn to embed information autonomously. However, due to the inherent complexity of digital images, issues of invisibility and security persist when CNN models are used for information embedding. In this paper, we propose the Curriculum Learning Progressive Steganography Network (CLPSTNet). The network consists of multiple progressive multi-scale convolutional modules that integrate Inception structures and dilated convolutions. Each module contains multiple branching pathways. The pathways start from a small convolutional kernel and dilation rate, which extract basic, local feature information from the feature map, and gradually expand to larger kernels and dilation rates, which perceive feature information over a larger receptive field. This realizes multi-scale feature extraction from shallow to deep and from fine to coarse, allowing the shallow secret-information features to be refined at different fusion stages. The experimental results show that the proposed CLPSTNet achieves high PSNR, SSIM, and decoding accuracy on three large public datasets, ALASKA2, VOC2012, and ImageNet, and that the steganographic images it generates have low steganalysis scores. You can find our code at https://github.com/chaos-boops/CLPSTNet.
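The progression from small to large kernels and dilation rates can be illustrated with simple receptive-field arithmetic. The sketch below is not taken from the CLPSTNet code: the function names and the `BRANCHES` schedule are hypothetical, chosen only to show how a dilated convolution with kernel size k and dilation rate d covers k + (k - 1)(d - 1) pixels per side, and how that extent grows along a progressive schedule.

```python
def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Side length covered by one dilated convolution: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers) -> int:
    """Receptive field of stride-1 convolutions applied one after another.

    `layers` is a sequence of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel_size, dilation in layers:
        # Each stride-1 layer widens the field by (effective kernel - 1).
        rf += effective_kernel(kernel_size, dilation) - 1
    return rf


# Hypothetical schedule (an assumption, not the paper's configuration):
# kernel size and dilation rate grow from fine/local to coarse/wide.
BRANCHES = [(1, 1), (3, 1), (3, 2), (5, 2)]

if __name__ == "__main__":
    for kernel_size, dilation in BRANCHES:
        extent = effective_kernel(kernel_size, dilation)
        print(f"kernel={kernel_size} dilation={dilation} -> covers {extent}x{extent}")
    print("stacked receptive field:", receptive_field(BRANCHES))
```

Under this assumed schedule the per-branch extents are 1, 3, 5, and 9 pixels, which is the fine-to-coarse ordering the abstract describes; the actual kernel sizes and dilation rates used by CLPSTNet should be taken from the paper and its repository.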
