Low-complexity Image and Video Coding Based on an Approximate Discrete Tchebichef Transform
P. A. M. Oliveira, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake, V. A. Coutinho
TL;DR
The paper addresses the high computational cost of the exact discrete Tchebichef transform (DTT) in image and video coding by proposing a low-complexity, multiplierless approximation aimed at real-time and low-power applications. It builds a parametric family of matrices $\mathbf{T}_N(\alpha)$, rounds them to low-complexity forms, and then derives near-orthogonal transforms $\hat{\mathbf{T}}_N(\alpha)$ through a diagonal scaling $\mathbf{S}_N(\alpha)$, selecting the optimal $\alpha$ via a multicriteria optimization over coding gain $C_g$ and transform efficiency $\eta$. The authors provide a closed-form 8-point design $\mathbf{T}_8^*$ with a fast butterfly-based algorithm that requires only $24$ additions and $6$ bit-shifts, and show that the associated inverse can be efficiently approximated. Empirical results in JPEG-like image compression and H.264/AVC video encoding demonstrate coding performance close to that of the exact DTT and superior to that of the earlier approximation in Oliveira2015, while hardware implementations on FPGA and ASIC show substantial reductions in resource usage and in the area-time figure. The work offers a practical route to deploying low-complexity, high-performance transforms in low-power, real-time image/video systems and wireless sensor networks.
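The design pipeline summarized above (exact DTT, rounding to a low-complexity matrix, diagonal scaling, evaluation by $C_g$ and $\eta$) can be illustrated with a minimal Python sketch. Several choices here are assumptions rather than details taken from the paper: the parametrization `round(alpha * Phi)`, the value `alpha = 2.0`, the first-order Markov input model with correlation 0.95, and the classical coding gain formula for orthogonal transforms, which is only a rough indicator for a near-orthogonal matrix. All function names are hypothetical.

```python
import numpy as np


def exact_dtt(n: int) -> np.ndarray:
    """Orthonormal DTT matrix; row k samples the degree-k Tchebichef polynomial."""
    x = np.arange(n) - (n - 1) / 2.0           # centred grid, better conditioned
    vander = np.vander(x, n, increasing=True)  # columns 1, x, x^2, ...
    q, _ = np.linalg.qr(vander)                # orthonormalize the monomials
    q = q * np.sign(q[-1, :])                  # fix signs: positive at the last point
    return q.T


def scaled_approximation(phi: np.ndarray, alpha: float):
    """Assumed parametrization: T(alpha) = round(alpha * Phi), then row scaling."""
    t_low = np.round(alpha * phi)              # small-integer, low-complexity matrix
    norms = np.sqrt(np.sum(t_low * t_low, axis=1))
    if np.any(norms == 0):
        raise ValueError("alpha too small: a row rounded to zero")
    t_hat = t_low / norms[:, None]             # diagonal scaling S = diag(1 / norms)
    return t_low, t_hat


def ar1_covariance(n: int, rho: float = 0.95) -> np.ndarray:
    """Covariance of a unit-variance first-order Markov process (assumed model)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def coding_gain_db(t: np.ndarray, r: np.ndarray) -> float:
    """Classical coding gain: arithmetic over geometric mean of coefficient variances."""
    var = np.diag(t @ r @ t.T)
    return float(10 * np.log10(var.mean() / np.exp(np.log(var).mean())))


def transform_efficiency(t: np.ndarray, r: np.ndarray) -> float:
    """Share (in percent) of absolute covariance on the diagonal after the transform."""
    c = np.abs(t @ r @ t.T)
    return float(100 * np.trace(c) / c.sum())


phi = exact_dtt(8)
r = ar1_covariance(8)
t_low, t_hat = scaled_approximation(phi, alpha=2.0)  # illustrative alpha only

print("exact  Cg [dB], eta [%]:", coding_gain_db(phi, r), transform_efficiency(phi, r))
print("approx Cg [dB], eta [%]:", coding_gain_db(t_hat, r), transform_efficiency(t_hat, r))
```

Sweeping `alpha` over a grid and keeping the values that score well on both figures would mimic the multicriteria selection in spirit, although the paper's actual search procedure and its resulting $\mathbf{T}_8^*$ are not reproduced here.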
Abstract
Linear transforms are highly relevant to data decorrelation applications such as image and video compression. In this context, the discrete Tchebichef transform (DTT) possesses useful coding and decorrelation properties. The DTT kernel does not depend on the input data, and fast algorithms can be developed for real-time applications. However, the DTT fast algorithms presented in the literature possess high computational complexity. In this work, we introduce a new low-complexity approximation for the DTT. The fast algorithm of the proposed transform is multiplication-free and requires a reduced number of additions and bit-shifting operations. Image and video compression simulations in popular standards show good performance of the proposed transform. Regarding hardware resource consumption, the FPGA realization shows a 43.1% reduction in configurable logic blocks and the ASIC place-and-route realization shows a 57.7% reduction in the area-time figure when compared with the 2-D version of the exact DTT.
