Type-II/III DCT/DST algorithms with reduced number of arithmetic operations
Xuancheng Shao, Steven G. Johnson
TL;DR
The paper reduces the arithmetic complexity of the type-II and type-III DCT and DST for power-of-two sizes $N$, obtaining new exact flop counts by pruning redundant operations from a recently improved FFT (a recursively rescaled conjugate-pair split-radix algorithm). The DCT-II is treated as a DFT of length $4N$ whose input is real and even, with zeros at the even-indexed positions; exploiting these symmetries lowers the asymptotic cost from $\sim 2N\log_{2}N$ to $\sim\frac{17}{9}N\log_{2}N$ real additions and multiplications. Rescaling the inputs or outputs saves a further $N$ multiplications, generalizing the well-known $N=8$ technique of Arai et al. The improved counts carry over to the DCT-III and DST-II/III via network transposition and symmetry; normalization and practical implications are also discussed.
Abstract
We present algorithms for the discrete cosine transform (DCT) and discrete sine transform (DST), of types II and III, that achieve a lower count of real multiplications and additions than previously published algorithms, without sacrificing numerical accuracy. Asymptotically, the operation count is reduced from $\sim 2N\log_{2}N$ to $\sim\frac{17}{9}N\log_{2}N$ for a power-of-two transform size $N$. Furthermore, we show that a further $N$ multiplications may be saved by a certain rescaling of the inputs or outputs, generalizing a well-known technique for $N=8$ by Arai et al. These results are derived by considering the DCT to be a special case of a DFT of length $4N$, with certain symmetries, and then pruning redundant operations from a recent improved fast Fourier transform algorithm (based on a recursive rescaling of the conjugate-pair split-radix algorithm). The improved algorithms for DCT-III, DST-II, and DST-III follow immediately from the improved count for the DCT-II.
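The embedding of the DCT-II in a length-$4N$ DFT can be checked numerically. The sketch below is illustrative only: the function names `dct2_direct` and `dct2_via_dft4n` are hypothetical, the unnormalized convention $C_k = \sum_{n=0}^{N-1} x_n \cos[\pi (n+\frac{1}{2}) k / N]$ is assumed, and NumPy's general complex FFT stands in for the pruned algorithm, so the code demonstrates the symmetry but not the reduced operation count.

```python
import numpy as np

def dct2_direct(x):
    """Unnormalized DCT-II by its O(N^2) definition (assumed convention)."""
    x = np.asarray(x, dtype=float)
    N = x.size
    n = np.arange(N)
    k = n[:, None]
    # C_k = sum_n x_n cos(pi (n + 1/2) k / N)
    return (np.cos(np.pi * (n + 0.5) * k / N) * x).sum(axis=1)

def dct2_via_dft4n(x):
    """Same DCT-II, read off a length-4N DFT of a real-even, zero-interleaved input."""
    x = np.asarray(x, dtype=float)
    N = x.size
    y = np.zeros(4 * N)
    y[1:2 * N:2] = x              # y[2n+1] = x[n]; even-indexed entries stay zero
    y[4 * N - 1:2 * N:-2] = x     # y[4N-(2n+1)] = x[n], which makes y even
    Y = np.fft.fft(y)             # generic complex FFT, no pruning
    # Real-even input gives a real spectrum with Y[k] = 2 C_k for k = 0..N-1.
    return Y[:N].real / 2

x = np.random.default_rng(0).standard_normal(8)
print(np.allclose(dct2_direct(x), dct2_via_dft4n(x)))
```

Only $N$ of the $4N$ inputs are independent and only $N$ of the outputs are needed, which is the redundancy that pruning the FFT removes.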
