Hybrid Dealiasing of Complex Convolutions

Noel Murasko, John C. Bowman

TL;DR

This work introduces hybrid dealiasing, a versatile framework for computing dealiased convolutions with arbitrary padding ratios by blending explicit zero padding and implicit handling of zeros. The method leverages a residue-at-a-time decomposition and a recursive multidimensional strategy to optimize memory usage and data locality, enabling efficient FFT-based convolutions for complex, centered, and Hermitian data. Empirical results across 1D, 2D, and 3D cases show substantial speedups over traditional explicit padding and prior implicit approaches, particularly when exploiting smaller, FFT-friendly sizes and parallelism. The framework is implemented in FFTW++, demonstrates practical applicability to pseudospectral PDE solvers, and lays out a path toward real-data specialization and broader toolkit support.

Abstract

Efficient algorithms for computing linear convolutions based on the fast Fourier transform are developed. A hybrid approach is described that combines the conventional practice of explicit dealiasing (explicitly padding the input data with zeros) and implicit dealiasing (mathematically accounting for these zero values). The new approach generalizes implicit dealiasing to arbitrary padding ratios and includes explicit dealiasing as a special case. Unlike existing implementations of implicit dealiasing, hybrid dealiasing tailors its subtransform sizes to the convolution geometry. Multidimensional convolutions are implemented with hybrid dealiasing by decomposing them into lower-dimensional convolutions. Convolutions of complex-valued and Hermitian inputs of equal length are illustrated with pseudocode and implemented in the open-source FFTW++ library. Hybrid dealiasing is shown to outperform explicit dealiasing in one, two, and three dimensions.
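As a point of reference for the explicit dealiasing described above, the following minimal sketch pads two inputs of length $L$ with zeros to length $M \ge 2L-1$, multiplies their transforms, and inverts the result. It is an illustration only, not the FFTW++ implementation: the helper names (`dft`, `idft`, `direct_convolution`, `explicit_convolution`) are hypothetical, a naive $O(N^2)$ DFT stands in for an FFT, and keeping only the first $L$ outputs is an assumption made for this example.

```python
import cmath

def dft(x, sign=-1):
    """Naive O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for j in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft(), including the 1/N normalization."""
    n = len(X)
    return [v / n for v in dft(X, sign=+1)]

def direct_convolution(f, g):
    """Linear convolution h_k = sum_{j=0}^{k} f_j g_{k-j} for k = 0..L-1."""
    L = len(f)
    return [sum(f[j] * g[k - j] for j in range(k + 1)) for k in range(L)]

def explicit_convolution(f, g, M):
    """Zero-pad f and g to length M, then convolve cyclically via the DFT.

    With M >= 2L-1 the cyclic result agrees with the linear convolution;
    with a smaller M the wrapped-around terms (aliases) contaminate it.
    """
    L = len(f)
    fp = list(f) + [0] * (M - L)   # explicit zero padding
    gp = list(g) + [0] * (M - L)
    F, G = dft(fp), dft(gp)
    h = idft([a * b for a, b in zip(F, G)])
    return h[:L]                   # assumption: keep the first L outputs

f = [1, 2, 3]
g = [4, 5, 6]
dealiased = explicit_convolution(f, g, M=2 * len(f) - 1)
aliased = explicit_convolution(f, g, M=len(f))   # no padding at all
```

Here `dealiased` matches `direct_convolution(f, g)` up to rounding, while `aliased` does not, because the unpadded cyclic convolution wraps high-index products back onto low-index outputs. Implicit and hybrid dealiasing aim to obtain the same dealiased result without storing or transforming all of the padding zeros.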

Paper Structure (20 sections, 29 equations, 16 figures, 16 algorithms)

Figures (16)

  • Figure 1: An illustration of hybrid padding for a one-dimensional array with $L=6$ and $M=11$. Choosing $m=4$, we have $p=2$ and $q=3$. We explicitly pad our data of length $L$ to length $pm=8$, and then implicitly pad our data to length $qm=12$.
  • Figure 1: Recursive computation of an $n$-dimensional convolution.
  • Figure 1: In-place 1D complex convolutions of length $L$ with $A=2$ and $B=1$ on 1 thread.
  • Figure 2: Accumulation of residue contributions to a convolution.
  • Figure 2: The reuse of memory to compute the contribution of a single $x$ residue to a 2D binary convolution with two inputs and one output: a 1D padded $y$ FFT is applied to columns of ${\boldsymbol{F}}_{r_x}$ and ${\boldsymbol{G}}_{r_x}$ to produce the two stacked yellow columns that are fed to the multiplication operator, producing one stacked column to be inverse $y$ transformed into a single column (like the red one shown on the left). The upper column is then reused for processing subsequent columns.
  • ...and 11 more figures
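
The hybrid padding parameters quoted in the first caption ($L=6$, $M=11$, $m=4$, giving $p=2$ and $q=3$) can be reproduced with the short sketch below. It assumes $p=\lceil L/m\rceil$ and $q=\lceil M/m\rceil$, which is consistent with the caption but is not stated in this excerpt. The second half checks numerically that a DFT of length $qm$ of the padded data can be assembled one residue $r$ at a time from $q$ transforms of length $m$, using the standard index split $k = q\ell + r$, $j = tm + s$. The function names and the sign convention $\zeta_N = e^{2\pi i/N}$ are assumptions for illustration, and naive sums stand in for FFTs.

```python
import cmath

def hybrid_parameters(L, M, m):
    """Return (p, q) with p = ceil(L/m), q = ceil(M/m) (assumed definitions)."""
    p = -(-L // m)   # integer ceiling division
    q = -(-M // m)
    return p, q

def zeta(n, e):
    """n-th root of unity raised to the power e (assumed sign convention)."""
    return cmath.exp(2j * cmath.pi * e / n)

def padded_dft_direct(f, N):
    """DFT of length N of f padded with zeros, computed directly."""
    return [sum(f[j] * zeta(N, j * k) for j in range(len(f)))
            for k in range(N)]

def padded_dft_by_residue(f, m, p, q):
    """Same transform, built from q length-m transforms, one per residue r."""
    fp = list(f) + [0] * (p * m - len(f))   # explicit padding from L to p*m
    F = [0] * (q * m)
    for r in range(q):
        # Fold the p blocks of length m together and apply the twiddle factors.
        w = [zeta(q * m, r * s)
             * sum(zeta(q, t * r) * fp[t * m + s] for t in range(p))
             for s in range(m)]
        # A length-m transform yields the outputs with index k = q*l + r.
        for l in range(m):
            F[q * l + r] = sum(zeta(m, l * s) * w[s] for s in range(m))
    return F

L, M, m = 6, 11, 4
p, q = hybrid_parameters(L, M, m)        # p = 2, q = 3
f = [1, 2, 3, 4, 5, 6]
F_direct = padded_dft_direct(f, q * m)   # length q*m = 12
F_residue = padded_dft_by_residue(f, m, p, q)
```

Only the first $pm=8$ entries are ever stored; the remaining zeros up to $qm=12$ are accounted for by the twiddle factors rather than by memory, which is the sense in which that part of the padding is implicit.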