A Low-Latency FFT-IFFT Cascade Architecture

Keshab K. Parhi

TL;DR

The paper tackles the latency and area penalties of partly-parallel FFT-IFFT cascades by scheduling the IFFT as soon as possible (ASAP) on the FFT outputs. For a given folded FFT architecture, this yields a unique IFFT folding set that needs no intermediate buffer, which reduces both latency and area. The method extends to interleaved multi-channel processing, achieving full resource utilization without extra reorder or interleaving hardware. Compared to a design with identical folding sets, the single-channel cascade saves about $N/2$ memory elements and $N/4$ clock cycles of latency, and the two-channel interleaved cascade saves about $N/2$ memory elements and $N/2$ clock cycles, with throughput fixed at 2 samples per clock. This approach provides a scalable, hardware-efficient solution for real-time FFT-based processing in communications, imaging, and ML feature extraction.

Abstract

This paper addresses the design of a partly-parallel cascaded FFT-IFFT architecture that does not require any intermediate buffer. Folding can be used to design partly-parallel architectures for FFT and IFFT. While many cascaded FFT-IFFT architectures can be designed using various folding sets for the FFT and the IFFT, for a specified folded FFT architecture, there exists a unique folding set to design the IFFT architecture that does not require an intermediate buffer. Such a folding set is designed by processing the output of the FFT as soon as possible (ASAP) in the folded IFFT. Elimination of the intermediate buffer reduces latency and saves area. The proposed approach is also extended to interleaved processing of multi-channel time-series. The proposed FFT-IFFT cascade architecture saves about N/2 memory elements and N/4 clock cycles of latency compared to a design with identical folding sets. For the 2-interleaved FFT-IFFT cascade, the memory and latency savings are, respectively, N/2 units and N/2 clock cycles, compared to a design with identical folding sets.
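
To make the quoted savings concrete, the following is a minimal sketch that evaluates the approximate figures from the abstract for a given transform size. The function name `cascade_savings`, its parameters, and the power-of-two restriction on `n` are illustrative assumptions, not part of the paper; the results are the "about N/2" and "about N/4" estimates relative to a design with identical folding sets, not exact register counts.

```python
def cascade_savings(n: int, interleaved: bool = False) -> dict:
    """Approximate savings of the bufferless FFT-IFFT cascade.

    Hypothetical helper: it only evaluates the rounded figures quoted in
    the abstract (relative to a cascade with identical folding sets).
    Assumes n is a power of two and at least 4.
    """
    if n < 4 or n & (n - 1):
        raise ValueError("n is assumed to be a power of two, at least 4")
    return {
        # About N/2 memory elements are saved in both cases.
        "memory_elements": n // 2,
        # About N/4 cycles (single channel) or N/2 cycles (2-interleaved).
        "latency_cycles": n // 2 if interleaved else n // 4,
    }


if __name__ == "__main__":
    # 16 points matches the size used in the paper's figures.
    print(cascade_savings(16))
    print(cascade_savings(16, interleaved=True))
```

For the 16-point size used in the figures, this gives roughly 8 memory elements and 4 clock cycles for the single-channel cascade, and roughly 8 memory elements and 8 clock cycles for the 2-interleaved cascade.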

Paper Structure (8 sections, 6 equations, 5 figures, 1 table)

Figures (5)

  • Figure 1: Cascaded FFT-IFFT architecture with and without intermediate buffer.
  • Figure 2: Data-flow graphs for FFT and IFFT with scheduling. Clock cycles are marked in red.
  • Figure 3: Cascaded 16-point FFT-IFFT architectures. The top-middle cascade represents a traditional design. The top-bottom cascade represents the proposed design.
  • Figure 4: Data-flow graphs and schedules for interleaved FFT and IFFT.
  • Figure 5: Cascaded interleaved 16-point FFT-IFFT architectures. The top-middle cascade represents a traditional design. The top-bottom cascade represents the proposed design.