CNN-Based Equalization for Communications: Achieving Gigabit Throughput with a Flexible FPGA Hardware Architecture

Jonas Ney; Christoph Füllner; Vincent Lauinger; Laurent Schmalen; Sebastian Randel; Norbert Wehn

TL;DR

This work addresses the challenge of combining very high throughput with flexibility in ANN-based equalizers for optical communications. It presents a CNN-based equalizer implemented on an FPGA using cross-layer optimizations, a design-space exploration framework, and a quantization-aware training procedure. For the optical fiber channel, the equalizer achieves a BER around $4\times$ lower than that of a conventional equalizer, and the FPGA implementation reaches a throughput above $40\ \mathrm{GBd}$, outperforming a high-performance GPU by three orders of magnitude for a similar batch size. The contributions include a detailed topology design, a framework that optimizes the sequence length per instance to trade throughput against latency, and a high-throughput hardware architecture with an adjustable degree of parallelism (DOP) and stream partitioning. The same architecture is also demonstrated on a magnetic recording channel, which shows its applicability to low-cost or low-power settings. Overall, the results indicate that CNN-based equalization on flexible FPGA hardware can meet the throughput and flexibility requirements of beyond-5G and 6G systems while offering substantial advantages over conventional equalizers and general-purpose accelerators.

Abstract

To satisfy the growing throughput demand of data-intensive applications, the performance of optical communication systems has increased dramatically in recent years. With higher throughput, more advanced equalizers are crucial to compensate for impairments caused by inter-symbol interference (ISI). Recent research shows that artificial neural network (ANN)-based equalizers are promising candidates to replace traditional algorithms for high-throughput communications. However, not only throughput but also flexibility is a main objective of beyond-5G and 6G communication systems. Field-programmable gate arrays (FPGAs) are a platform that can satisfy the strict throughput and flexibility requirements of modern communication systems. Thus, in this work, we present a high-performance FPGA implementation of an ANN-based equalizer that meets the throughput requirements of modern optical communication systems. Further, our architecture is highly flexible since it includes a variable degree of parallelism (DOP); it can therefore also be applied to low-cost or low-power applications, which is demonstrated for a magnetic recording channel. The implementation is based on a cross-layer design approach featuring optimizations from the algorithm down to the hardware architecture, including a detailed quantization analysis. Moreover, we present a framework to reduce the latency of the ANN-based equalizer under given throughput constraints. As a result, the bit error ratio (BER) of our equalizer for the optical fiber channel is around four times lower than that of a conventional one, while the corresponding FPGA implementation achieves a throughput of more than 40 GBd, outperforming a high-performance graphics processing unit (GPU) by three orders of magnitude for a similar batch size.

Paper Structure (28 sections, 17 equations, 14 figures, 1 table)

Introduction
Investigated Communication Channels
Fiber-Optical Channel
Magnetic Recording Channel
CNN Design Space Exploration
CNN Topology Template
Linear Equalizer
Volterra Equalizer
Design Space Exploration Framework
Results of Design Space Exploration
Performance for the Magnetic Recording Channel
Quantization
Hardware Architecture
High-Throughput Architecture
Flexibility of Hardware Architecture
...and 13 more sections

Figures (14)

Figure 1: Topology template of the equalizer CNN. The feature map dimensions are given next to the arrows, where the first dimension corresponds to the number of channels and the second one to the width.
Figure 2: Results of the design space exploration for the different equalization approaches. The maximum $\mathrm{MAC}_\mathrm{sym}$ that still achieves a throughput of 40 GBd at a clock frequency of 200 MHz is marked by the vertical red line. The Pareto-optimal models of the CNN-based equalizer, the Volterra kernel, and the FIR filter are connected by the dotted, solid, and dashed lines, respectively.
Figure 3: Final topology of the CNN-based equalizer with three layers, where $K$ corresponds to the kernel size, $S$ to the stride, $P$ to the padding, and $V_p$ to the symbols calculated in parallel.
Figure 4: Complexity and communication performance of the selected model compared to conventional FIR filters and Volterra kernels for the magnetic recording channel.
Figure 5: Course of the average activation bit width during the three phases of quantized training for different QLF.
...and 9 more figures
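
The Figure 2 caption ties a complexity limit to a 40 GBd throughput target at a 200 MHz clock. The arithmetic behind such a limit can be sketched as follows. This is a rough illustration, not the paper's own derivation: it assumes $\mathrm{MAC}_\mathrm{sym}$ denotes multiply-accumulate operations per equalized symbol, and the number of parallel MAC units (`mac_units`) is a hypothetical value chosen only to make the example concrete.

```python
import math


def symbols_per_cycle(symbol_rate_bd: float, clock_hz: float) -> int:
    """Minimum number of symbols that must be produced per clock cycle
    to sustain the given symbol rate."""
    return math.ceil(symbol_rate_bd / clock_hz)


def max_mac_per_symbol(mac_units: int, symbol_rate_bd: float, clock_hz: float) -> float:
    """MAC operations available per symbol, assuming each of the
    `mac_units` performs one MAC per clock cycle (an assumption)."""
    return mac_units * clock_hz / symbol_rate_bd


# Values taken from the Figure 2 caption.
SYMBOL_RATE_BD = 40e9   # 40 GBd
CLOCK_HZ = 200e6        # 200 MHz

# Hypothetical resource count, NOT taken from the paper.
MAC_UNITS = 10_000

parallel_symbols = symbols_per_cycle(SYMBOL_RATE_BD, CLOCK_HZ)
mac_budget = max_mac_per_symbol(MAC_UNITS, SYMBOL_RATE_BD, CLOCK_HZ)

print(f"symbols per clock cycle: {parallel_symbols}")
print(f"MAC budget per symbol:   {mac_budget:.1f}")
```

Under these assumptions, the design has to emit 200 symbols per clock cycle, so any per-symbol complexity budget scales directly with the amount of parallel arithmetic the device provides.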
