Beyond Terabit/s Integrated Neuromorphic Photonic Processor for DSP-Free Optical Interconnects

Benshan Wang, Qiarong Xiao, Tengji Xu, Li Fan, Shaojie Liu, Jianji Dong, Junwen Zhang, Chaoran Huang

TL;DR

The paper introduces an integrated neuromorphic optical signal processor (OSP) that delivers DSP-free, all-optical, real-time processing for data-center interconnects. Built on silicon photonics, the OSP uses a deep time-delay reservoir with three cascaded nodes and a photonic readout to compensate linear and nonlinear impairments by learning the inverse channel response, enabling high-speed, low-latency communication. Experimentally, it achieves 100 Gbaud PAM4 per lane and 1.6 Tbit/s over 5 km of fiber in the C-band, with BERs below the HD-FEC threshold, and demonstrates scalable 1.6 Tbit/s WDM operation across eight channels. Compared to state-of-the-art DSPs, it reduces latency (≈57 ps) by about four orders of magnitude and energy per bit (≈0.54 fJ/bit) by about three. These results indicate a highly scalable, energy-efficient optical processing paradigm capable of meeting the demands of next-generation AI infrastructure across multi-datacenter networks.
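
The headline figures are related by simple arithmetic: PAM4 carries 2 bits per symbol, so 100 Gbaud gives 200 Gbit/s per lane, and eight WDM lanes give 1.6 Tbit/s. A minimal sketch of that calculation follows; the function name `line_rate_bps` is illustrative and not from the paper, and the result is the raw line rate before any FEC or framing overhead.

```python
import math


def line_rate_bps(baud: float, levels: int, lanes: int) -> float:
    """Raw aggregate line rate: symbols/s x bits/symbol x number of lanes.

    `levels` is the number of amplitude levels (2 for OOK, 4 for PAM4).
    FEC and framing overhead are deliberately ignored (assumption).
    """
    bits_per_symbol = math.log2(levels)
    return baud * bits_per_symbol * lanes


# One 100 Gbaud PAM4 lane, then eight WDM lanes.
per_lane = line_rate_bps(100e9, levels=4, lanes=1)
aggregate = line_rate_bps(100e9, levels=4, lanes=8)
print(per_lane / 1e9, "Gbit/s per lane")
print(aggregate / 1e12, "Tbit/s aggregate")
```

The same helper gives 100 Gbit/s for a single 100 Gbaud OOK lane, the other modulation format shown in the figures.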

Abstract

The rapid expansion of generative AI drives unprecedented demands for high-performance computing. Training large-scale AI models now requires vast interconnected GPU clusters across multiple data centers. Multi-scale AI training and inference demand uniform, ultra-low latency, and energy-efficient links to enable massive GPUs to function as a single cohesive unit. However, traditional electrical and optical interconnects, relying on conventional digital signal processors (DSPs) for signal distortion compensation, increasingly fail to meet these stringent requirements. To overcome these limitations, we present an integrated neuromorphic optical signal processor (OSP) that leverages deep reservoir computing and achieves DSP-free, all-optical, real-time processing. Experimentally, our OSP achieves a 100 Gbaud PAM4 per lane, 1.6 Tbit/s data center interconnect over a 5 km optical fiber in the C-band (equivalent to over 80 km in the O-band), far exceeding the reach of state-of-the-art DSP solutions, which are fundamentally constrained by chromatic dispersion in IMDD systems. Simultaneously, it reduces processing latency by four orders of magnitude and energy consumption by three orders of magnitude. Unlike DSPs, which introduce increased latency at high data rates, our OSP maintains consistent, ultra-low latency regardless of data rate scaling, making it ideal for future optical interconnects. Moreover, the OSP retains full optical field information for better impairment compensation and adapts to various modulation formats, data rates, and wavelengths. Fabricated using a mature silicon photonic process, the OSP can be monolithically integrated with silicon photonic transceivers, enhancing the compactness and reliability of all-optical interconnects. This research provides a highly scalable, energy-efficient, and high-speed solution, paving the way for next-generation AI infrastructure.
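
To make the chromatic-dispersion limit concrete, the sketch below estimates where dispersion-induced power-fading nulls fall for a chirp-free IMDD link. It uses the standard small-signal model, in which the detected response follows $\cos(\pi D \lambda^2 L f^2 / c)$ and nulls occur at $f_n = \sqrt{(2n+1)\,c / (2 D L \lambda^2)}$. The dispersion value of 17 ps/(nm·km) is a typical figure for standard single-mode fiber near 1550 nm and is an assumption here, not a number from the paper; the function name and the 50 GHz cutoff (the Nyquist frequency of a 100 Gbaud signal) are also illustrative.

```python
import math
from typing import List

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def fading_nulls_hz(
    length_km: float,
    dispersion_ps_nm_km: float = 17.0,  # assumed typical SMF value at 1550 nm
    wavelength_nm: float = 1550.0,
    f_max_hz: float = 50e9,             # Nyquist frequency of a 100 Gbaud signal
) -> List[float]:
    """Frequencies of power-fading nulls below f_max_hz (small-signal, chirp-free model)."""
    d = dispersion_ps_nm_km * 1e-6   # ps/(nm*km) -> s/m^2
    length_m = length_km * 1e3
    lam = wavelength_nm * 1e-9
    base = C_M_PER_S / (2.0 * d * length_m * lam ** 2)  # f_0 squared
    nulls: List[float] = []
    n = 0
    while True:
        f = math.sqrt((2 * n + 1) * base)
        if f > f_max_hz:
            break
        nulls.append(f)
        n += 1
    return nulls


# 5 km of fiber in the C-band, as in the experiment described above.
for f in fading_nulls_hz(5.0):
    print(f"null near {f / 1e9:.1f} GHz")
```

Under these assumptions the first two nulls land at roughly 27 GHz and 47 GHz, both inside the 50 GHz signal bandwidth. A receiver that sees only the detected intensity cannot simply undo these nulls with a linear filter, which is why the abstract stresses processing the optical field before detection.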

Paper Structure

This paper contains 11 sections, 6 figures.

Figures (6)

  • Figure 1: OSP architecture and implementation. (a) Multi-data center interconnection for AI/ML workloads. Optical interconnects (OIC) enabled by OSP achieve ultra-low and uniform latency, even as the scale of interconnected data centers grows. OSP overcomes the latency bottlenecks traditionally caused by DSPs, enabling accelerated AI training across increasingly large and multi-scale GPU clusters. (b) Optical interconnects using IMDD schemes with DSP-based and OSP-based receivers. One single OSP can support $M$ wavelength channels, whereas $M$ DSPs are required to handle the same number of channels. The shown photodiode (PD) represents the receiver front-end, which also includes the trans-impedance amplifier (TIA). Comp.: compensation. (c) Optical module integrating photonic components (OSP, PDs, electro-optic (EO) modulators, and laser diodes) with electrical components (ASIC, modulator driver, TIA, and control circuits). Optical transceiver and OSP are integrated on a single silicon photonic chip. This compact design enables high-performance optical signal processing in a scalable form factor. (d) Time-delay deep reservoir scheme, consisting of $N$ reservoir nodes with different delayed self-feedback lengths $\tau_l$ and feedback strengths $\kappa_l$. Dynamical states of different reservoirs are combined using $W_c$ and then processed through a readout layer $W_{out}$ to generate the final output. (e) Proposed OSP architecture comprising three photonic reservoirs and eight readout channels. (f) The packaged OSP chip with electrical wire bonds and an optical fiber array. (g) Performance comparison of different reservoir numbers to maintain the hard-decision forward error correction (HD-FEC) level for 100 Gbaud PAM4 signals over 5 km of fiber transmission in the C-band.
  • Figure 2: Comparison with published works. (a) Comparison of processing data rate per lane between our work and other optical processors [argyris18SR, argyris19Access, sackesyn21OE, Wang22JSTQE, staffoli23PR, shen23Optica, gooskens23SR, staffoli24JLT, liu24NC, liu24OFC, sozos24JLT]. (b) Comparison of beyond-200 Gbps/$\lambda$ C-band standard IMDD fiber transmission using a single PD between our OSP and other DSP methods [chan22JLT, chan22OL, cheng23OL, han23OL, sang22JLT, verbist19JLT, mardoyan17JLT].
  • Figure 3: High-speed signal processing and programmability. (a) Schematic of the experimental setup and training process. TLS: tunable laser source; PC: polarization controller; EDFA: erbium-doped fiber amplifier; VOA: variable optical attenuator; AWG: arbitrary waveform generator; EA: electrical amplifier; SMUs: source meter units; BPF: band-pass filter; PD: photodiode; RTO: real-time oscilloscope. (b) The MSE converges as the OSP is trained using $2^{15}$ symbols of OOK and PAM4 signals at 80 Gbaud and 100 Gbaud. (c) Optimized currents of the OSP's on-chip programmable elements for different symbol rates and modulation formats. (d)(e) Measured BER as a function of symbol rate for OOK and PAM4 signal transmission. Horizontal dashed lines indicate the thresholds for soft-decision and hard-decision forward error correction (FEC). The associated eye diagrams are shown for the circled points. (f)(g) Measured BER as a function of wavelength for 100 Gbaud OOK and PAM4.
  • Figure 4: Linear and nonlinear compensation. (a) Flowchart of the OSP optimization by learning the inverse complex-valued transfer function of the optical fiber and transceiver. (b) Spectral responses after detection under different conditions. The dispersion-induced power fading effect, due to 5 km C-band transmission, is observed as multiple spectral nulls. With proper optimization of our OSP, the power fading caused by fiber chromatic dispersion is almost eliminated. (c) $Q$ factor versus launch power for 100 Gbaud PAM4 signal compensation using our OSP and other DSP methods.
  • Figure 5: 1.6T WDM DCI implementation. (a) OSP-based 1.6T silicon photonic transceiver. (b) Measured BERs of 100 Gbaud PAM4 signals on 8 WDM channels under BtB transmission, 5 km SMF transmission without any compensation, with OSP-only compensation, and with DSP-only compensation. The OSP is optimized at 1550 nm (Ch 4). (c) Measured BERs of 100 Gbaud PAM4 signals on 8 WDM channels under 5 km SMF transmission with hybrid OSP/DSP compensation and with DSP-only compensation. (d) Required number of DSP taps for OSP and DSP compensation to achieve the corresponding BERs in (c). (e) BER versus the number of DSP taps for OSP and DSP compensation at 1562.2 nm (Ch 8).
  • ...and 1 more figure
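
The time-delay deep reservoir scheme in the Figure 1(d) caption can be illustrated with a toy discrete-time model. This is a sketch under stated assumptions, not the authors' implementation: each node is modeled as a linear delayed self-feedback loop $x_l[n] = u_l[n] + \kappa_l\, x_l[n - \tau_l]$ acting on the complex optical field, the nodes are cascaded, the node states are mixed by weights $W_c$, a tapped readout $W_{out}$ is applied, and photodetection is modeled as a square-law $|\cdot|^2$. The function names, the integer-sample delays, and the tapped-delay form of the readout are all hypothetical.

```python
from typing import List, Sequence


def delay_node(u: Sequence[complex], delay: int, kappa: complex) -> List[complex]:
    """One reservoir node (assumed model): x[n] = u[n] + kappa * x[n - delay]."""
    x: List[complex] = []
    for n, u_n in enumerate(u):
        feedback = x[n - delay] if n >= delay else 0j
        x.append(u_n + kappa * feedback)
    return x


def deep_reservoir(
    u: Sequence[complex],
    delays: Sequence[int],       # tau_l, in samples (assumption: integer delays)
    kappas: Sequence[complex],   # kappa_l, feedback strengths
    w_c: Sequence[complex],      # W_c, one mixing weight per node
    w_out: Sequence[complex],    # W_out, modeled here as readout taps
) -> List[float]:
    """Cascade the nodes, mix their states, apply the readout, then square-law detect."""
    states: List[List[complex]] = []
    signal: List[complex] = list(u)
    for delay, kappa in zip(delays, kappas):
        signal = delay_node(signal, delay, kappa)  # each node feeds the next
        states.append(signal)
    combined = [sum(w * s[n] for w, s in zip(w_c, states)) for n in range(len(u))]
    detected: List[float] = []
    for n in range(len(combined)):
        field = sum(w_out[k] * combined[n - k] for k in range(len(w_out)) if n - k >= 0)
        detected.append(abs(field) ** 2)  # photodiode responds to optical intensity
    return detected


# Impulse response of three cascaded nodes with illustrative parameters.
impulse = [1, 0, 0, 0, 0, 0, 0, 0]
print(deep_reservoir(impulse, delays=[1, 2, 3], kappas=[0.5, 0.5, 0.5],
                     w_c=[1, 1, 1], w_out=[1, -0.5]))
```

In the actual device the weights would be set by training against a known symbol sequence (the Figure 3(b) caption reports the MSE converging over $2^{15}$ training symbols); the sketch only shows the forward pass with hand-picked values.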