Automated Design and Optimization of Distributed Filtering Circuits via Reinforcement Learning

Peng Gao; Tao Yu; Fei Wang; Ru-Yue Yuan

TL;DR

The paper tackles the design of distributed filtering circuits (DFCs), whose performance depends on many interdependent geometric parameters. It proposes an end-to-end reinforcement learning framework (RLDFCDO) in which a PPO-based agent iteratively optimizes DFC layouts, observing a 2D continuous state space (nine parameters per resonator) and acting through a discrete action space. A graph-attention-based feedforward predictor quickly estimates $s_{21}$ so that each candidate layout can be evaluated without a full simulation. Learning is guided by a multi-objective reward that combines passband IoU, insertion loss, and central-frequency deviation, and an invalid-action penalty further stabilizes training. In the reported experiments, the approach outperforms CircuitGNN and HFSS on single- and multi-bandpass tasks, achieving higher passband IoU (e.g., ~99.5%) and lower insertion loss (~1.44 dB). It also runs substantially faster (about 60 iterations/s on CPU and about 200 iterations/s on GPU) and uses less memory, which supports practical deployment and rapid adaptation to changing design requirements.
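To make the reward ingredients concrete, the following is a minimal sketch of how passband IoU, insertion loss, and central-frequency deviation could be combined for a single passband, with a fixed penalty for invalid actions. It is not the paper's reward: the function names, the -3 dB passband threshold, the weights `w_iou`, `w_il`, `w_cf`, the penalty value, and the linear combination are all assumptions chosen for illustration.

```python
from typing import Sequence, Tuple


def passband_iou(freqs: Sequence[float], s21_db: Sequence[float],
                 target_band: Tuple[float, float],
                 threshold_db: float = -3.0) -> float:
    """IoU between the samples that pass the threshold and the target band.

    The -3 dB threshold is an assumption, not a value taken from the paper.
    """
    predicted = [s >= threshold_db for s in s21_db]
    target = [target_band[0] <= f <= target_band[1] for f in freqs]
    inter = sum(p and t for p, t in zip(predicted, target))
    union = sum(p or t for p, t in zip(predicted, target))
    return inter / union if union else 0.0


def reward(freqs: Sequence[float], s21_db: Sequence[float],
           target_band: Tuple[float, float], valid_action: bool = True,
           w_iou: float = 1.0, w_il: float = 0.1, w_cf: float = 1.0,
           invalid_penalty: float = -1.0,
           threshold_db: float = -3.0) -> float:
    """Hypothetical multi-objective reward for one passband."""
    if not valid_action:
        # Invalid actions (e.g. a parameter leaving its range) get a fixed penalty.
        return invalid_penalty
    passed = [(f, s) for f, s in zip(freqs, s21_db) if s >= threshold_db]
    if not passed:
        return 0.0  # no passband found at all
    iou = passband_iou(freqs, s21_db, target_band, threshold_db)
    insertion_loss = -max(s for _, s in passed)       # dB, non-negative for s <= 0
    f_center = (passed[0][0] + passed[-1][0]) / 2.0   # crude single-band estimate
    f_target = (target_band[0] + target_band[1]) / 2.0
    cf_deviation = abs(f_center - f_target) / f_target
    return w_iou * iou - w_il * insertion_loss - w_cf * cf_deviation


if __name__ == "__main__":
    freqs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]            # illustrative sample grid
    s21 = [-20.0, -1.0, -1.0, -1.0, -20.0, -20.0]     # illustrative response in dB
    print(reward(freqs, s21, target_band=(2.0, 4.0)))
```

In this sketch a layout whose passband matches the target exactly is penalized only by its insertion loss, while a mismatch lowers the IoU term and, if the band is shifted, raises the frequency-deviation term.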

Abstract

Designing distributed filter circuits (DFCs) is complex and time-consuming, because it requires setting and optimizing multiple hyperparameters. Traditional optimization methods, such as using the commercial finite element solver HFSS (High-Frequency Structure Simulator) to enumerate all parameter combinations with fixed steps and then simulate each combination, are time-consuming and labor-intensive. They also rely heavily on the expertise and experience of electronics engineers, which makes it difficult to adapt to rapidly changing design requirements. Additionally, these commercial tools struggle to make precise adjustments when the response is sensitive to small numerical changes in the parameters, which limits optimization effectiveness. This study proposes a novel end-to-end automated method for DFC design. The proposed method uses reinforcement learning (RL) algorithms and removes the dependence on engineers' design experience, which significantly reduces the subjectivity and constraints associated with circuit design. The experimental findings demonstrate clear improvements in design efficiency and quality compared with traditional engineer-driven methods. Furthermore, the proposed method achieves superior performance when designing complex or rapidly evolving DFCs, highlighting the substantial potential of RL in circuit design automation. In particular, compared with the existing automated DFC design method CircuitGNN, our method achieves an average performance improvement of 8.72%. Additionally, the execution efficiency of our method is 2000 times higher than that of CircuitGNN on the CPU and 241 times higher on the GPU.
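The cost of fixed-step enumeration can be illustrated with a short count of how many simulations an exhaustive sweep would need. This is only a sketch: the parameter ranges and step size below are hypothetical, and only the figure of nine parameters per resonator comes from the TL;DR above.

```python
import math
from typing import Iterable, Tuple


def grid_points(lo: float, hi: float, step: float) -> int:
    """Number of values visited when sweeping [lo, hi] with a fixed step."""
    # The small epsilon guards against floating-point round-off in the division.
    return int(math.floor((hi - lo) / step + 1e-9)) + 1


def sweep_size(ranges: Iterable[Tuple[float, float, float]]) -> int:
    """Total number of combinations for an exhaustive sweep over all ranges."""
    total = 1
    for lo, hi, step in ranges:
        total *= grid_points(lo, hi, step)
    return total


if __name__ == "__main__":
    # Hypothetical: every parameter swept over [0, 1] in steps of 0.1 (11 values).
    per_resonator = [(0.0, 1.0, 0.1)] * 9   # nine parameters per resonator
    for n_resonators in (1, 2, 4):
        print(n_resonators, sweep_size(per_resonator * n_resonators))
```

Because the count is a product over all parameters, it grows exponentially with the number of resonators, which is the practical reason exhaustive enumeration becomes infeasible for larger circuits.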

Paper Structure (30 sections, 3 equations, 9 figures, 7 tables, 1 algorithm)


Introduction
Related Works
Learning-Based Circuit Design and Optimization
RL for Sequential Decision-Making Tasks
Proposed Method
State Space
Action Space
Reward Design
Feedforward Network Design
Policy and Value Networks Design
Environment Design
Environment Creation
Environment Control
Performance Evaluation
End-to-End DFC Design Automation
...and 15 more sections

Figures (9)

Figure 1: Examples of DFC templates based on square resonators. In the figure, $d$ indicates the relative distance between two resonators, and $u$ represents the counterclockwise angle from the horizontal line to the line connecting the opening position and the center point.
Figure 2: Flowchart of end-to-end design and optimization of DFCs.
Figure 3: Overall architecture of our proposed method. The black solid lines represent the data flow, and the three different colored lines (blue, purple, and green) in the GAT represent three independent attention computations, i.e., three-head attention. The gray dashed line indicates the concatenation of the embeddings learned from the input and output resonators of the DFC, which then serves as the input for the next stage of amplitude and phase prediction.
Figure 4: Ablation study results. The central black lines in the boxes indicate the mean values. Average reward gains for 31 types of selectable action spaces across four (a) single- and (b) dual-frequency passbands. Average reward gains for the two invalid action strategies across the four (c) single- and (d) dual-frequency passbands. Average reward gains for four-step reward power variations across four (e) single- and (f) dual-frequency passbands.
Figure 5: Comparison of CDF of our method with CircuitGNN and HFSS.
...and 4 more figures
