ARCS: Autoregressive Circuit Synthesis with Topology-Aware Graph Attention and Spec Conditioning

Tushar Dhananjay Pathak

Abstract

This paper presents ARCS (Autoregressive Circuit Synthesis), a system for amortized analog circuit generation that produces complete, SPICE-simulatable designs (topology and component values) in milliseconds rather than the minutes required by search-based methods. A hybrid pipeline combining two learned generators (a graph VAE and a flow-matching model) with SPICE-based ranking achieves 99.9% simulation validity (reward 6.43/8.0) across 32 topologies using only 8 SPICE evaluations, 40x fewer than genetic algorithms. For single-model inference, a topology-aware Graph Transformer with Best-of-3 candidate selection reaches 85% simulation validity in 97ms, over 600x faster than random search. The key technical contribution adapts Group Relative Policy Optimization (GRPO) to multi-topology circuit reinforcement learning, resolving a critical failure mode of REINFORCE (cross-topology reward distribution mismatch) through per-topology advantage normalization. This improves simulation validity by +9.6 percentage points over REINFORCE in only 500 RL steps (10x fewer). Grammar-constrained decoding additionally guarantees 100% structural validity by construction via topology-aware token masking.
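The abstract's key contribution, per-topology advantage normalization, can be sketched in a few lines: each sampled circuit's reward is standardized against the statistics of its own topology group, so topologies with very different reward scales contribute comparable advantages. This is a minimal illustrative sketch; the function and argument names are assumptions, not the paper's implementation.

```python
import numpy as np

def per_topology_advantages(rewards, topologies, eps=1e-8):
    """Standardize each reward against its own topology's group mean/std,
    avoiding the cross-topology reward-distribution mismatch that breaks
    a single global baseline (illustrative names, not the paper's code)."""
    rewards = np.asarray(rewards, dtype=float)
    advantages = np.empty_like(rewards)
    for topo in set(topologies):
        idx = [i for i, t in enumerate(topologies) if t == topo]
        group = rewards[idx]
        advantages[idx] = (group - group.mean()) / (group.std() + eps)
    return advantages

# Two topologies with very different reward scales still yield
# comparable, zero-mean advantages within each group:
adv = per_topology_advantages(
    rewards=[6.0, 7.0, 1.0, 2.0],
    topologies=["buck", "buck", "boost", "boost"],
)
```

With a single global baseline, the low-reward topology's samples would all receive negative advantages regardless of within-topology quality; grouping by topology removes that bias.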


Paper Structure

This paper contains 31 sections, 11 equations, 6 figures, 11 tables.

Figures (6)

  • Figure 1: ARCS system overview. Top: Training pipeline. SPICE templates generate data, supervised pre-training learns the sequence distribution, and GRPO with SPICE-in-the-loop per-topology advantages refines value quality. Bottom: Inference pipeline. A target specification is tokenized, the trained model autoregressively generates component tokens with grammar-constrained masking, producing a valid circuit in $\sim$20 ms.
  • Figure 2: Example ARCS-generated buck converter. Top: Token sequence with spec values and component values. Bottom: Decoded SPICE netlist fragment. The model generates the full sequence in $\sim$20 ms; grammar constraints ensure structural validity at each step.
  • Figure 3: Grammar state machine for constrained decoding. At each autoregressive step, the current state determines which token types are valid, and a mask zeroes out all others before sampling. Topology-level constraints further restrict the allowed component types; Full constraints additionally restrict value ranges.
  • Figure 4: Simulation and structural validity across model variants (mean of 5 seeds). GRPO achieves the best sim-validity among autoregressive methods ($+9.6$ pp over REINFORCE). Grammar-constrained decoding (GT+Constr) achieves 100% on both metrics.
  • Figure 5: RL training dynamics (in-training validation; Table \ref{tab:main} reports held-out results). REINFORCE fluctuates around the SL baseline (dashed). GRPO steadily improves; the best checkpoint (step 500, 53.1% held-out) was selected via early stopping.
  • ...and 1 more figure
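The grammar state machine of Figure 3 can be illustrated as logit masking: the current grammar state selects the set of valid next-token types, and every other logit is set to $-\infty$ before sampling, so an invalid token can never be chosen. The states and transitions below are illustrative assumptions, not the paper's actual grammar.

```python
import math

# Toy grammar: which token types may follow the current state.
# (Illustrative transitions; the paper's grammar is richer.)
VALID_NEXT = {
    "START": {"TOPOLOGY"},
    "TOPOLOGY": {"COMPONENT"},
    "COMPONENT": {"VALUE"},
    "VALUE": {"COMPONENT", "END"},
}

def mask_logits(logits, token_types, state):
    """Set logits of tokens whose type is invalid in `state` to -inf,
    guaranteeing structural validity by construction at each step."""
    allowed = VALID_NEXT[state]
    return [l if t in allowed else -math.inf
            for l, t in zip(logits, token_types)]

masked = mask_logits(
    logits=[0.3, 1.2, -0.5],
    token_types=["COMPONENT", "VALUE", "END"],
    state="COMPONENT",  # only a VALUE token is legal next
)
```

Because sampling from the masked distribution assigns zero probability to invalid tokens, every decoded sequence is structurally valid, matching the 100% structural validity the abstract reports for constrained decoding.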