TRACE: Learning to Compute on Circuit Graphs

Ziyang Zheng, Jiaying Zhu, Jingyi Zhou, Qiang Xu

TL;DR

TRACE reframes circuit-graph learning in two ways: a Hierarchical Transformer backbone that follows the step-by-step computation of a circuit, and function shift learning, which decouples local operator behavior from the global function. This yields robust, position-aware representations that capture reconvergent dependencies and operator-specific input interactions better than traditional message passing neural networks (MPNNs) or flat Transformers. Across Register Transfer Level (RTL) graphs, And-Inverter Graphs (AIGs), and post-mapping (PM) netlists, TRACE consistently outperforms all baselines on both contrastive and predictive tasks, demonstrating strong generalization and stability. The combination of architecture and learning objective provides a practical, scalable approach to modeling circuit functionality, with potential applicability to broader computation-graph domains.

Abstract

Learning to compute, the ability to model the functional behavior of a circuit graph, is a fundamental challenge for graph representation learning. Yet the dominant paradigm is architecturally mismatched for this task: it relies on permutation-invariant aggregation. This assumption, central to mainstream message passing neural networks (MPNNs) and their conventional Transformer-based counterparts, prevents models from capturing the position-aware, hierarchical nature of computation. To resolve this, we introduce TRACE, a new paradigm built on an architecturally sound backbone and a principled learning objective. First, TRACE employs a Hierarchical Transformer that mirrors the step-by-step flow of computation, providing a faithful architectural backbone that replaces permutation-invariant aggregation. Second, we introduce function shift learning, a novel objective that decouples the learning problem. Instead of predicting the complex global function directly, our model is trained to predict only the function shift: the discrepancy between the true global function and a simple local approximation that assumes input independence. We validate this paradigm on various circuit modalities, including Register Transfer Level graphs, And-Inverter Graphs, and post-mapping netlists. Across a comprehensive suite of benchmarks, TRACE substantially outperforms all prior architectures. These results demonstrate that our architecturally aligned backbone and decoupled learning objective form a more robust paradigm for the fundamental challenge of learning the functional behavior of a circuit graph.
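
The function shift described in the abstract can be illustrated on a toy circuit. The sketch below is not the authors' implementation. It assumes the target is the logic-1 probability under uniform random inputs (the quantity reported in Figure 4), and that the local approximation of an AND gate multiplies its fan-in probabilities as if they were independent. The circuit, the helper `logic1_prob`, and all variable names are hypothetical.

```python
from itertools import product


def logic1_prob(fn, n_inputs):
    """Exact logic-1 probability of fn, by exhaustive simulation over all input patterns."""
    patterns = list(product([0, 1], repeat=n_inputs))
    return sum(fn(*p) for p in patterns) / len(patterns)


# Hypothetical toy circuit with a reconvergent input: a feeds both g1 and g2.
def g1(a, b, c):
    return a & b


def g2(a, b, c):
    return a | c


def out(a, b, c):
    return g1(a, b, c) & g2(a, b, c)


p_g1 = logic1_prob(g1, 3)      # probability that g1 is 1
p_g2 = logic1_prob(g2, 3)      # probability that g2 is 1

y_global = logic1_prob(out, 3)  # true behavior of the output node
y_local = p_g1 * p_g2           # AND gate under the (assumed) input-independence rule
y_fsl = y_global - y_local      # function shift: the residual left to learn

print(y_global, y_local, y_fsl)  # 0.25 0.1875 0.0625
```

Because input `a` feeds both `g1` and `g2`, the two fan-ins of the output gate are correlated, so the independence assumption underestimates the true probability. Under the assumptions above, the shift of 0.0625 is the residual a model would be trained to predict.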

Paper Structure

This paper contains 20 sections, 6 equations, 4 figures, 6 tables, 1 algorithm.

Figures (4)

  • Figure 1: The architectural failure of MPNNs on circuit graphs. Top: The permutation-invariant aggregation in MPNNs cannot distinguish between ordered inputs (e.g., $\texttt{A,B}$ vs $\texttt{B,A}$), yielding the same incorrect embedding for a position-aware operator like MUX. Bottom: Our approach processes inputs as an ordered sequence, enabling position-awareness and capturing operator-specific interactions.
  • Figure 2: Illustration of a circuit graph and its computation process. The figure demonstrates the dual role of nodes within a circuit graph. In Step 1, nodes 1 and 2 function as operators that compute the expressions $S \land B$ and $\lnot S$, respectively. In Step 2, the same nodes represent the intermediate variables that hold the results of Step 1, which are then passed to subsequent operators for further computation.
  • Figure 3: Overview of our proposed framework. Left: A circuit graph, represented in both a graph view and its equivalent prefix notation, is encoded by a Hierarchical Transformer to model the computation process. Right: For predictive tasks, we introduce Function Shift Learning (FSL). Instead of directly regressing the global function, the model captures the difference between the global and local functions: $y^{FSL}=y^{global}-y^{local}$.
  • Figure 4: Error Accumulation Analysis. We illustrate the MAE of logic-1 probability prediction across different logic levels. Levels containing fewer than 10 nodes are excluded for better visualization. Top: Results on AIGs. Bottom: Results on PM netlists.
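
The failure mode in Figure 1 can be reproduced in a few lines. This is a minimal sketch, not the paper's model: it uses scalar input values in place of learned embeddings, a plain sum as the permutation-invariant aggregator, and position tagging as a stand-in for ordered-sequence processing. The MUX convention (select `b` when `s` is 1) is an assumption loosely based on the $S \land B$ and $\lnot S$ terms in Figure 2, and all function names are hypothetical.

```python
def mux(s, a, b):
    """2-to-1 multiplexer, assumed convention: (s AND b) OR (NOT s AND a)."""
    return (s & b) | ((1 - s) & a)


def sum_aggregate(inputs):
    """Permutation-invariant aggregation, in the style of an MPNN layer."""
    return sum(inputs)


def ordered_encode(inputs):
    """Order-preserving encoding: each input is tagged with its position."""
    return tuple(enumerate(inputs))


s, a, b = 1, 1, 0
order_ab = [s, a, b]  # data inputs wired as A, B
order_ba = [s, b, a]  # data inputs swapped to B, A

# The sum cannot tell the two wirings apart ...
same_sum = sum_aggregate(order_ab) == sum_aggregate(order_ba)
# ... while the position-tagged encoding can ...
same_ordered = ordered_encode(order_ab) == ordered_encode(order_ba)
# ... and the distinction matters, because the MUX outputs differ.
outputs_differ = mux(*order_ab) != mux(*order_ba)

print(same_sum, same_ordered, outputs_differ)  # True False True
```

Any aggregator that is symmetric in its arguments (sum, mean, max) collapses the two wirings in the same way, which is why the sketch only needs one of them to make the point.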

Theorems & Definitions (2)

  • Definition 1: Global Function
  • Definition 2: Local Function