CSA-Trans: Code Structure Aware Transformer for AST

Saeyoon Oh, Shin Yoo

TL;DR

CSA-Trans addresses the challenge of encoding structural context from ASTs in code summarization by learning a Code Structure Aware Positional Encoding (CSA-PE) with a Code Structure Embedder and applying Stochastic Block Model (SBM) attention to enable a global yet efficient receptive field. The approach yields richer node-context representations and results in more node-specific attention patterns, validated by synthetic and real-data experiments. Empirical evaluation on Java and Python shows CSA-Trans surpasses 14 baselines while offering improvements in time and memory efficiency compared to AST-Trans and SG-Trans, with ablations confirming the contributions of CSA-PE and SBM attention. The findings highlight the value of AST-informed encodings and dynamic, interpretable attention in program understanding tasks, offering practical benefits for automatic code summarization.

Abstract

When applying the Transformer architecture to source code, designing a good self-attention mechanism is critical, as it affects how node relationships are extracted from the Abstract Syntax Trees (ASTs) of the source code. We present Code Structure Aware Transformer (CSA-Trans), which uses a Code Structure Embedder (CSE) to generate a specific Positional Encoding (PE) for each node in the AST. The CSE generates these node PEs using disentangled attention. To further extend the self-attention capability, we adopt Stochastic Block Model (SBM) attention. Our evaluation shows that our PE captures the relationships between AST nodes better than other graph-related PE techniques. We also show through quantitative and qualitative analysis that SBM attention is able to generate more node-specific attention coefficients. We demonstrate that CSA-Trans outperforms 14 baselines in code summarization tasks for both Python and Java, while being 41.92% faster than AST-Trans and 25.31% more memory efficient than SG-Trans on the Java dataset.
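
To make the idea of attention restricted by a Stochastic Block Model more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes each node has soft cluster memberships and that a block matrix holds inter-cluster edge probabilities; a binary mask is sampled from the resulting edge probabilities and softmax attention is computed only over the sampled edges. The function name `sbm_masked_attention`, its parameters, and the forced self-edge (added so that every row attends to at least one key) are illustrative assumptions. In a trained model the memberships and block matrix would be learned; here they are passed in as plain arrays.

```python
import numpy as np


def sbm_masked_attention(Q, K, V, memberships_q, memberships_k, block_probs, rng):
    """Self-attention over edges sampled from a stochastic block model (sketch).

    Q, K, V:        (n, d) query, key, and value matrices.
    memberships_*:  (n, k) soft cluster memberships in [0, 1] (hypothetical inputs).
    block_probs:    (k, k) inter-cluster edge probabilities in [0, 1].
    rng:            numpy.random.Generator used to sample the mask.
    """
    n, d = Q.shape

    # Probability of an edge between query node i and key node j.
    edge_probs = np.clip(memberships_q @ block_probs @ memberships_k.T, 0.0, 1.0)

    # Sample a binary attention mask, one Bernoulli draw per node pair.
    mask = rng.random((n, n)) < edge_probs

    # Assumption of this sketch: always keep the self-edge so that no row is empty.
    np.fill_diagonal(mask, True)

    # Scaled dot-product scores, with non-sampled edges excluded from the softmax.
    scores = (Q @ K.T) / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    return weights @ V, weights, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, k = 6, 4, 2
    Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
    Yq, Yk = rng.random((n, k)), rng.random((n, k))
    B = rng.random((k, k))
    out, weights, mask = sbm_masked_attention(Q, K, V, Yq, Yk, B, rng)
    print(out.shape, int(mask.sum()), "edges sampled")
```

Because the mask is sampled per node pair, different nodes can attend to different subsets of the sequence, which is one way to read the abstract's claim about node-specific attention coefficients.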

Paper Structure

This paper contains 32 sections, 11 equations, 8 figures, and 9 tables.

Figures (8)

  • Figure 1: Program ASTs
  • Figure 2: CSA-Trans Architecture
  • Figure 3: Intermediate Node Prediction: 3-INP prediction
  • Figure 4: Relationships induced by SBM mask
  • Figure 5: Attention Masks generated by SBM attention.
  • ...and 3 more figures