CSA-Trans: Code Structure Aware Transformer for AST
Saeyoon Oh, Shin Yoo
TL;DR
CSA-Trans addresses the challenge of encoding structural context from ASTs in code summarization by learning a Code Structure Aware Positional Encoding (CSA-PE) with a Code Structure Embedder and applying Stochastic Block Model (SBM) attention to enable a global yet efficient receptive field. The approach yields richer node-context representations and more node-specific attention patterns, validated by synthetic and real-data experiments. Empirical evaluation on Java and Python shows CSA-Trans surpasses 14 baselines while being 41.92% faster than AST-Trans and 25.31% more memory efficient than SG-Trans on the Java dataset, with ablations confirming the contributions of CSA-PE and SBM attention. The findings highlight the value of AST-informed encodings and dynamic, interpretable attention in program understanding tasks, offering practical benefits for automatic code summarization.
Abstract
When applying the Transformer architecture to source code, designing a good self-attention mechanism is critical, as it affects how node relationships are extracted from the Abstract Syntax Trees (ASTs) of the source code. We present Code Structure Aware Transformer (CSA-Trans), which uses a Code Structure Embedder (CSE) to generate a specific Positional Encoding (PE) for each node in the AST. CSE generates the node PEs using disentangled attention. To further extend the self-attention capability, we adopt Stochastic Block Model (SBM) attention. Our evaluation shows that our PE captures the relationships between AST nodes better than other graph-related PE techniques. We also show through quantitative and qualitative analysis that SBM attention is able to generate more node-specific attention coefficients. We demonstrate that CSA-Trans outperforms 14 baselines in code summarization tasks for both Python and Java, while being 41.92% faster than AST-Trans and 25.31% more memory efficient than SG-Trans on the Java dataset.
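To make the idea of SBM attention more concrete, the following is a minimal NumPy sketch of attention restricted to a sampled sparse mask. It is not the CSA-Trans implementation. The function name `sbm_attention_sketch`, the cluster-embedding matrix `C`, the sigmoid memberships, and the choice to always keep the diagonal are all assumptions made for illustration. The sketch assigns each query and key a soft cluster membership, turns memberships into edge probabilities through a block matrix, samples a binary mask, and applies softmax attention only over the sampled edges.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    # Numerically stable softmax; exp(-inf) evaluates to 0 for masked entries.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def sbm_attention_sketch(Q, K, V, C, rng):
    """Illustrative SBM-style masked attention (hypothetical helper).

    Q, K, V: (n, d) query, key, and value matrices for n AST nodes.
    C:       (k, d) cluster embeddings (an assumption of this sketch).
    rng:     a numpy.random.Generator used to sample the mask.
    """
    n, d = Q.shape
    k = C.shape[0]

    # Soft cluster memberships of queries and keys, each entry in (0, 1).
    Yq = sigmoid(Q @ C.T)  # (n, k)
    Yk = sigmoid(K @ C.T)  # (n, k)

    # Block matrix of inter-cluster connection weights; entries sum to 1,
    # so every edge probability below stays strictly between 0 and 1.
    B = softmax((C @ C.T).reshape(-1)).reshape(k, k)

    # Probability that query i attends to key j.
    P = Yq @ B @ Yk.T  # (n, n)

    # Sample a binary attention mask from the edge probabilities.
    mask = rng.random((n, n)) < P
    # Assumption of this sketch: keep self-edges so no row is empty.
    mask |= np.eye(n, dtype=bool)

    # Scaled dot-product attention over the sampled edges only.
    scores = (Q @ K.T) / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)
    attn = softmax(scores, axis=-1)
    return attn @ V, attn, mask, P


rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(6, 4)) for _ in range(3))
C = rng.normal(size=(3, 4))
out, attn, mask, P = sbm_attention_sketch(Q, K, V, C, rng)
print(out.shape, int(mask.sum()), "edges sampled")
```

Because the mask is sampled from probabilities that depend on the queries and keys, each node can end up with a different set of attended nodes, which is the intuition behind node-specific attention. A trainable version would also need a gradient estimator for the discrete sampling step, which this sketch omits.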
