Towards a Science of Causal Interpretability in Deep Learning for Software Engineering

David N. Palacio

TL;DR

This dissertation advances causal interpretability in Deep Learning for Software Engineering (DL4SE) by defining DoCode, a four-step post hoc causal framework (SCMs, estimand, causal effects such as $ATE$, and refutation) to explain Neural Code Models beyond correlations. It grounds explanations in programming-language structure through Syntax (De)Composition and Grounded Explanations (ASTrust), and demonstrates these ideas via multi-architecture case studies on deep code generation, traceability, and information-theoretic analyses. The work also introduces COMET for software traceability, TraceXplainer for information-theoretic evaluation, and CodeQ for code-based rationales, underscoring the value of grounding model explanations in code syntax and programmer-centric concepts. Collectively, the contributions provide a formal pipeline and practical guidelines for applying causal interpretability to NCMs, aiming to improve trustworthiness, debugging, and intervention-based software engineering tasks. The research further explores interventional and counterfactual extensions, culminating in a broader vision of self-constructing software through autopoietic architectures and a consolidated research program for causal interpretability in AI for software engineering.

Abstract

This dissertation addresses the problem of achieving causal interpretability in Deep Learning for Software Engineering (DL4SE). Although Neural Code Models (NCMs) perform strongly at automating software tasks, the causal relationships between their inputs and outputs remain opaque, which limits a full understanding of their capabilities. To build trust in NCMs, researchers and practitioners must be able to explain code predictions. Associational interpretability, which identifies correlations, is often insufficient for tasks that require intervention and change analysis. To address this gap, the dissertation introduces DoCode, a novel post hoc interpretability method for NCMs. DoCode uses causal inference to provide programming language-oriented explanations of model predictions. It follows a four-step pipeline: modeling the causal problem with Structural Causal Models (SCMs), identifying the causal estimand, estimating effects with metrics such as the Average Treatment Effect (ATE), and refuting the effect estimates. The framework is extensible, with an example that reduces spurious correlations by grounding explanations in programming language properties. A case study on deep code generation, spanning several interpretability scenarios and deep learning architectures, demonstrates DoCode's benefits. Results show NCMs' sensitivity to code syntax changes and their ability to learn certain programming concepts while minimizing confounding bias. The dissertation also examines associational interpretability as a foundation, analyzing the causal nature of software information using tools such as COMET and TraceXplainer for traceability. It highlights the need to identify code confounders and offers practical guidelines for applying causal interpretability to NCMs, contributing to more trustworthy AI in software engineering.
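The four-step pipeline named in the abstract (model, identify, estimate, refute) can be illustrated with a small sketch on synthetic data. This is not DoCode's implementation: the variable names, the simulated data, the choice of a single confounder, and the use of ordinary least squares as the estimator are all assumptions made for illustration. The sketch assumes a binary treatment `t` (for example, a hypothetical syntax intervention on a prompt), an outcome `y` (for example, a model's loss), and one observed confounder `z` (for example, a normalized sequence length) that influences both.

```python
import numpy as np

# Step 1 (model): assumed SCM with edges z -> t, z -> y, t -> y.
def simulate(n=5000, true_effect=0.5, seed=0):
    """Draw synthetic data from the assumed SCM (illustrative only)."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                      # hypothetical confounder
    p_treat = 1.0 / (1.0 + np.exp(-1.5 * z))    # z raises treatment probability
    t = (rng.random(n) < p_treat).astype(float) # binary treatment
    y = true_effect * t + 2.0 * z + rng.normal(scale=0.5, size=n)
    return z, t, y

def naive_difference(t, y):
    """Associational estimate: difference in mean outcome, ignoring z."""
    return y[t == 1].mean() - y[t == 0].mean()

# Step 2 (identify): under the assumed SCM, z satisfies the backdoor
# criterion, so adjusting for z identifies the effect of t on y.
# Step 3 (estimate): linear regression of y on [1, t, z]; the coefficient
# on t is the ATE estimate under the linearity assumption.
def adjusted_ate(z, t, y):
    X = np.column_stack([np.ones_like(t), t, z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]

# Step 4 (refute): placebo test. Replace the treatment with a random
# permutation of itself; a sound estimator should report an effect near 0.
def placebo_refutation(z, t, y, seed=1):
    rng = np.random.default_rng(seed)
    return adjusted_ate(z, rng.permutation(t), y)

z, t, y = simulate()
naive = naive_difference(t, y)
ate = adjusted_ate(z, t, y)
placebo = placebo_refutation(z, t, y)

print(f"naive difference : {naive:.3f}")
print(f"adjusted ATE     : {ate:.3f}")
print(f"placebo estimate : {placebo:.3f}")
```

In this simulation the naive difference overstates the effect because `z` drives both treatment and outcome, while the adjusted estimate lands close to the simulated effect of 0.5 and the placebo estimate stays near zero. That gap between the associational and the adjusted number is the kind of confounding bias the abstract refers to.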

Paper Structure

This paper contains 159 sections, 26 equations, 37 figures, and 33 tables.

Figures (37)

  • Figure 1: Ladder of Causation: do$_{code}$ is an extension of the intervention level.
  • Figure 2: The Conceptual Framework of Syntax-Grounded Interpretability
  • Figure 3: Alignment & Clustering Interactions. The $\delta$ function aligns tokens $w_i$ to terminal nodes $\lambda$. Terminal and non-terminal nodes $\lambda, \alpha \in \mathcal{N}$ are clustered by Syntax Categories $\mathcal{C}$.
  • Figure 4: Post Hoc Local Explanation. A snippet is decomposed into code tokens. The highest annotated probabilities (i.e., best predictions) are in blue.
  • Figure 5: Post Hoc Global Explanations Segregated by Categories and Sub-Categories for gpt-3 [125M] and mono-lang [2B]
  • ...and 32 more figures

Theorems & Definitions (10)

  • Definition 5.1
  • Definition 5.2
  • Definition 9.1
  • Definition 9.2
  • Definition 9.3
  • Definition 9.4
  • Definition 10.1
  • Definition 12.3.1
  • Definition 12.3.2
  • Definition 12.3.3