Learning Structure-Aware Representations of Dependent Types

Konstantinos Kogkalidis; Orestis Melkonian; Jean-Philippe Bernardy

Learning Structure-Aware Representations of Dependent Types

Konstantinos Kogkalidis, Orestis Melkonian, Jean-Philippe Bernardy

TL;DR

A novel neural architecture targeted at faithfully representing dependently-typed programs on the basis of structural rather than nominal principles is proposed, which achieves promising initial results, surpassing strong baselines.

Abstract

Agda is a dependently-typed programming language and a proof assistant, pivotal in proof formalization and programming language theory. This paper extends the Agda ecosystem into machine learning territory, and, vice versa, makes Agda-related resources available to machine learning practitioners. We introduce and release a novel dataset of Agda program-proofs that is elaborate and extensive enough to support various machine learning applications -- the first of its kind. Leveraging the dataset's ultra-high resolution, which details proof states at the sub-type level, we propose a novel neural architecture targeted at faithfully representing dependently-typed programs on the basis of structural rather than nominal principles. We instantiate and evaluate our architecture in a premise selection setup, where it achieves promising initial results, surpassing strong baselines.

Learning Structure-Aware Representations of Dependent Types

TL;DR

Abstract

Paper Structure (41 sections, 3 equations, 6 figures, 2 tables)

This paper contains 41 sections, 3 equations, 6 figures, 2 tables.

Introduction
Contributions
Machine Learning for Agda
Modeling Type Structure
Background
Proof Assistants
Related Work
Data
Implementation
Extracted Structure
Typing Context
Term Structure
Modeling
Preliminaries
Desiderata
...and 26 more sections

Figures (6)

Figure 1: Agda code formalizing the commutativity of addition.
Figure 2: Tokenized view of lemma types from Figure \ref{['figure:agda-example']}; see Appendix \ref{['appendix:tokenization']} for details.
Figure 3: Empirical distributions of the standardized scores of relevant and irrelevant lemmas from the stdlib:id evaluation set.
Figure 4: JSON extract of the Agda code of Figure \ref{['figure:agda-example']}.
Figure 5: Statistics of the extracted data relevant to our experimental setup.
...and 1 more figures

Learning Structure-Aware Representations of Dependent Types

TL;DR

Abstract

Learning Structure-Aware Representations of Dependent Types

Authors

TL;DR

Abstract

Table of Contents

Figures (6)