Learning Continuous Semantic Representations of Symbolic Expressions

Miltiadis Allamanis, Pankajan Chanthirasegaran, Pushmeet Kohli, Charles Sutton

TL;DR

This work tackles the problem of learning continuous semantic representations for symbolic expressions by mapping semantically equivalent expressions to nearby vectors. It introduces neural equivalence networks (EqNets), which extend TreeNN-style recursion with multi-layer per-node mappings, unit normalization, and a subexpression autoencoder to encourage abstraction and reversibility. Empirical results on diverse boolean and polynomial expression datasets show that EqNets dramatically outperform TreeNN and RNN baselines, including when generalizing to unseen equivalence classes. The study provides benchmarks, analyzes compositionality and component contributions, and releases data and code, offering a path toward semantics-aware neural-symbolic reasoning.
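To make the architectural ingredients named above concrete, the following is a minimal, untrained sketch of a TreeNN-style recursive encoder with a two-layer mapping per operator and unit normalization of every node vector. It is an illustration under stated assumptions, not the authors' implementation: the class name `TinyTreeEncoder`, the dimension `DIM`, the tuple-based tree format, the `tanh` nonlinearity, and the random weights are all hypothetical, and the subexpression autoencoder and the training objective are omitted.

```python
import math
import random

DIM = 8  # hypothetical embedding size, not taken from the paper


def rand_matrix(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def unit_normalize(vector):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else vector


class TinyTreeEncoder:
    """Untrained recursive encoder: one two-layer mapping per operator."""

    def __init__(self, leaves, ops, seed=0):
        rng = random.Random(seed)
        # One unit-norm vector per leaf symbol.
        self.leaf = {
            s: unit_normalize([rng.gauss(0, 1) for _ in range(DIM)])
            for s in leaves
        }
        # Per operator: a hidden layer over the concatenated children,
        # followed by an output layer back to DIM dimensions.
        self.op = {
            name: (rand_matrix(2 * DIM, arity * DIM, rng),
                   rand_matrix(DIM, 2 * DIM, rng))
            for name, arity in ops.items()
        }

    def encode(self, tree):
        """Encode a leaf (str) or a node (operator, child, ...) bottom-up."""
        if isinstance(tree, str):
            return self.leaf[tree]
        name, *children = tree
        w_hidden, w_out = self.op[name]
        child_vecs = [v for child in children for v in self.encode(child)]
        hidden = [math.tanh(h) for h in matvec(w_hidden, child_vecs)]
        # Unit normalization keeps every subexpression on the unit sphere.
        return unit_normalize(matvec(w_out, hidden))


encoder = TinyTreeEncoder(leaves=["a", "b"], ops={"and": 2, "or": 2, "not": 1})
vec = encoder.encode(("not", ("and", "a", ("or", "a", "b"))))
```

Because each node vector is normalized before it is passed to its parent, the magnitude of a representation does not grow or shrink with tree depth in this sketch; only its direction carries information.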

Abstract

Combining abstract, symbolic reasoning with continuous neural reasoning is a grand challenge of representation learning. As a step in this direction, we propose a new architecture, called neural equivalence networks, for the problem of learning continuous semantic representations of algebraic and logical expressions. These networks are trained to represent semantic equivalence, even of expressions that are syntactically very different. The challenge is that semantic representations must be computed in a syntax-directed manner, because semantics is compositional, but at the same time, small changes in syntax can lead to very large changes in semantics, which can be difficult for continuous neural architectures. We perform an exhaustive evaluation on the task of checking equivalence on a highly diverse class of symbolic algebraic and boolean expression types, showing that our model significantly outperforms existing architectures.
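The two observations in the abstract (equivalent expressions can look very different, and a small syntactic change can change the meaning entirely) can be illustrated for boolean expressions with a brute-force truth-table check. This is only a sketch of what "semantic equivalence" means as a ground-truth label; the tuple-based expression format and the function names are assumptions made for the example, not part of the paper.

```python
from itertools import product


def evaluate(tree, env):
    """Evaluate a boolean expression tree under a variable assignment."""
    if isinstance(tree, str):
        return env[tree]
    op, *args = tree
    vals = [evaluate(a, env) for a in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    raise ValueError(f"unknown operator: {op}")


def variables(tree):
    """Collect the variable names that occur in an expression tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(variables(a) for a in tree[1:]))


def truth_table(tree, names):
    """Tuple of outputs over all assignments to `names`, in a fixed order."""
    return tuple(
        evaluate(tree, dict(zip(names, bits)))
        for bits in product([False, True], repeat=len(names))
    )


def equivalent(t1, t2):
    """Two expressions are equivalent iff their truth tables coincide."""
    names = sorted(variables(t1) | variables(t2))
    return truth_table(t1, names) == truth_table(t2, names)


# Syntactically different, semantically equal (De Morgan's law).
lhs = ("not", ("and", "a", "b"))
rhs = ("or", ("not", "a"), ("not", "b"))
same = equivalent(lhs, rhs)

# One operator swapped: syntactically close, semantically different.
differ = equivalent(("and", "a", "b"), ("or", "a", "b"))
```

Here `same` is `True` and `differ` is `False`. The check is exponential in the number of variables, which is acceptable for generating labels on small expressions but is not how a learned representation would be queried.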

Paper Structure

This paper contains 21 sections, 4 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: EqNet architecture.
  • Figure 2: Precision-Recall Curves averaged across datasets.
  • Figure 3: Evaluation of compositionality; training set simpler than test set. Average $score_{k}$ ($y$-axis in log-scale). Markers are shown every three ticks for clarity. TreeNN refers to the model of Socher et al. (2012).
  • Figure 4: Visualization of $score_{5}$ for all expression nodes for three Bool10 and four Poly8 test sample expressions using EqNet. The darker the color, the lower the score, i.e. white implies a score of 1 and dark red a score of 0.
  • Figure 5: A PCA visualization of some simple non-equivalent boolean and polynomial expressions (black-square) and their negations (red-circle). The lines connect the negated expressions.
  • ...and 4 more figures
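The captions above report $score_{k}$ without defining it. One reading consistent with the stated range (1 is best, 0 is worst) is a nearest-neighbor retrieval score: the fraction of a query's $k$ nearest neighbors that belong to its equivalence class, normalized by $\min(k, |c|)$, where $c$ is the set of other expressions in that class. The sketch below assumes this definition, Euclidean distance, and a hypothetical function signature; it is not taken from the released code.

```python
import math


def score_k(query, vectors, labels, k):
    """Assumed definition: |N_k(q) ∩ c| / min(k, |c|).

    N_k(q) holds the k nearest neighbors of the query (excluding itself)
    and c holds the other items sharing the query's equivalence class.
    """
    others = [i for i in range(len(vectors)) if i != query]
    same_class = [i for i in others if labels[i] == labels[query]]
    if not same_class:
        raise ValueError("query has no other member in its equivalence class")
    # Sort the remaining items by Euclidean distance to the query.
    ranked = sorted(others, key=lambda i: math.dist(vectors[query], vectors[i]))
    hits = sum(1 for i in ranked[:k] if labels[i] == labels[query])
    return hits / min(k, len(same_class))


# Toy 2-D "embeddings" with made-up equivalence-class labels.
vectors = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.2, 0.0), (5.0, 5.0)]
labels = ["x", "x", "y", "x", "y"]

good = score_k(0, vectors, labels, k=2)  # both nearest neighbors share class "x"
bad = score_k(2, vectors, labels, k=1)   # nearest neighbor belongs to class "x"
```

In this toy data `good` evaluates to 1.0 and `bad` to 0.0, matching the two ends of the color scale described for Figure 4.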