Learning Continuous Semantic Representations of Symbolic Expressions
Miltiadis Allamanis, Pankajan Chanthirasegaran, Pushmeet Kohli, Charles Sutton
TL;DR
This work addresses learning continuous semantic representations of symbolic expressions, so that semantically equivalent expressions map to nearby vectors. It introduces neural equivalence networks (EqNets), which extend TreeNN-style recursion with multi-layer per-node mappings, unit normalization, and a subexpression autoencoder to encourage abstraction and reversibility. Empirical results on diverse boolean and polynomial expression datasets show that EqNets dramatically outperform TreeNN and RNN baselines, including when generalizing to unseen equivalence classes. The study provides benchmarks, analyzes compositionality and the contribution of each component, and releases data and code, offering a path toward semantics-aware neural-symbolic reasoning.
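To make the architectural ingredients named above more concrete, the following is a minimal sketch of a tree-recursive encoder with a two-layer mapping per node type and unit normalization of every output. It is an illustration under stated assumptions, not the authors' implementation: the tuple encoding of expressions, the names `NodeMapper`, `encode`, and `DIM`, the tanh nonlinearity, and the layer sizes are all hypothetical, the weights are random rather than trained, and the subexpression autoencoder is omitted.

```python
import math
import random

DIM = 8                  # embedding size (illustrative assumption)
rng = random.Random(0)   # fixed seed so the sketch is reproducible


def rand_matrix(rows, cols):
    """Random, untrained weights; a real model would learn these."""
    return [[rng.gauss(0, 1) / math.sqrt(cols) for _ in range(cols)]
            for _ in range(rows)]


def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def unit(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


class NodeMapper:
    """Two-layer mapping for one node type with a fixed number of children."""

    def __init__(self, arity):
        self.w_hidden = rand_matrix(2 * DIM, arity * DIM)
        self.w_out = rand_matrix(DIM, 2 * DIM)

    def __call__(self, child_vecs):
        flat = [x for vec in child_vecs for x in vec]       # concatenate children
        hidden = [math.tanh(h) for h in matvec(self.w_hidden, flat)]
        return unit(matvec(self.w_out, hidden))             # normalize the output


# One mapper per operator (hypothetical operator set).
MAPPERS = {"not": NodeMapper(1), "and": NodeMapper(2), "or": NodeMapper(2)}
LEAVES = {}  # symbol name -> unit-norm leaf embedding


def encode(expr):
    """Encode an expression given as a variable name or (op, child, ...) tuple."""
    if isinstance(expr, str):
        if expr not in LEAVES:
            LEAVES[expr] = unit([rng.gauss(0, 1) for _ in range(DIM)])
        return LEAVES[expr]
    op, *args = expr
    return MAPPERS[op]([encode(arg) for arg in args])


vec = encode(("and", "a", ("not", ("or", "a", "b"))))
print(len(vec), sum(x * x for x in vec))
```

Because every node output is rescaled to unit norm, representations stay on the unit sphere regardless of tree depth, which is the role normalization plays in this sketch. With untrained weights the vectors carry no semantic meaning; training is what would pull equivalent expressions together.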
Abstract
Combining abstract, symbolic reasoning with continuous neural reasoning is a grand challenge of representation learning. As a step in this direction, we propose a new architecture, called neural equivalence networks, for the problem of learning continuous semantic representations of algebraic and logical expressions. These networks are trained to represent semantic equivalence, even of expressions that are syntactically very different. The challenge is that semantic representations must be computed in a syntax-directed manner, because semantics is compositional, but at the same time, small changes in syntax can lead to very large changes in semantics, which can be difficult for continuous neural architectures. We perform an exhaustive evaluation on the task of checking equivalence on a highly diverse class of symbolic algebraic and boolean expression types, showing that our model significantly outperforms existing architectures.
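The tension described in the abstract can be illustrated with a small, hedged example: a brute-force truth-table check of boolean equivalence. This is not the paper's method, only a reference notion of semantic equivalence. The tuple encoding and the helper names `evaluate`, `variables`, and `equivalent` are assumptions made for this sketch.

```python
from itertools import product


def evaluate(expr, env):
    """Evaluate a variable name or an (op, child, ...) tuple under env."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    vals = [evaluate(arg, env) for arg in args]
    if op == "not":
        return not vals[0]
    if op == "and":
        return vals[0] and vals[1]
    if op == "or":
        return vals[0] or vals[1]
    raise ValueError(f"unknown operator: {op}")


def variables(expr):
    """Collect the variable names that occur in an expression."""
    if isinstance(expr, str):
        return {expr}
    return set().union(*(variables(arg) for arg in expr[1:]))


def equivalent(e1, e2):
    """True if e1 and e2 agree on every assignment (exponential in #variables)."""
    names = sorted(variables(e1) | variables(e2))
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if evaluate(e1, env) != evaluate(e2, env):
            return False
    return True


lhs = ("not", ("and", "a", "b"))               # not (a and b)
rhs = ("or", ("not", "a"), ("not", "b"))       # (not a) or (not b)
near_miss = ("not", ("or", "a", "b"))          # not (a or b)

print(equivalent(lhs, rhs))        # True: De Morgan's law
print(equivalent(lhs, near_miss))  # False: they differ at a=True, b=False
```

The first pair looks very different syntactically yet is equivalent, while the second pair differs by a single operator and is not. A learned representation has to place the first pair close together and the second pair apart.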
