Expression Syntax Information Bottleneck for Math Word Problems

Jing Xiong; Chengming Li; Min Yang; Xiping Hu; Bin Hu

TL;DR

Math Word Problem (MWP) solvers are vulnerable to spurious correlations between surface cues and solution expressions. ESIB applies the variational information bottleneck to learn a concise latent $z$ that preserves predictive information about the solution $y$ while minimizing $I(x; z)$, and it uses mutual learning between two problem representations to enforce consistent expression-syntax information. A self-distillation loss $\mathcal{V}_{SDL}$ further promotes diverse, syntax-consistent solution expressions. On four benchmarks (Math23K, Ape210K, MAWPS, CM17K), ESIB achieves state-of-the-art accuracy and generates more diverse expressions, and robustness analyses show improved resistance to adversarial perturbations. The work also provides theoretical links between the information bottleneck (IB), mutual learning, and generalization/robustness in MWP.
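
For orientation, the trade-off described above can be written in the generic variational information bottleneck form. This is the standard textbook formulation, not necessarily the exact objective used by ESIB; the symbols $\beta$, $p_\theta$, $q_\phi$, and $r(z)$ are assumptions introduced here for illustration.

$$
\max_{\theta}\; I(z; y) - \beta\, I(x; z)
$$

In practice both terms are replaced by tractable bounds, which gives a loss of the form

$$
\mathcal{L}_{VIB} = \mathbb{E}_{z \sim p_\theta(z \mid x)}\big[-\log q_\phi(y \mid z)\big] + \beta\, \mathrm{KL}\big(p_\theta(z \mid x)\,\|\,r(z)\big)
$$

Here $p_\theta(z \mid x)$ is the stochastic encoder, $q_\phi(y \mid z)$ is the decoder, $r(z)$ is a fixed prior (commonly a standard Gaussian), and $\beta \ge 0$ controls how strongly the latent is compressed.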

Abstract

Math Word Problem (MWP) solving aims to automatically answer mathematical questions posed in text. Previous studies tend to design complex models that capture additional information from the original text so that the model obtains more comprehensive features. In this paper, we take the opposite direction and study how to discard redundant features that contain spurious correlations for MWP. To this end, we design an Expression Syntax Information Bottleneck method for MWP (called ESIB) based on the variational information bottleneck, which extracts the essential features of the expression syntax tree while filtering out latent-specific redundancy that contains syntax-irrelevant features. The key idea of ESIB is to encourage multiple models to predict the same expression syntax tree for different problem representations of the same problem through mutual learning, so as to capture consistent information about the expression syntax tree and discard latent-specific redundancy. To improve the generalization ability of the model and generate more diverse expressions, we design a self-distillation loss that encourages the model to rely more on the expression syntax information in the latent space. Experimental results on two large-scale benchmarks show that our model not only achieves state-of-the-art results but also generates more diverse solutions. The code is available at https://github.com/menik1126/math_ESIB.
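
The combination of compression and mutual learning described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation (the linked repository is the reference for that): the function names, the weights `beta` and `gamma`, the diagonal-Gaussian latent, and the symmetric KL agreement term are all assumptions chosen to show the general shape of such a loss for a single decoding step.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def categorical_kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same output symbols."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(pred, target_index, eps=1e-12):
    """Negative log-likelihood of the gold symbol under a predicted distribution."""
    return -math.log(pred[target_index] + eps)


def mutual_vib_loss(pred_a, pred_b, target_index,
                    mu_a, log_var_a, mu_b, log_var_b,
                    beta=1e-3, gamma=1.0):
    """Hypothetical loss for two models that see different representations
    of the same problem (all weights and names are illustrative assumptions).

    task     : both models should predict the gold expression symbol
    compress : both latents are pulled toward a standard normal prior
    agree    : symmetric KL pushes the two predictive distributions together
    """
    task = cross_entropy(pred_a, target_index) + cross_entropy(pred_b, target_index)
    compress = (gaussian_kl_to_standard_normal(mu_a, log_var_a)
                + gaussian_kl_to_standard_normal(mu_b, log_var_b))
    agree = categorical_kl(pred_a, pred_b) + categorical_kl(pred_b, pred_a)
    return task + beta * compress + gamma * agree


if __name__ == "__main__":
    # Two toy predictive distributions over three expression symbols.
    loss = mutual_vib_loss(
        pred_a=[0.7, 0.2, 0.1], pred_b=[0.6, 0.3, 0.1], target_index=0,
        mu_a=[0.1, -0.2], log_var_a=[0.0, 0.0],
        mu_b=[0.0, 0.3], log_var_b=[-0.1, 0.1],
    )
    print(f"toy loss: {loss:.4f}")
```

In this sketch the agreement term vanishes when the two models predict identical distributions, and the compression term vanishes when each latent already matches the prior, so only the task loss remains in that case.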

Paper Structure (40 sections, 17 equations, 1 figure, 6 tables)

Introduction
Related Work
Math Word Problem Solving
Information Bottleneck
Variational Information Bottleneck
Extensions and Variants
Applications in NLP
Mutual Learning and Knowledge Distillation
Methodology
Encoder-Compressor-Decoder Architecture
Encoder
Compressor
Decoder
Information Bottleneck
Latent-specific Redundancy
...and 25 more sections

Figures (1)

Figure 1: Overview of the proposed method ESIB.
