Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

TL;DR

This work introduces logical NLG, a task that generates statements entailed by open-domain tables rather than mere surface descriptions. The authors construct LogicNLG on top of TabFact, define automatic fidelity metrics (parsing-based, NLI-based, adversarial), and explore a spectrum of baselines including non-pretrained, pretrained, and coarse-to-fine architectures. They demonstrate that pretrained language models substantially improve fluency and fidelity, while adversarial and RL-based approaches trade fluency for fidelity; a coarse-to-fine strategy partially mitigates fidelity gaps. The paper provides comprehensive automatic and human evaluations, analyzes various logical operations, and offers a practical LogicNLG benchmark and codebase to spur future research in logic-aware NLG.

Abstract

Neural natural language generation (NLG) models have recently shown remarkable progress in fluency and coherence. However, existing studies on neural NLG focus primarily on surface-level realization, with limited emphasis on logical inference, an important aspect of human thinking and language. In this paper, we propose a new NLG task in which a model generates natural language statements that can be logically entailed by the facts in an open-domain semi-structured table. To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset (Chen et al., 2019), which features a wide range of logical/symbolic inferences, as our testbed, and propose new automatic metrics to evaluate the fidelity of generation models with respect to logical inference. The new task poses challenges to existing monotonic generation frameworks because of the mismatch between sequence order and logical order. In our experiments, we comprehensively survey different generation architectures (LSTM, Transformer, Pre-Trained LM) trained with different algorithms (RL, Adversarial Training, Coarse-to-Fine) on the dataset and make the following observations: 1) Pre-Trained LM can significantly boost both the fluency and logical fidelity metrics, 2) RL and Adversarial Training trade fluency for fidelity, 3) Coarse-to-Fine generation can help partially alleviate the fidelity issue while maintaining high language fluency. The code and data are available at https://github.com/wenhuchen/LogicNLG.
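
To make the distinction between surface-level and logically entailed statements concrete, here is a minimal Python sketch. It is only an illustration, not the paper's evaluation method: the toy table, the column names, and the helper functions `surface_fact`, `has_most`, and `count_matches` are all hypothetical. A surface-level statement can be checked against a single cell, while a logical statement (a superlative or a count) must be checked against the table as a whole.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical toy table; the values are invented for illustration.
TABLE: List[Row] = [
    {"nation": "Canada", "gold": 3, "silver": 1},
    {"nation": "Mexico", "gold": 2, "silver": 3},
    {"nation": "Colombia", "gold": 1, "silver": 0},
]


def surface_fact(table: List[Row], nation: str, column: str, value: Any) -> bool:
    """Surface-level claim: restates one cell, e.g. 'Canada won 3 gold medals'."""
    return any(r["nation"] == nation and r[column] == value for r in table)


def has_most(table: List[Row], nation: str, column: str) -> bool:
    """Logical claim (superlative): `nation` has the strictly highest `column`."""
    target = [r[column] for r in table if r["nation"] == nation]
    others = [r[column] for r in table if r["nation"] != nation]
    return len(target) == 1 and all(target[0] > v for v in others)


def count_matches(table: List[Row], predicate: Callable[[Row], bool], expected: int) -> bool:
    """Logical claim (count): exactly `expected` rows satisfy `predicate`."""
    return sum(1 for r in table if predicate(r)) == expected


if __name__ == "__main__":
    # 'Canada won 3 gold medals' needs only one cell.
    print(surface_fact(TABLE, "Canada", "gold", 3))
    # 'Canada won the most gold medals' needs a comparison over all rows.
    print(has_most(TABLE, "Canada", "gold"))
    # 'Two nations won more gold than silver' needs a count over all rows.
    print(count_matches(TABLE, lambda r: r["gold"] > r["silver"], 2))
```

Each checker returns a boolean for one claim type. A generated sentence would first have to be mapped to such a claim, which is the hard part and is left out of this sketch.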

Paper Structure

This paper contains 38 sections, 6 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Table-to-text generation examples with and without implicit logical inference. Logical NLG requires a generation model to generate natural language statements that can be logically entailed by the facts in the table instead of simply restating certain superficial facts in natural language.
  • Figure 2: When making the decision at the third step, the model needs to foresee the future tokens to ensure logical consistency. There is no back-tracking once the model makes a wrong decision like "5".
  • Figure 3: Evaluation of surface-level generation vs. logical natural language generation. IE-based evaluation (Wiseman et al., 2017; Rohrbach et al., 2018) suffices to verify surface-level generation, but it causes either "empty triple" or "false negative" problems when used to verify logical NLG.
  • Figure 4: The domain distribution of LogicNLG.
  • Figure 5: The parsing-based and adversarial evaluations used to measure a model's correctness in logical reasoning.
  • ...and 7 more figures