Facts-and-Feelings: Capturing both Objectivity and Subjectivity in Table-to-Text Generation

Tathagata Dey, Pushpak Bhattacharyya

TL;DR

The paper addresses the challenge of generating natural language from tables while preserving subjectivity. It introduces the Ta2TS dataset of 3849 instances across finance, weather, and sports, and formalizes the task as mapping a table $T$ to a text $S$ by modeling $P(S \mid T;\theta)$, with autoregressive decoding $s_i=\arg\max P(s_i \mid T,s_1,\ldots,s_{i-1};\theta)$. It compares T5 sequence-to-sequence models fine-tuned on linearized tables with prompted large language models (GPT-3.5-turbo, Mistral, Llama-2), using both automatic metrics (BLEU-4, METEOR, ROUGE-L, BERTScore) and human evaluations of coherence, coverage, accuracy, and subjectivity capture. Key findings show that context-rich T5 models can approach GPT-3.5-turbo performance, while LLM prompting offers strong coverage and subjectivity control, with 3-shot prompts often performing best; Mistral-7B and Llama-2 underperform on this task. The work provides the first comprehensive, multi-genre benchmark for subjectivity-infused table-to-text generation and offers a baseline for future methods that combine table encoders with generation components and expanded datasets.
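
The autoregressive decoding rule in the summary above can be illustrated with a minimal greedy-decoding sketch. Everything named here is an assumption made for illustration: `greedy_decode`, the `NextTokenDist` interface, the `<eos>` marker, and the toy distribution are hypothetical and are not taken from the paper, where a trained model would supply the probabilities.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: given the linearized table and the tokens generated
# so far, return a probability for each candidate next token.
NextTokenDist = Callable[[str, Sequence[str]], Dict[str, float]]

EOS = "<eos>"  # assumed end-of-sequence marker


def greedy_decode(table: str, next_dist: NextTokenDist, max_len: int = 20) -> List[str]:
    """Pick the most probable next token at each step, conditioned on the
    table and on all previously generated tokens."""
    tokens: List[str] = []
    for _ in range(max_len):
        dist = next_dist(table, tokens)
        best = max(dist, key=dist.get)  # arg max over the candidate tokens
        if best == EOS:
            break
        tokens.append(best)
    return tokens


def toy_dist(table: str, prefix: Sequence[str]) -> Dict[str, float]:
    """Toy stand-in for a trained model: it favours a fixed three-word script
    and ignores the table content entirely."""
    script = ["revenue", "rose", "sharply"]
    target = script[len(prefix)] if len(prefix) < len(script) else EOS
    vocab = script + [EOS]
    return {tok: (0.9 if tok == target else 0.1 / 3) for tok in vocab}


if __name__ == "__main__":
    print(greedy_decode("year: 2023 | revenue: 120", toy_dist))
```

Greedy selection is only one way to realise the arg max step by step; beam search or sampling could be substituted without changing the conditioning on $T$ and the previous tokens.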

Abstract

Table-to-text generation, a long-standing challenge in natural language generation, has remained unexplored through the lens of subjectivity. Subjectivity here encompasses the comprehension of information derived from the table that cannot be described solely by objective data. Given the absence of pre-existing datasets, we introduce the Ta2TS dataset with 3849 data instances. We perform the task of fine-tuning sequence-to-sequence models on the linearized tables and prompting on popular large language models. We analyze the results from a quantitative and qualitative perspective to ensure the capture of subjectivity and factual consistency. The analysis shows the fine-tuned LMs can perform close to the prompted LLMs. Both the models can capture the tabular data, generating texts with 85.15% BERTScore and 26.28% Meteor score. To the best of our knowledge, we provide the first-of-its-kind dataset on tables with multiple genres and subjectivity included and present the first comprehensive analysis and comparison of different LLM performances on this task.

Paper Structure (28 sections, 8 figures, 6 tables)

This paper contains 28 sections, 8 figures, 6 tables.

Figures (8)

  • Figure 1: Generating text with subjectivity from a table: The table contains the income statement report of a company over 5 years. The reference text below describes the tabular information where the bold phrases refer to the infused subjectivity. This subjectivity is the interpretation of the data from a human perspective. (Due to space constraints, shown in figure \ref{fig:subobj2}.)
  • Figure 2: Architecture: The diagram shows the two parallel experimental methods we adopted. In the top experiment, the tables are linearized (section \ref{linearization}). The metadata comprises the genre of the table, the company or district name, the time, and additional details. Different types of prefixes are given based on the context (section \ref{prefixtypes}). Different T5 models are used with different prefix types. In the bottom experiment, some popular LLMs are prompted. A system prompt is generated by formally defining the task. In the few-shot settings, multiple examples are given along with the metadata and reference text. Finally, the texts generated by all of these methods are compared and analysed.
  • Figure 3: Example of texts generated by T5-large (pf-ct) and GPT 3.5 (3-shot): In the figure the given table and reference text (ground truth) are taken from the Ta2TS dataset and at the bottom the generated texts are shown.
  • Figure 4: Generating text with subjectivity: The table shows different teams and their relative performances in a tournament. The table is described by the reference text, enriched with subjectivity. It is an example instance table from the sports domain of the Ta2TS dataset. In the table, the abbreviations denote the following: M: Matches; W: Wins; L: Losses; T: Ties; PT: Points.
  • Figure 5: Generating text with subjectivity: The table contains data on the weather forecast in the district of Kolkata from 6th February 2024 to 11th February 2024. The subjective reference text describes the table. This example data instance is taken from the Ta2TS dataset.
  • ...and 3 more figures
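
The two pipelines described in the Figure 2 caption (linearized table plus metadata and prefix for T5, and a system prompt plus few-shot examples for LLMs) can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: the `<row>`, `<meta>`, and `<table>` markers, the function names, the prefix string, and the sample rows are all hypothetical, and the paper's actual linearization and prefix formats may differ.

```python
from typing import Dict, List, Sequence, Tuple


def linearize_table(header: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Flatten a table into one string, row by row, pairing each cell with
    its column name. The <row N> marker is an assumed format."""
    parts = []
    for r, row in enumerate(rows, start=1):
        cells = " | ".join(f"{col}: {val}" for col, val in zip(header, row))
        parts.append(f"<row {r}> {cells}")
    return " ".join(parts)


def build_t5_input(prefix: str, metadata: Dict[str, str],
                   header: Sequence[str], rows: Sequence[Sequence[object]]) -> str:
    """Prepend a task prefix and the metadata (genre, name, time, ...) to the
    linearized table. The layout is an assumption for illustration."""
    meta = " ; ".join(f"{key}: {value}" for key, value in metadata.items())
    return f"{prefix} <meta> {meta} <table> {linearize_table(header, rows)}"


def build_few_shot_prompt(system: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a system instruction, k worked (table, reference text) pairs,
    and the query table whose description the LLM should complete."""
    blocks = [system]
    for table_text, reference in examples:
        blocks.append(f"Table: {table_text}\nText: {reference}")
    blocks.append(f"Table: {query}\nText:")
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Invented sample rows using the column abbreviations from Figure 4.
    header = ["Team", "M", "W", "L", "T", "PT"]
    rows = [["Team A", 5, 4, 1, 0, 8], ["Team B", 5, 2, 3, 0, 4]]
    metadata = {"genre": "sports"}
    print(build_t5_input("describe table with subjectivity:", metadata, header, rows))
```

With k example pairs passed to `build_few_shot_prompt`, the same helper covers the zero-shot case (an empty list) and the 3-shot setting mentioned in the summary.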