TrICy: Trigger-guided Data-to-text Generation with Intent aware Attention-Copy

Vibhav Agarwal; Sourav Ghosh; Harichandana BSS; Himanshu Arora; Barath Raj Kandur Raja

TL;DR

TrICy introduces a lightweight, on-device data-to-text generation framework with a dual-encoder architecture that conditions output on intent and optional trigger cues while employing an attention-based copy mechanism. The model achieves state-of-the-art BLEU on E2E NLG and competitive results on WebNLG with a fraction of the parameters of large PLMs, enabled by an integrated generate-and-copy decoding strategy. A trigger-ratio optimization technique further boosts generation quality, and user trials indicate strong practical usefulness for personalized, context-aware responses. This work demonstrates a viable path toward efficient, edge-friendly D2T systems that leverage structured data, intent signals, and user-provided triggers to produce diverse, faithful outputs. It points to broader implications for on-device assistants, synthetic data generation, and accessibility tools, while noting limitations in multilingual transfer and potential hallucination risk.

Abstract

Data-to-text (D2T) generation is a crucial task in many natural language understanding (NLU) applications and forms the foundation of task-oriented dialog systems. In the context of conversational AI solutions that can work directly with local data on the user's device, architectures utilizing large pre-trained language models (PLMs) are impractical for on-device deployment due to a high memory footprint. To this end, we propose TrICy, a novel lightweight framework for an enhanced D2T task that generates text sequences based on the intent in context and may further be guided by user-provided triggers. We leverage an attention-copy mechanism to predict out-of-vocabulary (OOV) words accurately. Performance analyses on the E2E NLG dataset (BLEU: 66.43%, ROUGE-L: 70.14%), the WebNLG dataset (BLEU: Seen 64.08%, Unseen 52.35%), and our Custom dataset related to text messaging applications showcase our architecture's effectiveness. Moreover, we show that by leveraging an optional trigger input, data-to-text generation quality increases significantly and achieves a new SOTA score of 69.29% BLEU for E2E NLG. Furthermore, our analyses show that TrICy achieves at least 24% and 3% improvement in BLEU and METEOR respectively over LLMs like GPT-3, ChatGPT, and Llama 2. We also demonstrate that in some scenarios, performance improvement due to triggers is observed even when they are absent in training.
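To make concrete how a copy mechanism lets a decoder emit out-of-vocabulary words, here is a minimal sketch in the general style of generate-and-copy decoding. It is not the authors' implementation: the function name `copy_augmented_distribution`, the toy vocabulary, and all scores are hypothetical, and the scores stand in for what a trained decoder would produce at one step. The idea shown is only that generate scores (over a fixed vocabulary) and copy scores (over source tokens) are normalized jointly, so a source token missing from the vocabulary can still receive probability mass.

```python
import math


def copy_augmented_distribution(vocab, gen_scores, source_tokens, copy_scores):
    """Merge generate and copy scores into one distribution over tokens.

    Hypothetical helper for illustration only. Generate scores align with
    `vocab`; copy scores align with `source_tokens` (one score per position).
    """
    if len(vocab) != len(gen_scores) or len(source_tokens) != len(copy_scores):
        raise ValueError("scores must align with vocab / source tokens")

    # Joint softmax over both modes (shifted by the max for numerical stability).
    all_scores = list(gen_scores) + list(copy_scores)
    top = max(all_scores)
    exps = [math.exp(s - top) for s in all_scores]
    total = sum(exps)

    # A token reachable by both modes (or repeated in the source) sums its mass.
    probs = {}
    for token, e in zip(list(vocab) + list(source_tokens), exps):
        probs[token] = probs.get(token, 0.0) + e / total
    return probs


# Toy inputs (assumed values, not taken from the paper).
vocab = ["<unk>", "the", "restaurant", "serves"]
gen_scores = [0.1, 1.0, 0.5, 0.2]
source_tokens = ["Zizzi", "serves", "Italian"]  # "Zizzi" is out of vocabulary
copy_scores = [3.0, 0.2, 1.0]

dist = copy_augmented_distribution(vocab, gen_scores, source_tokens, copy_scores)
best = max(dist, key=dist.get)
print(best, round(dist[best], 3))
```

With these assumed scores, the most probable token is the out-of-vocabulary source word, which a generate-only decoder could output only as `<unk>`.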

Paper Structure (32 sections, 11 equations, 14 figures, 8 tables)

Introduction
Related Work
Sequence-to-Sequence paradigm
Copying Mechanism in Seq2Seq learning
Data-to-text Generation
Model Description
Encoders
Bahdanau Attentive Read
Decoding with Generate and Copy
Decoder state
Prediction
Trigger input, K
Experimental Setup
Dataset
E2E NLG (Novikova et al., 2017)
...and 17 more sections

Figures (14)

Figure 1: Context-aware D2T generation: based on user application data, message intent, and trigger text. Generated text may include natural language responses or markup text for downstream use cases.
Figure 3: Trigger-driven generation: User input is used as trigger to generate inline phrase completion.
Figure 4: Architecture of the proposed TrICy Model
Figure 5: Effect of our architectural choices on model parameters and vocabulary. For all models, brighter shades denote decoder parameters, stacked on top of their encoder counterparts in darker shades.
Figure 7: Determination of ${}_{t}r_{\mathcal{K}}^{*}$: For models trained with varying ${}_{t}r_{\mathcal{K}}$ ratios with evaluation sets -- (i) $0K$ (${}_{e}r_{\mathcal{K}} = 0.0$), and (ii) $+K$ (${}_{e}r_{\mathcal{K}} = 1.0$), the weighted mean graphs of (i) and (ii) are denoted by ${}^{w\%}\mu'K$, for heuristic weight $w\%$.
...and 9 more figures
