InsightNet: Structured Insight Mining from Customer Feedback

Sandeep Sricharan Mukku, Manan Soni, Jitenkumar Rana, Chetan Aggarwal, Promod Yenigalla, Rashmi Patange, Shyam Mohan

TL;DR

InsightNet introduces a semi-supervised, multi-task framework for extracting structured insights from customer reviews by jointly identifying granular topics, polarities, and verbatim segments, while AutoTaxonomy provides a hierarchical topic taxonomy and SegmentNet generates weakly labeled data to fine-tune a generative model. The approach leverages decomposed prompting, semantic similarity, and post-processing to produce coherent, non-redundant outputs that cover both seen and unseen topics. Empirical results across 43 categories show an 11% F1-score improvement over state-of-the-art baselines and an 85% F1-score on topic classification, with SegmentNet enabling competitive data labeling and AmaT5 pre-training boosting transferability. The work demonstrates strong scalability and practical impact for structured insight mining, with potential extensions to multilingual and multimodal settings.

Abstract

We propose InsightNet, a novel approach for the automated extraction of structured insights from customer reviews. Our end-to-end machine learning framework is designed to overcome the limitations of current solutions, including the absence of structure for identified topics, non-standard aspect names, and the lack of abundant training data. The proposed solution builds a semi-supervised multi-level taxonomy from raw reviews, uses a semantic similarity heuristic to generate labelled data, and employs a multi-task insight extraction architecture obtained by fine-tuning an LLM. InsightNet identifies granular, actionable topics along with the customer sentiment and verbatim text for each topic. Evaluations on real-world customer review data show that InsightNet performs better than existing solutions in terms of structure, hierarchy and completeness. We empirically demonstrate that InsightNet outperforms the current state-of-the-art methods in multi-label topic classification, achieving an F1 score of 0.85, an improvement of 11% in F1 score over the previous best results. Additionally, InsightNet generalises well to unseen aspects and suggests new topics to be added to the taxonomy.
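To make the idea of generating labelled data with a semantic similarity heuristic more concrete, the following is a minimal sketch and not the authors' SegmentNet implementation, whose details are not given in this section. It assumes a toy taxonomy (`TAXONOMY`), a bag-of-words vector as a stand-in for a real sentence encoder, and an arbitrary similarity threshold; the names `embed`, `cosine`, and `weak_label` are hypothetical. Polarity labelling is omitted.

```python
import math
import re
from collections import Counter

# Hypothetical toy taxonomy (topic -> short description); not taken from the paper.
TAXONOMY = {
    "battery life": "battery drains quickly charge does not last",
    "delivery delay": "package arrived late shipping delivery took long",
    "sound quality": "audio sound is clear bass speakers quality",
}


def embed(text):
    """Bag-of-words counts; an assumed stand-in for a real sentence encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def weak_label(review, taxonomy=TAXONOMY, threshold=0.3):
    """Return (segment, topic, score) for each segment whose best match clears the threshold."""
    labels = []
    for segment in re.split(r"[.!?;]+", review):
        segment = segment.strip()
        if not segment:
            continue
        seg_vec = embed(segment)
        # Pick the taxonomy topic whose description is most similar to the segment.
        topic, score = max(
            ((name, cosine(seg_vec, embed(desc))) for name, desc in taxonomy.items()),
            key=lambda pair: pair[1],
        )
        if score >= threshold:  # threshold value is an arbitrary assumption
            labels.append((segment, topic, round(score, 3)))
    return labels


review = "The battery drains quickly. Package arrived late! I like the color."
for row in weak_label(review):
    print(row)
```

In this sketch, segments that match no topic well enough (here, the remark about the color) are left unlabelled. Such unmatched segments are where a system like the one described could suggest new topics for the taxonomy.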

Paper Structure (36 sections, 11 equations, 5 figures, 8 tables, 5 algorithms)

Figures (5)

  • Figure 1: Decomposed Sequential Prompting - InsightNet
  • Figure 2: InsightNet Prompting
  • Figure 3: SegmentNet pipeline
  • Figure 4: SegmentNet Data Ablation
  • Figure 5: Heavy tailed distribution of topics