A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts

Bo Pang, Lillian Lee

TL;DR

The paper improves sentiment polarity classification by isolating the subjective content of reviews. It introduces a minimum-cut graph formulation that combines per-sentence subjectivity scores with cross-sentence proximity constraints to produce compact subjectivity extracts. When these extracts are fed to Naive Bayes (NB) or support vector machine (SVM) polarity classifiers, accuracy is equal or better even though far fewer words are used, and context-aware graph cuts offer additional gains. The extracts therefore also serve as effective sentiment summaries of the original reviews.

Abstract

Sentiment analysis seeks to identify the viewpoint(s) underlying a text span; an example application is classifying a movie review as "thumbs up" or "thumbs down". To determine this sentiment polarity, we propose a novel machine-learning method that applies text-categorization techniques to just the subjective portions of the document. Extracting these portions can be implemented using efficient techniques for finding minimum cuts in graphs; this greatly facilitates incorporation of cross-sentence contextual constraints.

Paper Structure

This paper contains 10 sections, 2 equations, 5 figures.

Figures (5)

  • Figure 1: Polarity classification via subjectivity detection.
  • Figure 2: Graph for classifying three items. Brackets enclose example values; here, the individual scores happen to be probabilities. Based on individual scores alone, we would put $Y$ ("yes") in $C_1$, $N$ ("no") in $C_2$, and be undecided about $M$ ("maybe"). But the association scores favor cuts that put $Y$ and $M$ in the same class, as shown in the table. Thus, the minimum cut, indicated by the dashed line, places $M$ together with $Y$ in $C_1$.
  • Figure 3: Graph-cut-based creation of subjective extracts.
  • Figure 4: Accuracies using N-sentence extracts for NB (left) and SVM (right) default polarity classifiers.
  • Figure 5: Word preservation rate vs. accuracy, NB (left) and SVMs (right) as default polarity classifiers. Also indicated are results for some statistical significance tests.
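
The three-item example in the Figure 2 caption can be reproduced with a small brute-force sketch. It assumes the usual formulation in which a partition's cost is the sum of the individual scores each item gives up by landing in its class, plus the association scores of every pair the cut separates. The numeric values, the dictionary names `ind1`, `ind2`, `assoc`, and the helper functions are illustrative assumptions for this sketch, not values read off the figure.

```python
from itertools import product

# Hypothetical scores chosen for illustration only.
# ind1[x]: preference of item x for class C1; ind2[x]: preference for C2.
ind1 = {"Y": 0.8, "M": 0.5, "N": 0.1}
ind2 = {item: 1.0 - score for item, score in ind1.items()}

# assoc[(a, b)]: penalty paid when a and b end up in different classes.
assoc = {("Y", "M"): 1.0, ("Y", "N"): 0.1, ("M", "N"): 0.2}


def partition_cost(c1):
    """Cost of putting the items in c1 into C1 and all others into C2."""
    c1 = set(c1)
    c2 = set(ind1) - c1
    # Each item forfeits its score for the class it was NOT assigned to.
    cost = sum(ind2[x] for x in c1) + sum(ind1[x] for x in c2)
    # Add the association score of every pair split across the cut.
    cost += sum(w for (a, b), w in assoc.items() if (a in c1) != (b in c1))
    return cost


def best_partition():
    """Enumerate all 2^n partitions and return (cost, C1) with minimum cost."""
    items = sorted(ind1)
    candidates = []
    for mask in product([True, False], repeat=len(items)):
        c1 = frozenset(x for x, keep in zip(items, mask) if keep)
        candidates.append((partition_cost(c1), c1))
    return min(candidates, key=lambda pair: pair[0])


if __name__ == "__main__":
    cost, c1 = best_partition()
    print(sorted(c1), round(cost, 2))
```

With these assumed scores, the cheapest partition groups M with Y in C1, matching the behavior the caption describes: the strong Y-M association outweighs M's indifference. Exhaustive enumeration is exponential in the number of items and is only meant to make the cost function concrete; the abstract's point is that efficient minimum-cut algorithms find the same partition without enumeration.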

Theorems & Definitions (1)

  • Definition 1