Thumbs up? Sentiment Classification using Machine Learning Techniques
Bo Pang, Lillian Lee, Shivakumar Vaithyanathan
TL;DR
The paper investigates automatic sentiment classification of online text, focusing on positive versus negative movie reviews, and evaluates supervised machine learning approaches on an IMDb-derived corpus to assess how difficult sentiment classification is relative to topic classification. It compares three standard bag-of-features classifiers (Naive Bayes, Maximum Entropy, and Support Vector Machines) on a shared feature representation, using 700 positive and 700 negative reviews, negation tagging, unigram and bigram features, and three-fold cross-validation. The results show that the supervised methods outperform the baselines, with SVMs tending to perform best. Unigram presence features are the most effective; bigrams generally provide little or no gain and can hurt; part-of-speech (POS) features offer only modest gains for Naive Bayes and can degrade SVM performance. The authors argue that sentiment is expressed more subtly than topic, highlight the value of corpus-driven features and negation handling, and suggest future work on discourse- and sentence-level analysis to further improve sentiment detection.
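To make the negation tagging and unigram presence features mentioned above concrete, here is a minimal Python sketch. It assumes the negation tag applies to every word between a negation cue and the next punctuation mark. The cue list, the `NOT_` prefix spelling, the tokenizer, and the function names `tag_negation` and `presence_features` are illustrative assumptions, not details taken from this summary.

```python
import re

# Assumed negation cues; this summary does not list the ones actually used.
NEGATIONS = {"not", "no", "never", "isn't", "didn't", "don't",
             "doesn't", "wasn't", "can't", "won't"}

# Simple illustrative tokenizer: lowercase words (with an optional
# ASCII-apostrophe suffix such as "n't") and single punctuation marks.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?|[.,;:!?]")
PUNCT = re.compile(r"^[.,;:!?]$")


def tag_negation(text):
    """Prefix NOT_ to each word between a negation cue and the next punctuation mark."""
    tagged = []
    negated = False
    for tok in TOKEN.findall(text.lower()):
        if PUNCT.match(tok):
            negated = False  # punctuation closes the negation scope
            continue         # punctuation itself is not kept as a feature
        tagged.append("NOT_" + tok if negated else tok)
        if tok in NEGATIONS:
            negated = True   # start tagging from the next word onward
    return tagged


def presence_features(tokens):
    """Map each distinct unigram to 1 (presence), ignoring how often it occurs."""
    return {tok: 1 for tok in tokens}


tokens = tag_negation("I didn't like this movie, but the ending was good.")
features = presence_features(tokens)
```

With this sketch, "like" inside the negated clause becomes the separate feature `NOT_like`, so a classifier can weight it differently from plain `like`. A presence dictionary records only whether a word occurred, which is the representation the summary reports as most effective.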
Abstract
We consider the problem of classifying documents not by topic, but by overall sentiment, e.g., determining whether a review is positive or negative. Using movie reviews as data, we find that standard machine learning techniques definitively outperform human-produced baselines. However, the three machine learning methods we employed (Naive Bayes, maximum entropy classification, and support vector machines) do not perform as well on sentiment classification as on traditional topic-based categorization. We conclude by examining factors that make the sentiment classification problem more challenging.
