Opinion Mining, Sentiment Analysis, and Opinion Spam Detection

Feature Based Opinion Mining and Summarization (With Annotated Datasets)
Detecting Fake Reviews


Textbook: Web Data Mining - Exploring Hyperlinks, Contents and Usage Data, by Bing Liu (Chapter 11, Opinion Mining).

See "Feature-Based Opinion Mining" in Microsoft Live Search (9/28/2007) and the new Bing Search Engine

NLP Handbook Chapter (36 pages): Sentiment Analysis and Subjectivity

Tutorial: Opinion Mining and Summarization - Sentiment Analysis, WWW-2008.

Some Talks and Tutorials on the Topic

1. Introduction

This work is in the general area of sentiment analysis, opinion extraction (or opinion mining), and feature-based opinion summarization from user-generated content (or user-generated media) on the Web, e.g., reviews, forum and group discussions, and blogs. The area is also closely related to sentiment classification. Our current work is in two main areas, which reflect two kinds of opinions (or evaluations): opinions expressed in customer reviews (Section 2) and comparisons expressed in comparative sentences (Section 3).

Recently, we also started to work on review and opinion spam analysis and detection, i.e., detecting untruthful or fake reviews. See the papers [Jindal and Liu, WWW-2007 and WSDM-2008].

Acknowledgement: This project is partially funded by Microsoft Corporation.

2. Opinion mining or extraction from customer reviews

It is common practice for merchants selling products on the Web to ask their customers to review the products and associated services. As e-commerce becomes more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in the hundreds or even thousands. This makes it difficult for a potential customer to read them all before deciding whether to buy the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset of the original sentences, or by rewriting some of them, to capture the main points as in classic text summarization. As researchers, we always want to have an abstraction of the problem. Here it is.

Abstraction of the problem: Feature-based opinion summary of multiple reviews (KDD-04 and WWW-05)
Formal definitions can be found in my book "Web Data Mining". They are based on several of our papers in 2004 and 2005. The abstraction provides a model of reviews (or online opinions), describes what should be extracted from opinion sources (e.g., consumer reviews, forums, and blogs) and how the results may be organized and presented to the user. The main mining tasks are:

We have proposed several techniques to perform these tasks.
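As an illustration only, the shape of a feature-based opinion summary can be sketched in Python. The sketch assumes that the product features and the polarity of each opinion have already been extracted from the reviews; it does not implement the extraction techniques from our papers. The names `FeatureSummary`, `summarize`, and `format_summary`, as well as the sample sentences, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class FeatureSummary:
    """Opinion sentences about one product feature, split by polarity."""
    positive: list = field(default_factory=list)
    negative: list = field(default_factory=list)


def summarize(opinions):
    """Group (feature, polarity, sentence) triples by product feature.

    Assumption: each triple was produced by an earlier mining step,
    with polarity equal to "positive" or "negative".
    """
    summary = defaultdict(FeatureSummary)
    for feature, polarity, sentence in opinions:
        if polarity == "positive":
            summary[feature].positive.append(sentence)
        elif polarity == "negative":
            summary[feature].negative.append(sentence)
        else:
            raise ValueError(f"unknown polarity: {polarity!r}")
    return dict(summary)


def format_summary(summary):
    """Render one line of positive/negative counts per feature."""
    lines = []
    for feature in sorted(summary):
        entry = summary[feature]
        lines.append(
            f"{feature}: {len(entry.positive)} positive, "
            f"{len(entry.negative)} negative"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical, hand-labeled opinions about a digital camera.
    sample = [
        ("picture quality", "positive", "The pictures are razor sharp."),
        ("picture quality", "positive", "Great photos even indoors."),
        ("battery life", "negative", "The battery dies within an hour."),
    ]
    print(format_summary(summarize(sample)))
```

Keeping the original sentences alongside the counts lets a reader drill down from the summary to the supporting review text.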

3. Comparative sentence and relation mining

A comparative sentence usually expresses an ordering relation between two sets of entities with respect to some common features. For example, the comparative sentence "Canon's optics are better than those of Sony and Nikon" expresses the comparative relation: (better, {optics}, {Canon}, {Sony, Nikon}). Comparative sentences use different language constructs from typical opinion sentences (e.g., "Canon's optics are great").

Abstraction of the problem: Extraction of comparative relations, i.e., "who is better than who on what". Again, the formal definitions can be found in my book "Web Data Mining". The main mining tasks are:

This problem has many applications. For example, a product manufacturer may want to know customer opinions of its products in comparison with those of its competitors.
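The comparative relation in the Canon example above can be written down as a small data structure. The following Python sketch only shows one possible representation of an already extracted relation; the class name `ComparativeRelation`, its field names, and the `pairs` helper are assumptions made for illustration, not the formal definitions from the book.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparativeRelation:
    """One relation: (relation word, features, entity set 1, entity set 2)."""
    relation_word: str
    features: frozenset
    entities_1: frozenset  # entities on the left-hand side of the comparison
    entities_2: frozenset  # entities they are compared against

    def pairs(self):
        """Expand the set-level relation into "who / relation / whom / on what" tuples."""
        return sorted(
            (left, self.relation_word, right, feature)
            for left in self.entities_1
            for right in self.entities_2
            for feature in self.features
        )


if __name__ == "__main__":
    # "Canon's optics are better than those of Sony and Nikon"
    relation = ComparativeRelation(
        relation_word="better",
        features=frozenset({"optics"}),
        entities_1=frozenset({"Canon"}),
        entities_2=frozenset({"Sony", "Nikon"}),
    )
    for left, word, right, feature in relation.pairs():
        print(f"{left} is {word} than {right} on {feature}")
```

Expanding the relation into pairwise tuples makes it straightforward to count, for a given feature, how often one product is preferred over each competitor.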

Data Sets

Publications

  1. Ramanathan Narayanan, Bing Liu and Alok Choudhary. "Sentiment Analysis of Conditional Sentences." To appear in Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP-09). August 6-7, 2009. Singapore.

  2. Guang Qiu, Bing Liu, Jiajun Bu and Chun Chen. "Expanding Domain Sentiment Lexicon through Double Propagation." To appear in Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09), Pasadena, California, USA, July 11-17, 2009.

  3. Xiaowen Ding, Bing Liu and Lei Zhang. "Entity Discovery and Assignment for Opinion Mining Applications." To appear in Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-09, industrial track), June 28-July 1, 2009, Paris.

  4. Bing Liu. "Sentiment Analysis and Subjectivity." Invited Chapter for the Handbook of Natural Language Processing, Second Edition. To appear in Oct/Nov, 2009.

  5. Bing Liu. "Opinion Mining." Invited contribution to Encyclopedia of Database Systems, 2008.

  6. Murthy Ganapathibhotla and Bing Liu. "Mining Opinions in Comparative Sentences" To appear in Proceedings of the 22nd International Conference on Computational Linguistics (Coling-2008), Manchester, 18-22 August, 2008. [Ready Soon]

  7. Xiaowen Ding, Bing Liu and Philip S. Yu. "A Holistic Lexicon-Based Approach to Opinion Mining." Proceedings of First ACM International Conference on Web Search and Data Mining (WSDM-2008), Feb 11-12, 2008, Stanford University, Stanford, California, USA. [Ready Soon]

  8. Nitin Jindal and Bing Liu. "Opinion Spam and Analysis." Proceedings of First ACM International Conference on Web Search and Data Mining (WSDM-2008), Feb 11-12, 2008, Stanford University, Stanford, California, USA. [Ready Soon]

  9. Nitin Jindal and Bing Liu. "Review Spam Detection." In Proceedings of WWW-2007 (poster paper), May 8-12, Banff, Canada. [PDF]

  10. Xiaowen Ding and Bing Liu. "The Utility of Linguistic Rules in Opinion Mining." SIGIR-2007 (poster paper), 23-27 July 2007, Amsterdam. [PDF]

  11. Nitin Jindal and Bing Liu. "Identifying Comparative Sentences in Text Documents" Proceedings of the 29th Annual International ACM SIGIR Conference on Research & Development on Information Retrieval (SIGIR-06), Seattle 2006. [PDF]

  12. Nitin Jindal and Bing Liu. "Mining Comparative Sentences and Relations." Proceedings of 21st National Conference on Artificial Intelligence (AAAI-2006), July 16-20, 2006, Boston, Massachusetts, USA. [PDF]

  13. Bing Liu, Minqing Hu and Junsheng Cheng. "Opinion Observer: Analyzing and Comparing Opinions on the Web" Proceedings of the 14th international World Wide Web conference (WWW-2005), May 10-14, 2005, in Chiba, Japan. [PDF]

  14. Minqing Hu and Bing Liu. "Mining and summarizing customer reviews". Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD-2004, full paper), Seattle, Washington, USA, Aug 22-25, 2004. [PDF]

  15. Minqing Hu and Bing Liu. "Mining Opinion Features in Customer Reviews." Proceedings of Nineteenth National Conference on Artificial Intelligence (AAAI-2004), San Jose, USA, July 2004. [PDF]

Created on May 15, 2004 by Bing Liu and Minqing Hu.