Opinion Spam Detection: Detecting Fake Reviews and Reviewers

Many names: Spam Review, Fake Review, or Bogus Review
Opinion Spammer, Review Spammer, or Fake Reviewer
Deception, Deceptive Message

Introduction

It has become common practice for people to find and read opinions and reviews on the Web for many purposes. For example, someone who wants to buy a product typically goes to a merchant or review site (e.g., amazon.com) to read reviews from existing users of the product. If they see many positive reviews, they are very likely to buy the product. If they see many negative reviews, they will most likely choose another product. Positive opinions can therefore result in significant financial gain and/or fame for organizations and individuals. This, unfortunately, gives strong incentives for opinion spam.

Opinion Spam: Opinion spamming refers to "illegal" activities (e.g., writing fake reviews) that try to deliberately mislead readers or automated opinion mining and sentiment analysis systems. Spammers do this by giving undeserved positive opinions to some target entities in order to promote them, and/or by giving false negative opinions to other entities in order to damage their reputation. Opinion spam comes in many forms, e.g., fake reviews (also called bogus reviews), fake comments, fake blogs, fake social network postings, deceptions, and deceptive messages. Manually spotting such postings is very hard, but there are several pages on the Web (see below) that tell people how to spot fake reviews and deceptive messages.

To the best of our knowledge, our group was the first in academia to conduct research on detecting fake reviews and reviewers. Our first paper was published in 2007, and subsequent papers were published in 2008 and 2010. My textbook Web Data Mining also has a section in Chapter 11 discussing the issue (Springer, Second Edition, July 2011; First Edition, Dec 2006). The objective of our current project is to detect fake reviews. We have not worked on detecting other forms of opinion spam.

Fake review detection: We have used both supervised and unsupervised methods for the task. The three main types of features or signals are:
  1. Review contents: Words and other linguistic features
  2. Abnormal reviewer behaviors: Do you see anything wrong with the reviews of this person: Big John? What about after seeing the reviews of these two persons: Cletus and Jake? This is just one example of the atypical behaviors that our algorithm is able to discover.
  3. Product features: For example, product descriptions and sales ranks
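
The three signal types above can be sketched as one feature vector per review, which a supervised classifier could then consume. This is only an illustrative sketch, not the method used in our papers: the `Review` class, the function names, the choice of rating deviation as the behavior signal, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class Review:
    """Hypothetical review record used only for this example."""
    reviewer: str
    product: str
    rating: int  # star rating, 1 to 5
    text: str


def content_features(review):
    """Signal 1 (review contents): bag-of-words counts."""
    words = (w.strip(".,!?") for w in review.text.lower().split())
    return Counter(f"word={w}" for w in words if w)


def behavior_features(review, all_reviews):
    """Signal 2 (reviewer behavior): how far the reviewer's ratings
    deviate, on average, from each product's mean rating (an assumed,
    simplified behavior signal)."""
    ratings_by_product = {}
    for r in all_reviews:
        ratings_by_product.setdefault(r.product, []).append(r.rating)
    own = [r for r in all_reviews if r.reviewer == review.reviewer]
    deviations = [abs(r.rating - mean(ratings_by_product[r.product]))
                  for r in own]
    return {
        "reviewer_mean_rating_deviation": mean(deviations),
        "reviewer_review_count": len(own),
    }


def product_features(review, sales_rank):
    """Signal 3 (product features): here only the sales rank."""
    return {"product_sales_rank": sales_rank.get(review.product, 0)}


def featurize(review, all_reviews, sales_rank):
    """Merge the three signal types into one feature dictionary."""
    features = dict(content_features(review))
    features.update(behavior_features(review, all_reviews))
    features.update(product_features(review, sales_rank))
    return features


# Toy data, invented for illustration.
reviews = [
    Review("alice", "camera", 4, "Good camera, sharp pictures."),
    Review("bob", "camera", 4, "Works well."),
    Review("carol", "camera", 1, "Terrible! Terrible product."),
]
sales_rank = {"camera": 120}

features = featurize(reviews[2], reviews, sales_rank)
print(features)
```

A dictionary like this could be passed to any standard classifier once labeled examples are available. Without labels, the behavior features alone could be used to rank reviewers by how atypical they look.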

We can safely predict that as consumers, organizations, and businesses increasingly use opinions on the Web for their decision making, opinion spam will become worse and more sophisticated. Detecting spam reviews or opinions will become more and more critical. The situation is already quite bad. When I have time, I will write more about it. You can also have a look at our papers.

Acknowledgement: This project has been funded by Microsoft and Google.

Some Fake Review Cases in the News

Professional Fake Review Writing Services

How to Spot Fake Reviews Manually

Manipulating Social Media (sock puppets - fake identities - fake personas)

China's Internet "Water Army" (Shuijun) - Opinion Spammers

Differences from Web Spam and Email Spam

Types of Opinion Spam

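There are generally three types of spam reviews (Jindal and Liu WSDM-2008):
  1. Type 1 (untruthful opinions, i.e., fake or bogus reviews)
  2. Type 2 (reviews on brands only)
  3. Type 3 (non-reviews)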

Type 2 and Type 3 spam reviews are rare, but Type 1 spam reviews are widespread and very hard to detect. Some fake reviews are not very harmful, but others are very harmful. See details in (Jindal and Liu WSDM-2008) or Chapter 11 of my book Web Data Mining.

Data Sets

Publications

  1. Arjun Mukherjee, Bing Liu, Junhui Wang, Natalie Glance and Nitin Jindal. "Detecting Group Review Spam." WWW-2011 (poster paper), 2011.

  2. Nitin Jindal, Bing Liu and Ee-Peng Lim. "Finding Unusual Review Patterns Using Unexpected Rules." Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM-2010, short paper), Toronto, Canada, Oct 26 - 30, 2010.

  3. Ee-Peng Lim, Viet-An Nguyen, Nitin Jindal, Bing Liu and Hady Lauw. "Detecting Product Review Spammers using Rating Behaviors." Proceedings of the 19th ACM International Conference on Information and Knowledge Management (CIKM-2010, full paper), Toronto, Canada, Oct 26 - 30, 2010.

  4. Nitin Jindal and Bing Liu. "Opinion Spam and Analysis." Proceedings of the First ACM International Conference on Web Search and Data Mining (WSDM-2008), Stanford University, Stanford, California, USA, Feb 11-12, 2008.

  5. Nitin Jindal and Bing Liu. "Review Spam Detection." Proceedings of WWW-2007 (poster paper), Banff, Canada, May 8-12, 2007.

Created by Bing Liu, 2008.