From Web Content Mining to Natural Language Processing:
tutorial to be given at
ACL-2007,
Prague, Czech Republic, June 23-30, 2007. (Opinion mining is one of the topics.)
Opinion Mining and Search, Invited talk at Google, Pittsburgh,
Sept 29, 2006. (Similar talks were also given at Yahoo!, Microsoft, and Motorola Labs).
This work is in the general area of opinion extraction (or opinion mining)
and feature-based opinion summarization from user-generated content (or user-generated media) on the Web, e.g., reviews, forum and group discussions, and blogs. The area is also known as sentiment analysis and is closely
related to sentiment classification. Our current work is in two areas:
Opinion extraction or mining. Ex: what are positive and negative user opinions on Gmail?
Comparison extraction or mining. Ex: is Gmail better than Yahoo! mail?
1. Opinion extraction or mining from customer reviews
It is common practice for merchants selling products on the Web to ask their customers to review the products and associated services. As e-commerce becomes more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in the hundreds or even thousands. This makes it difficult for a potential customer to read them all in order to decide whether to buy the product. It also makes it difficult for the manufacturer of the product to keep track of and manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the product and the manufacturer normally produces many kinds of products.
In this research, we aim to mine and summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions, and whether those opinions are positive or negative. We do not summarize the reviews by selecting a subset of the original sentences, or by rewriting some of them, to capture the main points as in classic text summarization. As researchers, we always want to have an abstraction of the problem. Here it is.
Abstraction of the problem: Feature-based opinion summary of multiple reviews (KDD-04 and WWW-05)
mining product features that have been commented on by customers;
identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative (sentiment analysis);
summarizing the results.
We have proposed several techniques to perform these tasks.
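The three tasks above can be illustrated with a toy sketch. This is not the technique from the KDD-04 or WWW-05 papers: it assumes the product features are already known and uses a tiny hand-written opinion lexicon, and the names `FEATURES`, `POSITIVE`, `NEGATIVE` and `summarize` are hypothetical. It only shows what a feature-based summary looks like: for each feature, the positive and the negative opinion sentences.

```python
import re
from collections import defaultdict

# Hypothetical toy inputs for illustration; not taken from the papers.
FEATURES = {"battery", "picture", "zoom"}
POSITIVE = {"great", "good", "excellent", "amazing"}
NEGATIVE = {"poor", "bad", "short", "blurry"}


def split_sentences(review):
    """Naively split a review into sentences on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]


def sentence_orientation(tokens):
    """Return +1, -1 or 0 by comparing counts of positive and negative words."""
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return (score > 0) - (score < 0)


def summarize(reviews):
    """Build {feature: {"positive": [sentences], "negative": [sentences]}}."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for review in reviews:
        for sentence in split_sentences(review):
            tokens = re.findall(r"[a-z']+", sentence.lower())
            orientation = sentence_orientation(tokens)
            if orientation == 0:
                continue  # no opinion word found: not an opinion sentence here
            label = "positive" if orientation > 0 else "negative"
            # Attach the sentence to every known feature it mentions.
            for feature in FEATURES.intersection(tokens):
                summary[feature][label].append(sentence)
    return dict(summary)


if __name__ == "__main__":
    demo = ["The picture quality is great. The battery life is short!"]
    for feature, opinions in sorted(summarize(demo).items()):
        print(feature, len(opinions["positive"]), len(opinions["negative"]))
```

A real system has to discover the features from the reviews themselves and handle negation, context-dependent opinion words, and implicit features, none of which this sketch attempts.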
2. Comparative sentence and relation extraction
A comparative sentence expresses an ordering relation between two sets
of entities with respect to some common features. For example, the
comparative sentence "Canon's optics are better than those of Sony and
Nikon" expresses the comparative relation: (better, {optics}, {Canon},
{Sony, Nikon}). Comparative sentences use different language constructs
from typical opinion sentences (e.g., "Canon's optics are great").
Abstraction of the problem: Extraction of comparative relations, i.e., who is better than who on what (AAAI-06)
identify comparative sentences from texts, e.g., reviews, forum or blog postings, and news articles.
extract comparative relations from the identified comparative sentences.
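As a rough illustration of the target output, the sketch below represents a comparative relation as a (relation word, features, entity set 1, entity set 2) record and fills it from one fixed sentence template with a regular expression. This is an assumption made for illustration only, not the method of the SIGIR-06 or AAAI-06 papers; `ComparativeRelation`, `COMPARATIVE_WORDS` and `extract_relation` are hypothetical names, and the template covers only sentences of the form "X's F is/are better than those of Y and Z".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ComparativeRelation:
    relation: str          # the comparative word, e.g. "better"
    features: frozenset    # features being compared, e.g. {"optics"}
    entities_1: frozenset  # entities on the left of the comparison
    entities_2: frozenset  # entities on the right of the comparison


# Hypothetical keyword list; real comparatives are far more varied.
COMPARATIVE_WORDS = ("better", "worse")

# One hand-written template: "<E1>'s <feature> is/are <word> than [those of] <E2...>"
PATTERN = re.compile(
    r"(?P<e1>\w+)'s (?P<feature>\w+) (?:is|are) (?P<rel>"
    + "|".join(COMPARATIVE_WORDS)
    + r") than (?:that of |those of )?(?P<e2>.+)"
)


def extract_relation(sentence):
    """Return a ComparativeRelation, or None if the template does not match."""
    match = PATTERN.search(sentence.strip().rstrip("."))
    if match is None:
        return None  # treated as a non-comparative sentence in this sketch
    # Split "Sony and Nikon" or "Sony, Nikon, and Fuji" into entity names.
    others = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", match.group("e2"))
    return ComparativeRelation(
        relation=match.group("rel"),
        features=frozenset({match.group("feature")}),
        entities_1=frozenset({match.group("e1")}),
        entities_2=frozenset(o for o in others if o),
    )


if __name__ == "__main__":
    print(extract_relation("Canon's optics are better than those of Sony and Nikon."))
    print(extract_relation("Canon's optics are great."))
```

Returning `None` for a sentence that does not fit the template doubles as a very crude version of the first step (identifying comparative sentences); a single pattern like this would miss most comparative constructs found in real text.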
This problem has many applications. For example, a product manufacturer may want to know customer opinions of its products in comparison with those of its competitors.
If you are interested in the comparative sentence dataset associated with papers 1 and 2 below, please drop us an email.
Publications
Nitin Jindal and Bing Liu. "Identifying Comparative Sentences in Text Documents" Proceedings of the 29th Annual International ACM SIGIR Conference on Research & Development on Information Retrieval (SIGIR-06), Seattle 2006. [PDF]
Nitin Jindal and Bing Liu. "Mining Comparative Sentences and Relations." Proceedings of 21st National Conference on Artificial Intelligence (AAAI-2006), July 16-20, 2006, Boston, Massachusetts, USA. [PDF]
Bing Liu, Minqing Hu and Junsheng Cheng. "Opinion Observer: Analyzing and Comparing Opinions on the Web" Proceedings of the 14th international World Wide Web conference (WWW-2005), May 10-14, 2005, in Chiba, Japan. [PDF]
Minqing Hu and Bing Liu. "Mining and summarizing customer reviews".
Proceedings of the ACM SIGKDD International Conference on
Knowledge Discovery & Data Mining (KDD-2004, full paper), Seattle,
Washington, USA, Aug 22-25, 2004. [PDF]
Minqing Hu and Bing Liu. "Mining Opinion Features in Customer
Reviews." Proceedings of Nineteenth National Conference on Artificial Intelligence (AAAI-2004), San Jose, USA, July 2004. [PDF]
Created on May 15, 2004 by Bing Liu and Minqing Hu.