FeatureFinder Download and Help

Opinion mining has been an active research area in recent years. The basic task is to extract people's opinions on the features of an entity. For example, the sentence "I love the GPS function of Motorola Droid" expresses a positive opinion on the "GPS function" of the Motorola phone; "GPS function" is the feature.

This program extracts features (aspects) from product reviews. The approach is based on the following paper.

"Extracting and Ranking Product Features in Opinion Documents", In Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010)


1. Executable (.jar)

We provide only the executable (*.jar) version of the system (without source code). The program is free for scientific use. If you want the source code or want to use the software for commercial purposes, please contact me.


2. Download and install

    1). Download the program here

    2). Unzip the files to a directory. There are 2 files and 2 directories: "FeatureFinder.jar" is the executable file, "sample_car.txt" is a sample input file, and "lib" and "config" are two system directories.

    3). You have to install WordNet 2.0 before using the program. The install directory should be "C:\WordNet\2.0" (alternatively, change the path setting in "file_properties.xml" under the "config" directory).


3. How to use

Open a DOS window (Command Prompt) on your PC, go to the program directory, and run the system from there with the following command:

 java -jar FeatureFinder.jar -i inputName -o outputName [-Option value] [-t topicWord]

 -i : the input file.

 -o : the output file (the user has to specify the output file name).

 -t : optional. A topic word for the reviews (e.g. the topic word for car reviews is the word "car"). If the user does not provide a word for this parameter, the system finds it automatically.


e.g. "java -jar FeatureFinder.jar -i sample_car.txt -o features.txt -t car"
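
If you want to call the program from a script instead of typing the command by hand, the command line above can be assembled as in the following minimal sketch. The function names build_command and run_feature_finder are hypothetical helpers written for this example (they are not part of FeatureFinder), and the sketch assumes that "java" is on the PATH and that the script is started from the program directory.

```python
import subprocess


def build_command(input_file, output_file, topic=None, jar="FeatureFinder.jar"):
    """Build the argument list for the command line described above.

    Only the -i, -o and -t parameters documented in this help page are used.
    """
    cmd = ["java", "-jar", jar, "-i", input_file, "-o", output_file]
    if topic is not None:
        # -t is optional; when it is omitted the program picks the topic word itself.
        cmd += ["-t", topic]
    return cmd


def run_feature_finder(input_file, output_file, topic=None):
    """Run the jar and raise CalledProcessError if it exits with an error.

    Assumption: the current directory is the program directory.
    """
    subprocess.run(build_command(input_file, output_file, topic), check=True)


if __name__ == "__main__":
    # Prints the command from the example above without running it.
    print(" ".join(build_command("sample_car.txt", "features.txt", "car")))
```

Passing the arguments as a list avoids quoting problems when file names contain spaces.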


4. Input format

The input file should contain POS-tagged text in the following format.

  "Lower JJR the DT cost NN of IN the DT navigation NN system NN . ."

Each word is followed by its Part of Speech (POS) tag, e.g. the word "cost" is followed by its tag "NN".
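
The following minimal sketch shows one way to write and check text in this format. It assumes that tokens are separated by whitespace and strictly alternate between word and tag, as in the example line; format_tagged and parse_tagged_line are hypothetical helper names, and the POS tags themselves have to come from a tagger of your choice.

```python
def format_tagged(pairs):
    """Turn [(word, tag), ...] into a 'word TAG word TAG ...' line."""
    return " ".join(f"{word} {tag}" for word, tag in pairs)


def parse_tagged_line(line):
    """Split a 'word TAG word TAG ...' line back into (word, tag) pairs."""
    tokens = line.split()
    if len(tokens) % 2 != 0:
        # An odd token count means some word is missing its tag (or vice versa).
        raise ValueError("expected alternating word/tag tokens")
    return list(zip(tokens[0::2], tokens[1::2]))


if __name__ == "__main__":
    line = "Lower JJR the DT cost NN of IN the DT navigation NN system NN . ."
    pairs = parse_tagged_line(line)
    print(pairs[2])                      # the pair for the word "cost"
    print(format_tagged(pairs) == line)  # round trip reproduces the line
```

Parsing a line before passing the file to the program is a cheap way to catch tokens that lost their tag.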


5. Output format

The output format for the output file is as follows.



handling 35

mpg 81

cost , costs 30

engine 57



The extracted features are ranked by importance (please refer to the paper for details). The plural form and base form of a word are grouped together (e.g. "cost" and "costs"), and the number is the word frequency.
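
To post-process the output file, it can be read back as in the following minimal sketch. It assumes that every non-empty line ends with the frequency and that grouped word forms are separated by commas, as in the sample above; parse_feature_line and parse_features are hypothetical helper names.

```python
def parse_feature_line(line):
    """Parse one output line such as 'cost , costs 30' into (forms, frequency)."""
    # The frequency is the last whitespace-separated token on the line.
    name_part, count = line.rsplit(None, 1)
    # Grouped forms (base form and plural) are separated by commas.
    forms = [form.strip() for form in name_part.split(",")]
    return forms, int(count)


def parse_features(text):
    """Parse the whole output text, skipping blank lines and keeping the ranking order."""
    return [parse_feature_line(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = "handling 35\n\nmpg 81\n\ncost , costs 30\n\nengine 57\n"
    for forms, freq in parse_features(sample):
        print(forms, freq)
```

The list keeps the order of the file, so the first entry is the feature the program ranked highest.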



Last updated   December 30, 2010