FeatureFinder Download and Help
Opinion mining has been an active research area in recent years. The basic task is to extract people's opinions on the features of an entity. For example, the sentence "I love the GPS function of Motorola Droid" expresses a positive opinion on the "GPS function" of the Motorola phone. Here, "GPS function" is the feature.
This program extracts features (aspects) from product reviews. The approach is based on the following paper.
1. Executable (.jar)
We only provide an executable (*.jar) version of the system (without source). The program is free for scientific use. If you want the source code or want to use the software for commercial purposes, please contact me.
2. Download and install
1). Download the program here
2). Unzip the files to a directory. There are 2 files and 2 directories. "FeatureFinder.jar" is the executable file. "sample_car.txt" is the sample input file. "lib" and "config" are two system directories.
3). You have to install WordNet 2.0 before using the program. The install directory should be "C:\WordNet\2.0" (alternatively, you can change the path setting in "file_properties.xml" under the "config" directory).
3. How to use
java -jar FeatureFinder.jar -i inputName -o outputName [-Option value] [-t topicWord]
-i: the input file.
-o: the output file (the user has to specify the output file name).
-t: this parameter is optional. It gives a topic word for the reviews (e.g. the topic word for a car review is the word "car"). If the user does not provide a word for this parameter, the system finds it automatically.
e.g. "java -jar FeatureFinder.jar -i sample_car.txt -o features.txt -t car"
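If you prefer to drive the program from a script, the command above can be assembled and launched along these lines. This is only a minimal sketch: the helper names build_command and run_feature_finder are hypothetical, and it assumes that Java is on the PATH and that "FeatureFinder.jar" sits in the current working directory. Only the -i, -o and -t parameters described above are used.

```python
import subprocess


def build_command(input_path, output_path, topic=None, jar="FeatureFinder.jar"):
    """Assemble the argument list for the FeatureFinder command line.

    Hypothetical helper: it only uses the -i, -o and -t parameters
    documented in this help page.
    """
    cmd = ["java", "-jar", jar, "-i", input_path, "-o", output_path]
    if topic is not None:
        # -t is optional; when omitted the program picks the topic word itself.
        cmd += ["-t", topic]
    return cmd


def run_feature_finder(input_path, output_path, topic=None):
    """Run FeatureFinder and raise CalledProcessError on a non-zero exit code.

    Assumes Java is on the PATH and the jar is in the working directory.
    """
    subprocess.run(build_command(input_path, output_path, topic), check=True)
```

For example, run_feature_finder("sample_car.txt", "features.txt", topic="car") would launch the same command as the example line above.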
4. Input format
The input file must contain POS-tagged text in which every token is followed by its tag, as follows.
"Lower JJR the DT cost NN of IN the DT navigation NN system NN . ."
For example, the word "cost" is followed by its corresponding part-of-speech (POS) tag "NN".
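To check or produce input in this layout, the alternating token/tag structure can be handled as in the following minimal sketch. The function names are hypothetical, and the sketch assumes that tokens and tags are separated by single whitespace and that no token contains whitespace itself; the program's actual parser may be stricter or more lenient.

```python
def parse_tagged_line(line):
    """Split an alternating 'word TAG word TAG ...' line into (word, tag) pairs."""
    tokens = line.split()
    if len(tokens) % 2 != 0:
        # Every word needs a tag, so the token count must be even.
        raise ValueError("expected an even number of tokens (word/tag pairs)")
    return list(zip(tokens[0::2], tokens[1::2]))


def format_tagged_line(pairs):
    """Turn (word, tag) pairs back into the alternating input layout."""
    return " ".join(f"{word} {tag}" for word, tag in pairs)


sample = "Lower JJR the DT cost NN of IN the DT navigation NN system NN . ."
pairs = parse_tagged_line(sample)
print(pairs[2])  # the pair for the word "cost"
```

format_tagged_line can be used to write the output of any POS tagger that yields (word, tag) pairs into a file that follows the layout above.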
5. Output format
The output file has the following format.
cost , costs 30
The extracted features are ranked by importance (please refer to the paper for details). The basic form and the plural form of a feature are grouped together (e.g. "cost" and "costs"), and the number is the word frequency.
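For post-processing, an output line such as the one above can be read back as in this minimal sketch. The function name parse_feature_line is hypothetical, and the sketch assumes that the last whitespace-separated field is the frequency and that the grouped word forms are separated by commas; only the single sample line above is known, so other lines may need different handling.

```python
def parse_feature_line(line):
    """Split an output line into its grouped word forms and its frequency.

    Assumes the layout of the sample line: comma-separated forms,
    then a space, then an integer frequency.
    """
    text, _, count = line.strip().rpartition(" ")
    forms = [form.strip() for form in text.split(",") if form.strip()]
    if not forms:
        raise ValueError("no feature found in line: %r" % line)
    # int() raises ValueError if the last field is not a number.
    return forms, int(count)


forms, frequency = parse_feature_line("cost , costs 30")
print(forms, frequency)
```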
Last updated December 30, 2010