CS582 - Information Retrieval

Textbook

Principles of Database Query Processing for Advanced Applications. By C. Yu and W. Meng, Morgan Kaufmann, 1998

Description

This course covers techniques for retrieving text, images, and video.

Lecture Notes

Term Project

Form Information Extraction System - Many web sites use an HTML form as the interface through which users submit queries. Automatic information collection becomes much easier if we can recognize each field of such a form and feed the user's query into the corresponding input field. However, HTML syntax is flexible and loosely structured, so different web sites build their forms with different layouts and compositions, which makes field identification difficult. In this project we propose a solution to form identification that uses both the structural and the semantic information in a form, and based on this solution we construct an architecture with a user feedback mechanism to achieve our identification goal. (final report)
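
To make the idea concrete, the following is a minimal sketch in Python and not the project's actual implementation. It assumes that "structural information" can be approximated by the form's tags and attributes, and "semantic information" by the text that precedes each field. The class FormFieldCollector, the functions extract_forms and guess_role, and the ROLE_KEYWORDS table are all hypothetical names introduced for illustration; the user feedback mechanism is not modeled.

```python
from html.parser import HTMLParser

# Hypothetical keyword table mapping a query role to words that hint at it.
ROLE_KEYWORDS = {
    "title": ("title",),
    "author": ("author", "writer"),
    "keyword": ("keyword", "search", "query"),
}


class FormFieldCollector(HTMLParser):
    """Collects the fields of each <form>, with the preceding text as a label guess."""

    def __init__(self):
        super().__init__()
        self.forms = []          # one dict per form found
        self._current = None     # form being parsed, or None outside a form
        self._last_text = ""     # most recent non-empty text node

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {"action": attrs.get("action", ""), "fields": []}
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current is not None:
            self._current["fields"].append({
                "tag": tag,
                "name": attrs.get("name", ""),
                "type": attrs.get("type", "text") if tag == "input" else tag,
                "label": self._last_text,   # structural neighbor used as label
            })
            self._last_text = ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if text:
            self._last_text = text


def extract_forms(html):
    """Return a list of forms, each with its action and field descriptions."""
    collector = FormFieldCollector()
    collector.feed(html)
    return collector.forms


def guess_role(field):
    """Guess a field's role from its label and name using simple keyword matching."""
    text = (field["label"] + " " + field["name"]).lower()
    for role, words in ROLE_KEYWORDS.items():
        if any(word in text for word in words):
            return role
    return "unknown"
```

Calling extract_forms on a page's HTML and then guess_role on each field gives a first guess at where each part of a user's query should go. Fields that come back as "unknown" are the cases where a feedback step from the user would be needed.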