EECS 584 --- Artificial Intelligence 2
Instructor: Barbara Di Eugenio
Time: MW 6:30-7:45 pm
Location: LH 301
The full syllabus (in PostScript) can be found here. Note that
the syllabus is periodically updated.
Natural Language Processing
We will adopt the most current approach to Artificial Intelligence, which
sees it as the study of intelligent agents. We will explore the
application of AI techniques to information technology and the
Web. For example, intelligent agents are used to search the web for useful
information, and as a foundation for computer-supported collaborative work
and electronic commerce.
After briefly reviewing some foundations of AI such as search algorithms, we will
spend the rest of the semester on some of the most exciting research
areas these days, such as machine learning, natural language processing, and planning.
We will explore each topic by starting with the relevant chapters in
one of the required books ([Mitchell 97] and [Jurafsky and Martin
00]), which provide the foundations for building intelligent
agents. We will supplement the books with other readings, experiment
with the techniques we study, and apply them to problems of interest
to the students.
Tom Mitchell. Machine Learning, McGraw Hill, 1997.
Daniel Jurafsky and James Martin. Speech and Language
Processing. Prentice Hall, 2000.
Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern
Approach. Prentice Hall, 1995.
James Allen. Natural Language Understanding, 2nd Edition. The Benjamin/Cummings Publishing Company, 1995.
Natural Language Processing for the World Wide Web. AAAI Spring Symposium, 1997.
Survey of the State of the Art in Human Language Technology. 1995.
Prerequisites: EECS 484 or consent of instructor. Knowledge of
LISP, PROLOG, and/or Java.
Machine Learning pointers and papers
General resource pages for Machine Learning:
After covering the basics of ML from Mitchell, we will study
the following applications of ML to the WWW and the Internet:
- Learning users' interests regarding:
  - web sites: WebWatcher from Carnegie Mellon University (Joachims et al. 97), and various systems developed by the Machine Learning group at the University of California at Irvine (Pazzani and Billsus 97; Ackerman et al.)
  - books and documents: the Recommender system, University of Texas at Austin (Mooney and Roy 00)
- Extracting knowledge from the web (i.e., classifying web pages): WebKB from Carnegie Mellon University (Craven et al. 97)
- Text mining: the DiscoTEX system, University of Texas at Austin (Nahm and Mooney 00)
Natural Language Processing pointers and papers
The Association for
Computational Linguistics (ACL)
Papers on CL/NLP are archived within the Computing Research Repository,
under the subject Computation and Language.
A registry of some NLP software can be found at the DFKI. Some
resources can be found in the NLP section of the CMU AI repository.
Planning pointers and papers
A few pointers to implemented planners can be found here. (This page
appears to be slightly out of date.)
Homework 1 (due ...)
The program for ID3 (courtesy of Tom Mitchell, Carnegie Mellon
University) is written in Common Lisp. We are using the GNU
implementation of Common Lisp, GCL, available on the EECS servers at
/usr/local/bin/gcl. You can use another Common Lisp, e.g. on your PC,
if you so wish.
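As background for the homework, ID3 grows a decision tree by repeatedly choosing the attribute whose split yields the highest information gain (the expected reduction in entropy of the class labels). The following is a minimal sketch of that selection step in Python, not the Common Lisp program distributed for the homework. The function names, the attribute names "a" and "b", and the four toy examples are all made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(examples, attribute, target):
    """Expected entropy reduction from splitting examples on attribute."""
    base = entropy([ex[target] for ex in examples])
    remainder = 0.0
    for value in {ex[attribute] for ex in examples}:
        subset = [ex[target] for ex in examples if ex[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder


def best_attribute(examples, attributes, target):
    """The attribute an ID3-style learner would test at this node."""
    return max(attributes,
               key=lambda a: information_gain(examples, a, target))


# Hypothetical toy data: "a" separates the classes perfectly, "b" does not.
examples = [
    {"a": "x", "b": "p", "class": "yes"},
    {"a": "x", "b": "q", "class": "yes"},
    {"a": "y", "b": "p", "class": "no"},
    {"a": "y", "b": "q", "class": "no"},
]

chosen = best_attribute(examples, ["a", "b"], "class")
```

Here the gain for "a" is 1 bit and the gain for "b" is 0, so "a" would be placed at the root. A full learner would then recurse on each subset with the remaining attributes until the subsets are pure or no attributes are left.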
You need the following three files. There's also a sample load file,
load.lsp, that you'll have to modify accordingly (paths, etc.).
- decision-tree.lsp: contains the code
- data file: 699 data points on breast cancer detection, courtesy of University of Wisconsin Hospitals, via the UCI data repository (the data has been reformatted to be consistent with the ID3 implementation we use)
- description file: describes the meaning of attributes (plus specifies attribution for
Homework 2 (due ...)