My main area of research is Natural Language Processing (NLP), and its application to human-computer interaction, educational technology, and multimedia systems. My goal is to use NLP to support both education and instruction, and collaboration between human or artificial agents. The theoretical aspects of my research concern the linguistic analysis, and the knowledge representation and reasoning that support the understanding and generation of NL discourse and dialogue. All my research has its empirical foundations in both qualitative and quantitative corpus analysis, including data mining techniques.
Director: Dr. Barbara Di Eugenio
(What follows needs some updating ... (Oct 2007))
Research in Natural Language Processing (NLP) at UIC focuses on semantics, and on discourse and dialogue processing. Our goal is to use NLP to support education and instruction, as well as collaboration among human and artificial agents. For readers who are not familiar with NLP: it studies the computational models that underlie the processing of human languages, and develops key technology that lets users interact with a computer system in English, Italian, or Japanese rather than in a programming language.
Our group focuses on the computational modeling of extended text (discourse) and conversations between two or more agents (dialogue). The theoretical aspects of our research concern the linguistic analysis, and the knowledge representation and reasoning that support the understanding and generation of NL discourse and dialogue. The intended applications range from automatically producing instructional manuals (e.g., those that accompany a piece of equipment such as a stereo) to providing dialogue capabilities for Intelligent Tutoring Systems (ITSs), computer-based tutors that help students master a subject. The methodology we employ blends empirical and symbolic approaches, and consists of: data mining from text corpora; development of computational frameworks based on the information extracted from the corpus; and rigorous evaluation of the computational models via user studies.
A full list of publications is available here
Our major areas of interest right now are:
- Computational models of tutorial dialogue (supported by the Office of Naval Research). This work concerns building ITSs that can participate in a dialogue with their users. The DIAG-NLP project was one of the first to show that a language interface to an ITS does engender more learning in students (see the following two papers: AIED05 and ACL05). Currently, we are exploring what distinguishes expert from novice tutors, in order to model the more effective tutors in an interface to an ITS that tutors students on basic data structures and algorithms. Read more
- Lexical semantics and inductive logic programming to learn discourse relations and domain knowledge (supported by an NSF CAREER award). We employ a novel methodology: a corpus is parsed to obtain rich semantic representations (HLT-NAACL03) and annotated with discourse relations, and a first-order model of discourse relations is then learned from it via inductive logic programming. The final goal is to (semi)automatically acquire domain knowledge about action verbs and rhetorical knowledge about how those actions are related in instructional discourse. Read more
- Modeling collaboration in human-human and computer-human dialogues (supported by NSF, Advanced Learning Technologies program). We are extending our model of commitment in dialogue, developed under the Coconut project at the University of Pittsburgh (see IJHCS 00, [.ps.gz]). We are modeling peer interactions in learning, and developing a peer dialogue agent in the domain of basic data structures and algorithms. Read more
We are also active in other areas of research, including:
- Empirical methods in discourse: tagging, statistical corpus analysis, machine learning
For example, we study coefficients of intercoder reliability (CL squib on Kappa), and we work on inferring dialogue acts via extensions to Latent Semantic Analysis (ACL04 paper) and on discourse relations.
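As an illustration of what an intercoder reliability coefficient measures, here is a minimal Python sketch of one common variant, Cohen's kappa, which corrects the observed agreement between two coders for the agreement expected by chance. This is not necessarily the variant discussed in the squib; the function name `cohen_kappa` and the dialogue act labels in the example are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items.

    kappa = (P_o - P_e) / (1 - P_e), where P_o is the observed proportion
    of agreement and P_e is the agreement expected by chance, estimated
    from each coder's own label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, multiply the two coders' marginal
    # proportions, then sum over categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical label throughout; kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement is 1")

    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four utterances by two coders.
coder_1 = ["question", "question", "statement", "statement"]
coder_2 = ["question", "question", "statement", "question"]
print(cohen_kappa(coder_1, coder_2))  # observed 0.75, chance 0.5, kappa 0.5
```

In this example the coders agree on three of four items, but half of that agreement would be expected by chance given how often each coder uses each label, so kappa is 0.5 rather than 0.75.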
Current members of the group (Spring 06):
- Jisha Abubaker (MS student)
- Joel Booth (PhD student)
- Davide Fossati (graduate research assistant, PhD student)
- Cindy Kersey (graduate research assistant, PhD student)
- Xin Lu (graduate research assistant, PhD student)
- Rajen Subba (graduate research assistant, PhD student)
- Zhuli (Jack) Xie (graduate research assistant, PhD student)
A picture of some of us from May 05
Clockwise from Barbara: SuNam Kim (alumna), Davide Fossati, Andrew Mueller (alumnus), Zhuli Xie, Xin Lu, Rajen Subba.
- Michael Glass was a postdoctoral fellow from August 2000 to August 2002. He is now an Assistant Professor at Valparaiso University.
- Susan Haller, associate professor at SUNY Potsdam, collaborates with us on NLP for ITSs.
- Stellan Ohlsson, UIC (Psychology), collaborates with us on cognitive models of tutorial dialogues.
- Massimo Poesio, senior lecturer at the University of Essex, UK, works with us on theories of referential expressions (click on "Referential Expressions" above).
- Pam Jordan and Sandra Katz, research associates at the University of Pittsburgh, work with us on analysis and modelling of peer learning.
- SuNam Kim (moved to Univ. of Melbourne summer 05) worked on inductive logic programming for discourse relations
- Dan Yu (MS Summer 04) worked on DIAG-NLP
- Yijue Hou (MS Summer 03) worked on DIAG-NLP
- Riccardo Serafin (MS Summer 03) worked on extending LSA to recognize dialogue acts
- Vijaysenthil Veeriah (MS Summer 03)
- Tejaswini Pendyala (MS Summer 02) worked on applying LSA to one of our tutoring corpora
- Elena Terenzi (MS Summer 02) worked on coupling LCFLEX with VerbNet
- Mike Trolio (MS Dec. 00) worked on generating instructional dialogue
- Kai-Hua Xiang (MS Dec. 00) worked on Machine Learning for generating cue phrases