This is the 9th workshop on DMKD, held annually in conjunction with the ACM SIGMOD conference. The workshop aims to bring together data-mining researchers and practitioners to discuss the next generation of data-mining algorithms and tools. Rather than following a "mini-conference" format focused on the presentation of polished research results, the DMKD workshop will foster an informal atmosphere in which researchers and practitioners can interact freely through short presentations and open discussions of their ongoing work, as well as forward-looking research visions and experiences for future data-mining applications.
In addition to research on novel data-mining algorithms and experiences with innovative mining algorithms and applications, of particular interest in this year's DMKD workshop is the theme of "Data Mining and Information Integration".
Today, organizations struggle to access and integrate disparate data. Very few organizations find that all of the information they need is readily available. Information is typically scattered throughout the organization in many locations, data stores, and formats. Data integration has become an urgent business problem. Extracting and integrating information from the Web is another area of great importance. With a huge amount of information publicly available, the Web offers an unprecedented opportunity for organizations to identify and extract useful information from diverse Web sources in order to provide value-added services, to integrate Web information with their own data, and to discover business intelligence about their competitors. Information integration is also critical for science, engineering, and healthcare. To automate the integration process as far as possible, data mining and machine learning provide key technologies for discovering patterns or regularities that can be used to match database schemas, to clean the data, and to identify, extract, and combine data/information from diverse sources.
Topics of interest include (but are not limited to):
New data mining algorithms for data/information integration: Data/information integration not only offers an excellent application area for data mining but also provides new problems for data mining research. Reports of such problems and their algorithms are of great interest to both researchers and practitioners.
Data cleaning: Cleaning the data is a critical step in almost every data integration application. It is perhaps also the most time-consuming step, requiring significant manual effort. Innovative cleaning techniques and tools are thus needed to automate this process as much as possible.
Web information integration and mining: Although one can find almost anything on the Web, identifying and extracting the precise information one is interested in, and integrating such information from multiple Web sites/pages, are still major challenges. Data mining and machine learning have traditionally been among the key technologies for these tasks. Reports of novel ideas, techniques, and problems are strongly encouraged.
Integrated mining from data, text, image and other data forms: Mining from traditional data has been studied extensively. Integrated mining of different types of data, however, has received little attention. In many application domains, relevant information/data may take multiple forms. Mining in such an environment is increasingly important.
Privacy and security: In mining and integrating data from multiple sources, there are many privacy and security issues. Innovative ideas and techniques are called for to make such mining and integration practical or even possible.
Future trends/past reflections: What are the emerging topics or applications of data mining and information integration? What lessons have been learned from past research and applications? What areas need more research?
Submitted papers should not exceed 10 pages (single-spaced, single-column, 12-point font), including all figures, tables, and references. The workshop accepts only electronic submissions in PDF format. Please email your paper
Accepted papers will be included in the ACM Digital Library. The final version of each paper must follow the ACM format guidelines, and the authors must also sign the ACM copyright form. The format guidelines and the copyright form will be sent to the authors of accepted papers.
Submission deadline: March 29, 2004
Notification: April 26, 2004
Camera-ready due: May 14, 2004
Workshop: June 13, 2004
Charu Aggarwal, IBM T. J. Watson Research Center
Venkatesh Ganti, Microsoft Research
Johannes Gehrke, Cornell University
Robert Grossman, University of Illinois at Chicago
Dimitris Gunopulos, UC Riverside
Jiawei Han, University of Illinois at Urbana-Champaign
Vipin Kumar, University of Minnesota
Huan Liu, Arizona State University
Jian Pei, SUNY Buffalo
Raghu Ramakrishnan, University of Wisconsin-Madison
Kyuseok Shim, Seoul National University
Jaideep Srivastava, University of Minnesota
Ankur M. Teredesai, Rochester Institute of Technology
Alex Tuzhilin, New York University
Jamshid Vayghan, IBM T. J. Watson Research Center
Ke Wang, Simon Fraser University
Mohammed J. Zaki, Rensselaer Polytechnic Institute
Zhongfei Zhang, Binghamton University