Next Generation Data Mining and Social Computing (NGDS) LAB

The lab focuses on next generation data management and mining issues with special attention to social computing.

  • As data volumes continue to grow at an exponential rate, we need to develop scalable algorithms to manage and mine petascale data.
  • Social, natural, and information systems usually consist of a large number of interacting components. Examples include communication and computer systems, the Internet, biological networks, transportation systems, epidemic networks, criminal rings, and hidden terrorist networks. All of these share an important feature: they are networked systems, in which individual agents or components interact with a specific set of other components, forming large, interconnected, and heterogeneous networks. We refer to such interconnected systems as information networks. Information networks are ubiquitous and form a critical component of modern information infrastructure, and the answers to many important questions are hidden in them.
  • Traditional data mining focuses on record-oriented data. However, many new applications involve graph-oriented data, where the linkage relationships among entities need to be captured, analyzed, and mined.
  • Data stream technology offers an alternative paradigm for continuous, real-time filtering, analysis, and mining of high-volume data that is continuously being generated and collected.
  • Privacy-preserving data publishing is critically needed for data sharing.
  • Cloud computing offers a new platform to manage and mine data.

Some of the main research projects are:

  • Graph and link mining
  • OLAP on information networks
  • Data stream mining
  • Privacy-preserving data publishing
  • Social computing
  • Transfer learning
  • Mining under cloud computing
  • Domain-specific mining

Recent Books:

1. H. Kargupta, J. Han, P.S. Yu, and R. Motwani, "Next Generation of Data Mining", Chapman & Hall, 2009.

2. J. Tsai and P.S. Yu, "Machine Learning in Cyber Trust: Security, Privacy, and Reliability", Springer, 2009.

3. C. Aggarwal and P.S. Yu, "Privacy-Preserving Data Mining: Models and Algorithms" (Advances in Database Systems), Springer, 2008.

4. L. Cao, P.S. Yu, C. Zhang, and H. Zhang, "Data Mining for Business Applications", Springer, 2008.

5. O. Nasraoui, O. Zaiane, M. Spiliopoulou, B. Mobasher, B. Masand, and P.S. Yu, "Advances in Web Mining and Web Usage Analysis", Lecture Notes in Computer Science LNCS 4198, Springer, 2006.

Funding:
The projects of the NGDS Lab are funded by NSF grants.

 
Copyright 2009 The Board of Trustees of the University of Illinois