KDSIN 2016: Knowledge Discovery in Social and Information Networks

Instructor: Cornelia Caragea     Summer 2016

Schedule and Class Notes

Date     Lecture     Description and Reading Material     NB
06/09/2016 Introduction to Social and Information Networks [slides]
Overview of Machine Learning [slides]
  • Overview of Social Networks
  • Why Knowledge Discovery in Social Networks
  • Course Logistics [slides]
  • Recommended Reading:
    • Easley & Kleinberg: Chapter 1
    • "A Few Useful Things to Know about Machine Learning" by Pedro Domingos. [pdf]
-
06/10/2016 Network Structure and Properties [slides]
  • The Structure of a Network
  • Paths and Connectivity in Graphs
  • The Small World Phenomenon
  • Degree Distribution
  • Clustering Coefficient
  • Reading:
    • Easley & Kleinberg: Chapter 2
-
06/13/2016 Strong and Weak Ties [slides]
  • Triadic Closure
  • The Strength of Weak Ties
  • Closure and Structural Holes
  • Community Detection
  • Recommended Reading:
    • Easley & Kleinberg: Chapter 3
-
06/14/2016 Positive and Negative Relationships [slides]
Web Search and Information Retrieval [slides]
  • Structural Balance Property
  • The Structure of Balanced Networks
  • Weakly Balanced Networks
  • Suggested Project Topics
  • Reading:
    • Easley & Kleinberg: Chapter 5
-
06/15/2016 Link Analysis and Web Search [slides]
Network Visualization With Gephi [slides]
  • Link Analysis using Hubs and Authorities
  • The HITS Algorithm
  • The PageRank Algorithm
  • The Dining Dataset
  • Reading:
    • Easley & Kleinberg: Chapter 13
-
06/16/2016 Keyphrase Extraction in Citation Networks: How do Citation Contexts Help? [slides]
  • Keyphrase Extraction in Document Networks
  • Reading:
    • Cornelia Caragea, Florin Bulgarov, Andreea Godea, and Sujatha Das Gollapalli. "Citation-Enhanced Keyphrase Extraction from Research Papers: A Supervised Approach." (Using citation contexts in a supervised approach to improve keyphrase extraction.) In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar, 2014. [abstract] [pdf] [link to project website]

    • Sujatha Das Gollapalli and Cornelia Caragea. "Extracting Keyphrases from Research Papers using Citation Networks." (Using citation contexts in an unsupervised approach to improve keyphrase extraction.) In: Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI 2014), Quebec City, Quebec, Canada, 2014. [abstract] [pdf]

    • Xiaojun Wan and Jianguo Xiao. "Single Document Keyphrase Extraction Using Neighborhood Knowledge." In: Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI 2008), 2008. [pdf]

    • Rada Mihalcea and Paul Tarau. "TextRank: Bringing Order into Texts." In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, July 2004. [pdf]

-
06/17/2016 Naive Bayes [slides]
  • Naive Bayes - Multivariate Bernoulli Model and Multinomial Model
  • Reading:
    • Manning: Chapter 13
  • Recommended Reading:
    • A. McCallum and K. Nigam (1998). "A Comparison of Event Models for Naive Bayes Text Classification." In: AAAI/ICML’98 Workshop on Learning for Text Categorization, AAAI Press. [pdf]
    • J. Provost (1999). "Naive-Bayes vs. Rule-Learning in Classification of Email." University of Texas at Austin, Artificial Intelligence Lab. Technical Report AI-TR-99-284. [pdf]
-
06/20/2016 Practical Issues in Machine Learning [slides]
  • Model Evaluation
  • Performance Measures
  • Project Proposal Presentations
-
06/21/2016 Sentiment Analysis in Disaster Events [slides]
Linear Regression [slides]
  • Sentiment Classification of Tweets from Hurricane Sandy
  • Recommended Reading:
    • Cornelia Caragea, Anna Squicciarini, Sam Stehle, Kishore Neppalli, Andrea H. Tapia. "Mapping Moods: Geo-Mapped Sentiment Analysis During Hurricane Sandy." In: Proceedings of the 11th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2014), University Park, Pennsylvania, USA, 2014. [pdf] [link to project website]
  • Linear Regression with One Variable
-
06/22/2016 Linear Regression [slides]
Logistic Regression [slides]
  • Linear Regression with Multiple Variables
  • Logistic Regression
  • Weka Lab [slides]
-
06/23/2016 Semi-supervised Learning [slides]
Support Vector Machine [slides]
  • Incorporating Unlabeled Data with EM
  • Self-training
  • Co-training
  • Co-Training for Topic Classification of Scholarly Data
  • Support Vector Machine
  • Recommended Reading:
    • Cornelia Caragea, Florin Bulgarov, and Rada Mihalcea. "Co-Training for Topic Classification of Scholarly Data." In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, 2015. [pdf] [slides]

-
06/24/2016 Neural Networks [slides]
  • Neural Networks
  • Concluding Remarks
-