KDSIN 2016

KDSIN 2016: Knowledge Discovery in Social and Information Networks

Instructor: Cornelia Caragea
Summer 2016

Schedule and Class Notes

Date	Lecture	Description and Reading Material	NB
06/09/2016	Introduction to Social and Information Networks [slides]; Overview of Machine Learning [slides]	Overview of Social Networks; Why Knowledge Discovery in Social Networks; Course Logistics [slides]. Recommended Reading: Easley & Kleinberg: Chapter 1; "A Few Useful Things to Know about Machine Learning" by Pedro Domingos [pdf]	-
06/10/2016	Network Structure and Properties [slides]	The Structure of a Network; Paths and Connectivity in Graphs; The Small World Phenomenon; Degree Distribution; Clustering Coefficient. Reading: Easley & Kleinberg: Chapter 2	-
06/13/2016	Strong and Weak Ties [slides]	Triadic Closure; The Strength of Weak Ties; Closure and Structural Holes; Community Detection. Recommended Reading: Easley & Kleinberg: Chapter 3	-
06/14/2016	Positive and Negative Relationships [slides]; Web Search and Information Retrieval [slides]	Structural Balance Property; The Structure of Balanced Networks; Weakly Balanced Networks; Suggested Project Topics. Reading: Easley & Kleinberg: Chapter 5	-
06/15/2016	Link Analysis and Web Search [slides]; Network Visualization with Gephi [slides]	Link Analysis using Hubs and Authorities; The HITS Algorithm; The PageRank Algorithm; The Dining Dataset. Reading: Easley & Kleinberg: Chapter 13	-
06/16/2016	Keyphrase Extraction in Citation Networks: How do Citation Contexts Help? [slides]	Keyphrase Extraction in Document Networks. Reading: Cornelia Caragea, Florin Bulgarov, Andreea Godea, and Sujatha Das Gollapalli. "Citation-Enhanced Keyphrase Extraction from Research Papers: A Supervised Approach." (Using citation contexts in a supervised approach to improve keyphrase extraction.) In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar, 2014. [abstract] [pdf] [link to project website] Sujatha Das Gollapalli and Cornelia Caragea. "Extracting Keyphrases from Research Papers using Citation Networks." (Using citation contexts in an unsupervised approach to improve keyphrase extraction.) In: Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI 2014), Quebec City, Quebec, Canada, 2014. [abstract] [pdf] Xiaojun Wan and Jianguo Xiao. "Single Document Keyphrase Extraction Using Neighborhood Knowledge." In: Proceedings of the 23rd AAAI Conference on Artificial Intelligence (AAAI 2008), 2008. [pdf] Rada Mihalcea and Paul Tarau. "TextRank: Bringing Order into Texts." In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, July 2004. [pdf]	-
06/17/2016	Naive Bayes [slides]	Naive Bayes: Multivariate Bernoulli Model and Multinomial Model. Reading: Manning: Chapter 13. Recommended Reading: A. McCallum and K. Nigam (1998). "A Comparison of Event Models for Naive Bayes Text Classification." In: AAAI/ICML’98 Workshop on Learning for Text Categorization, AAAI Press. [pdf] J. Provost (1999). "Naive-Bayes vs. Rule-Learning in Classification of Email." University of Texas at Austin, Artificial Intelligence Lab. Technical Report AI-TR-99-284. [pdf]	-
06/20/2016	Practical Issues in Machine Learning [slides]	Model Evaluation; Performance Measures; Project Proposal Presentations	-
06/21/2016	Sentiment Analysis in Disaster Events [slides]; Linear Regression [slides]	Sentiment Classification of Tweets from Hurricane Sandy; Linear Regression with One Variable. Recommended Reading: Cornelia Caragea, Anna Squicciarini, Sam Stehle, Kishore Neppalli, Andrea H. Tapia. "Mapping Moods: Geo-Mapped Sentiment Analysis During Hurricane Sandy." In: Proceedings of the 11th International Conference on Information Systems for Crisis Response and Management (ISCRAM 2014), University Park, Pennsylvania, USA, 2014. [pdf] [link to project website]	-
06/22/2016	Linear Regression [slides]; Logistic Regression [slides]	Linear Regression with Multiple Variables; Logistic Regression; Weka Lab [slides]	-
06/23/2016	Semi-supervised Learning [slides]; Support Vector Machine [slides]	Incorporating Unlabeled Data with EM; Self-training; Co-training; Co-Training for Topic Classification of Scholarly Data; Support Vector Machine. Recommended Reading: Cornelia Caragea, Florin Bulgarov, and Rada Mihalcea. "Co-Training for Topic Classification of Scholarly Data." In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, 2015. [pdf] [slides]	-
06/24/2016	Neural Networks [slides]	Neural Networks; Concluding Remarks	-