Lecture time: TTh 12:30-1:45pm
Location: First two weeks online, then SES 201
Instructor: Prof. Elena Zheleva
Office hours: TBD
Contact:
Graduate TA: Shishir Adhikari
Office hours: TBD
Graduate TA 2: Zahra Fatemi
Office hours: TBD
"Our ability to collect, manipulate, analyze, and act on vast amounts of data is having a profound impact on all aspects of society. This transformation has led to the emergence of data science as a new discipline. The explosive growth of interest in this area has been driven by research in social, natural, and physical sciences with access to data at an unprecedented scale and variety, by industry assembling huge amounts of operational and behavioral information to create new services and sources of revenue, and by government, social services and non-profits leveraging data for social good. This emerging discipline relies on a novel mix of mathematical and statistical modeling, computational thinking and methods, data representation and management, and domain expertise."
--Committee on Data Science, Computing Research Association
This course provides an in-depth overview of data science from a computer science perspective. Topics include modeling, storage, manipulation, integration, classification, analysis, visualization, information extraction, and big data. The course is programming-intensive and emphasizes tying data science concepts to specific real-world applications through hands-on experience.
Working knowledge of probability and of data structures and algorithms, and the ability to program (or learn to program) in Python.
We will use Piazza for the course schedule, discussions, and materials, and Gradescope for grading.
Python is the programming language used for homework assignments.
No textbook is required. Readings will be assigned from multiple online sources, including:
[PTDS] Principles and techniques of data science. Lau, Gonzalez, Nolan.
[MMD] Mining of massive datasets. Leskovec, Rajaraman, Ullman.
[FDV] Fundamentals of data visualization. Wilke.
[CIT] Computational and inferential thinking. Adhikari, DeNero.
[CIML] A course in machine learning [Errata]. Hal Daume III.