CS 418 Introduction to Data Science
University of Illinois at Chicago, Spring 2020

Lecture time: MW 4:30-5:45pm

Location: TBH 180F

Instructor: Prof. Elena Zheleva

Office hours: Tue 3-5pm, SEO 1140


Graduate TA: Shishir Adhikari

Office hours: TBD

Contact: sadhik9@uic.edu

"Our ability to collect, manipulate, analyze, and act on vast amounts of data is having a profound impact on all aspects of society. This transformation has led to the emergence of data science as a new discipline. The explosive growth of interest in this area has been driven by research in social, natural, and physical sciences with access to data at an unprecedented scale and variety, by industry assembling huge amounts of operational and behavioral information to create new services and sources of revenue, and by government, social services and non-profits leveraging data for social good. This emerging discipline relies on a novel mix of mathematical and statistical modeling, computational thinking and methods, data representation and management, and domain expertise."
--Committee on Data Science, Computing Research Association

Course description

This course provides an in-depth overview of data science from a computer science perspective. Topics include modeling, storage, manipulation, integration, classification, analysis, visualization, information extraction, and big data. The course is programming-intensive and emphasizes tying data science concepts to specific real-world applications through hands-on experience.


Prerequisites

Working knowledge of probability, data structures, and algorithms, and the ability to program (or learn to program) in Python.

Course materials

We will use Piazza for the course schedule, discussions, and materials, and Gradescope for grading.
Python is the programming language used for homework assignments.

Student deliverables

Programming-based homework assignments - 30%
Midterm exam - 20%
Bi-weekly quizzes - 15%
Class project - 35%


No textbook is required. Readings will be assigned from multiple online sources, including:
[PTDS] Principles and techniques of data science. Lau, Gonzalez, Nolan.
[MMD] Mining of massive datasets. Leskovec, Rajaraman, Ullman.
[FDV] Fundamentals of data visualization. Wilke.
[CIT] Computational and inferential thinking. Adhikari, DeNero.
[CIML] A course in machine learning. Daume III.