CS 594: Beyond the Dark Side of Data

Course Description

The advent of data-mining techniques and the release of large datasets bring to the fore two somewhat opposing questions: What are the potential implications of this data for security purposes, and how can we reconcile the release and use of this data with norms of privacy and ethics? This class investigates cases of large-scale Internet research alongside their technical and ethical challenges, and how those challenges can be overcome. Students taking this course will learn state-of-the-art methods used for Internet-scale security research, including data collection and analysis; study methods and approaches used to break widely deployed Internet systems; and present new studies of their own design. While examining each of these scientific approaches, we will also consider the ethical and legal frameworks that researchers use to ensure that their methods and results are a worthwhile contribution to the wider security research community.

Method of Instruction

Socratic style with 3-5 conference/journal paper readings per class for discussion. Instructors will prepare questions for each paper and iterate through the class list, ensuring that each student performs a close critical reading of the assigned papers. The class will be led by Chris Kanich and Lenore Zuck, with guest lectures by researchers in the field whenever possible. If necessary, the initial class periods (1-3 weeks) will focus on core technical topics chosen according to the skill level of the students in the class, as determined by the pre-class survey.

Goal

Bring students up to speed on the state of the art in research on Internet-scale problems in security and privacy. Students will also conduct class-project-level research related to one of the three themes of the course.

Themes

Empirically understanding security on the Internet. This section will cover large measurement studies of the Internet with a focus on security-related research, including studies of botnet infiltration, social network privacy, global vulnerability scans, and backscatter analysis.

Breaking Internet-wide assumptions of security and privacy. This section will cover research that breaks widely deployed protocols or systems: how they were broken, and the impact of these revelations on future research. It will include studies of encrypted VoIP analysis, the Netflix de-anonymization, and breaks of widely deployed protocols like WEP, TLS, and others.

Building a secure, private Internet. This section will cover constructive approaches to fixing the shortcomings discovered in the previous section, and other systemic security issues on the Internet. These approaches include methods like secure computation and differential privacy, as well as deployed protocols like secure BGP and DNSSEC.

Student Deliverables

Beyond daily readings, students will be assigned a short measurement project that will function as a midterm evaluation, and a research-level group project that will serve as the final evaluation.

Prerequisites

An undergraduate-level, computer science-based security course (taken either here or at a previous institution). Undergraduates are admitted only with prior coursework and consent of the instructor. Students from outside the computer science program must complete the preliminary skills inventory below.