Fall 2017: CS 553 (Call no 39828) Distributed Computing Systems
Instructor: Ajay Kshemkalyani
Email: ajay@uic.edu
Class meeting times: T and R 2:00-3:15pm
Room: LC A5
Office Hours in 915 SEO: T 5:00-6:00pm, and by appointment
- Models: synchronous/asynchronous; shared memory/message-passing
- Global states and snapshots; time models and clock synchronization
- Distributed graph algorithms
- Group communication (including multicast), managing group views
- State predicate detection
- Reasoning with knowledge
- Distributed mutual exclusion, distributed deadlock detection, leader election, termination detection
- Distributed shared memory - coherence, models, register constructions, atomic snapshots (applications to multicore architectures)
- Checkpointing, rollback recovery; distributed debugging
- Agreement and consensus (with malicious and non-malicious process behavior)
- Failure detectors
- Self-stabilizing systems
- Peer-to-peer systems, e.g., Chord, Tapestry, Content-Addressable Network, BitTorrent
- Other current topics, e.g., sensor networks
For the detailed table of contents of the textbook listed below, follow the link at Cambridge University Press or Amazon.
This course focuses on distributed algorithms and theoretical principles.
By the end of the course,
you will be able to appreciate that these algorithms have wide applications
in practical distributed systems.
- Textbook:
Distributed Computing: Principles, Algorithms, and Systems, by
Kshemkalyani and Singhal, Cambridge University Press, March 2011 edition
South Asian reprint edition, Dec 2010
ISBN-10: 1107648904, ISBN-13: 978-1107648906
- Course notes are here
- An Overview Chart
- Winners of the Dijkstra Award for Most Influential Paper in Distributed Computing, 2000-present.
- Selected papers and other material from the literature will be posted on the
web or distributed in class.
- Suggested Topics and Papers for class presentation (tentative list)
- Chapter 13 (Checkpointing)
- Self-Stabilization:
E.W. Dijkstra,
Self-stabilizing systems in spite of distributed control,
Communications of the ACM, vol. 17, no. 11, pp. 643-644, Nov. 1974.
plus Chapter 17 (Self-stabilization)
- Failure Detectors:
M. Raynal, A Short Introduction to Failure Detectors for Asynchronous Distributed Systems,
ACM SIGACT News Distributed Computing Column 17,
vol. 36, no. 1 (Whole Number 134), March 2005.
plus Chapter 15 (Failure Detectors)
- Paxos:
L. Lamport,
Paxos Made Simple,
ACM SIGACT News (Distributed Computing Column) 32, 4 (Whole Number 121, December 2001),
pp. 51-58.
W. Bolosky et al., Paxos replicated state machines as the basis of a high-performance data store, NSDI 2011.
- Hadoop and MapReduce:
J. Dean and S. Ghemawat,
MapReduce: Simplified Data Processing on Large Clusters,
OSDI'04: Sixth Symposium on Operating System Design and Implementation,
San Francisco, CA, December, 2004.
plus Hadoop
- Cloud Computing
- Snapshot Isolation for distributed databases
- Transactional Memory - software and hardware
- Concurrent Data Structures
- Conflict-free Replicated Data Types (CRDTs)
The course format will be in two parts.
- For the first part, the instructor will teach.
Attendance when the instructor is
teaching is not compulsory, but you must attend all the tests/exams.
However, if you miss class, it is your responsibility
to find out from other students what was announced and what was covered.
- The second part involves active student participation and is planned as follows.
Each presentation will be made by a team of 3 or 4 students,
depending on the final enrollment, which will be known only in the third
week of class.
Each topic will be assigned at least 3 weeks before its presentation date.
The class presentations will be on an assortment of topics of current interest.
Each group chooses a paper/topic from a list of topics provided around the
5th week of class.
This is only a starting point. Once you select a topic
from the list (to be provided), you may have to identify more basic or
fundamental papers on that specific topic for presentation.
Pick the most basic/fundamental papers that are
rich in new ideas.
They must also have algorithmic content.
Attendance when the student presentations are going on is compulsory.
There is also a term paper requirement.
- Each student must also write a term paper, in his/her own words.
The topic can be anything related to distributed computing.
You can format it according to IEEE or ACM style. Templates are available
from the respective web sites.
This term paper should contain new ideas, to the maximum extent possible.
One or more clearly marked sections should present your original ideas.
You need to do a literature search for the Related Works section, but write
in your own words. Do NOT copy-paste from the internet or other sources.
Prerequisites: Algorithm analysis and design (CS 401); operating systems (CS 385); networks (CS 450); or permission of the instructor.
The following is only a tentative breakdown of the evaluation scheme;
it will be finalized after the second week of class, depending on the final enrollment
in the course.
- midterm 1 (30%) + final (40%)
- Class presentation (15%)
- Term paper (15%)
The final grade is on a curve, i.e., grading is relative: it depends on how you
perform with respect to the others in the class.
Tentative course progress chart (will be updated as we progress)
- Week 1: Introduction (Chapter 1)
- Week 2: Chapter 1, Elementary distributed algorithms (Chapter 5)
- Week 3: Chapter 5
- Week 4: Chapter 5, Chapter 2 (Models), Chapter 3 (Time)
- Week 5: Chapter 4 (Global state), Chapter 6 (Message Ordering and Group Comm.)
- Week 6: Chapter 6
- Week 7: Chapter 14 (Consensus and Agreement)
- Week 8: Chapter 14 (Consensus and Agreement), Chapter 8 (Reasoning with Knowledge)
- Week 9: Chapter 8 (Reasoning with Knowledge), Chapter 18 (Peer-to-peer computing)
- Week 10:
- Oct 31: Midterm. Syllabus: Chapters 1, 5, 2, 3.1-3.4, 3.5.1, 3.7, 3.9, 4-4.3, 6.1-6.2, 6.4-6.5.1, 6.6-6.7, 6.9-6.11.2, 8.1-8.4, 14.1-14.5.4, 14.6-14.6.4, 14.6.6
- Nov 2: Checkpointing: Pavan, Rohit, Dan
- Week 11:
- Nov 7: Self-stabilization: Akhauri, Urviben, Kavyath
- Nov 9: Failure Detectors: Aakash, Christopher, Tinesh
- Week 12:
- Nov 14: Paxos: Pranali, Rashmi, Sai
- Nov 16: Hadoop and MapReduce: Laura, Massimo, Lorenzo
- Week 13:
- Nov 21: Transactional Memory - software and hardware: Andrea, Maram
- Nov 23: Thanksgiving
- Week 14:
- Nov 28: Cloud Computing: Dhananjay, Nishali, Yogesh
- Nov 30: Snapshot Isolation Methods: Srujan, Bhargav, Pranay
- Week 15:
- Dec 5: Concurrent data structures: Anirudh, Amogh, Vishwas
- Dec 7: Conflict-free Replicated Data Types (CRDTs): John, Mohammad, Sepideh
- Dec 8 (Fri) 12 noon: Term paper due in 915 SEO (slide hard copy under the door)
- Closed book Final exam: 3:30pm-5:30pm, Mon, Dec 11 in BH 208.
Syllabus: Midterm syllabus + Chapter 18.1-18.6
Anonymized Final Grades