Fall 2021: CS 553 (Call no 39828) Distributed Computing Systems
Instructor:
Ajay Kshemkalyani
Email:
ajay@uic.edu
Class meeting times: TR 2:00-3:15pm (Tuesday and Thursday)
Room: TBH 180C
Office Hours in Blackboard Collaborate Ultra: T 5:50-6:20pm
Discussion board: Piazza
Online submissions: Gradescope
- Models: synchronous/asynchronous; shared memory/message-passing
- Global states and snapshots; time models and clock synchronization
- Distributed graph algorithms, e.g., spanning tree, shortest paths, MST, maximal independent set, leader election, compact routing tables, optimal object replication
- Group communication - total order and causal order
- Reasoning with knowledge
- Distributed shared memory - coherence, consistency models, register constructions, atomic snapshots (applications to multicore architectures)
- Checkpointing, rollback recovery; distributed debugging
- Agreement and consensus (with malicious and non-malicious process behavior) in message-passing and shared memory systems
- Self-stabilizing systems
- Peer-to-peer systems, e.g., Chord, Tapestry, Content-Addressible Network, BitTorrent
- topics/papers on systems, e.g., Spanner, Zookeeper, Dynamo, consistency in the cloud
See detailed table of contents of the textbook below by going to the link at Cambridge University Press or Amazon.
This course focuses on distributed algorithms and theoretical principles.
By the end of the course,
you will be able to appreciate that these algorithms have wide applications
in practical distributed systems.
- Textbook:
Distributed Computing: Principles, Algorithms, and Systems, by
Kshemkalyani and Singhal, Cambridge University Press, March 2011 edition
South Asian reprint edition, Dec 2010
- Course slides are here
- An Overview Chart
- Winners of the Dijkstra Award for Most Influential Paper in Distributed Computing, 2000-.
-
Selected papers and other material from the literature will be posted on the
web or distributed in class.
- Suggested Topics and Papers for class presentation (tentative list; to be updated)
- Byzantine Agreement Protocols:
Gabriel Bracha, Sam Toueg,
Asynchronous Consensus and Broadcast Protocols,
Journal of the ACM, 32(4): 824-840, 1985.
Gabriel Bracha,
Asynchronous Byzantine Agreement Protocols, Information and Computation, 75(2): 130-143, 1987.
D. Imbs, M. Raynal,
Trading t-resilience for efficiency in asynchronous Byzantine Reliable Broadcast, Parallel Processing Letters, 2016.
A. Auvolat et al.,
Byzantine-tolerant causal broadcast, Theoretical Computer Science, 885: 55-68, Sept 2021.
- Paxos and Raft:
L. Lamport,
Report,
Paxos Made Simple,
ACM SIGACT News (Distributed Computing Column) 32, 4 (121, December 2001)
51-58.
W. Bolosky et al, Paxos replicated state machines as the basis of a high-performance data store, NSDI 2011
D. Ongaro and J. Ousterhout, In Search of an Understandable Consensus Algorithm (Extended Version).
- Hadoop and Mapreduce:
J. Dean and S. Ghemawat,
MapReduce: Simplified Data Processing on Large Clusters,
OSDI'04: Sixth Symposium on Operating System Design and Implementation,
San Francisco, CA, December, 2004.
plus Hadoop
- Consistency in the Cloud:
W. Lloyd, M. Freedman, M. Kaminsky, Don't settle for eventual: scalable causal consistency for wide-area storage with COPS, SOSP 2011, and
P. Bernstein and S. Das, Rethinking Eventual Consistency, in Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data , ACM SIGMOD, 22 June 2013
C. Li et al, Making geo-replicated systems fast as possible, consistency when necessary, OSDI 2012,
W. Lloyd et al, Stronger semantics for low-latency geo-replicated storage, NSDI 2013
- Storage in the Cloud I:
G. DeCandia et al. Dynamo: amazon's highly available key-value store. In Proceedings of twentyfirst ACM SIGOPS symposium on Operating systems principles, SOSP'07, pages 205--220.
B. Calder et al, Windows Azure Storage: a highly available cloud storage service with strong consistency, SOSP 2011
E. B. Nightingale, J. Elson, J. Fan. O. Hofmann, J.Howell, Y. Suzue, Flat Datacenter Storage, OSDI 2012
- Storage in the Cloud II:
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, Bigtable: A Distributed Storage System for Structured Data, OSDI 2006 (Google)
and
J. C. Corbett, J. Dean, et al, Spanner: Google's Globally-Distributed Database, OSDI 2012
- Hunt, Konar, Junqueira, Reed, Zookeeper: Wait-free coordination for Internet-scale systems, USENIX Annual Technical Conference, 2010.
Additionally and optionally, you can present: A Abadi et al. TensorFlow: A System for Large-Scale machine Learning, OSDI 2016.
- Concurrent Data Structures
- Blockchain and bitcoin
- Consensus and cryptocurrencies:
R. Guerraoui et al., The Consensus Number of a Cryptocurrency, PODC 2019.
Georghiedis, Streit, Garg, Who Needs Consensus? A Distributed Monetary System between Rational Agents via Heresay
The course format will be in two parts.
-
For the first part, the instructor will teach.
Attendance when the instructor is
teaching is not compulsory, but you must attend all the tests/exams.
However, if you miss class, it is your responsibility
to find out what was announced and what was covered, from other students.
-
The second part involves active student participation and is planned as follows.
Each presentation will be made by a team of 1 or 2 students,
depending on the final enrollment which will be known only in the third
week of class.
The class presentations will be on an assortment of topics of current interest.
Each group chooses a paper/topic from a list of topics provided around the
5th week of class.
This is only a starting point. Once you select a topic
from the list (to be provided), you may have to identify more basic or
fundamental papers on that specific topic for presentation.
Pick the most basic/ fundamantal papers that are
rich in new ideas.
They must also have algorithmic content. You are also to submit a 1-page summary-cum-critique of each topic (presented by others) in your own words.
Attendance when the student presentations are going on is compulsory.
There is also a term paper requirement.
-
Each student must also write a term paper, in his/her own words.
The topic can be anything related to distributed computing.
You can format it according to IEEE or ACM style. Templates are available
from the respective web sites.
This term paper should contain new ideas, to the maximum extent possible.
A clearly marked section(s) should present your original ideas.
You need to do a literature search for the Related Works section, but write
in your own words. Do NOT copy-paste from the internet or other sources.
Algorithm analysis and design (cs401) is suggested (but not required); or permission of the instructor.
The following is only a tentative breakup of the evaluation scheme
and will be finalized after the second week of class, depending on the final enrollment
in the course.
- midterm (20%)
- Class presentation (10 presentation + 10 summary/review of all presentations = 20%)
- Term paper (30%)
- Final (30%)
The course grade is on the curve, i.e., this is relative grading - how you
perform with respect to the others in the class.
Tentative course progress chart (will be updated as we progress)
- Week 1: Introduction (Chapter 1)
- Week 2: Distributed graph algorithms (Chapter 5)
- Week 3: Chapter 5
- Week 4: Chapter 5, Time (Chapter 3)
- Week 5: Chapter 3, Global state (Chapter 4)
- Week 6: Message Ordering and Group Communication (Chapter 6), Reasoning with Knowledge (Chapter 8)
- Week 7: Chapter 8
- Week 8: Consensus and Agreement (Chapter 14)
- Week 9: Chapter 14, Distributed Shared Memory (Chapter 12)
Midterm on Friday Oct 22, 7:00-9:30pm in Gradescope. Syllabus: Ch. 1-6,8,14
- Week 10:
- Chapter 12, Peer-to-Peer Systems (Chapter 18)
- Oct 28: Byzantine Agreement Protocols, Cole Koester
- Week 11:
- Nov 2: Paxos and Raft, Ke
- Nov 4: Hadoope and Mapreduce, Sampath GK, Chinmay T
- Week 12:
- Nov 9: Cloud Consistency, Davide Giacomini
- Nov 11: Cloud Storage I, Riccardo Nannini
- Week 13:
- Nov 16: Cloud Storage II, Chen Chen, Chao Wu
- Nov 18: Zookeeper, Suhail Muhammed
- Week 14:
- Nov 23: Concurrent Data Structures, Vineeth Thapeta
- Week 15:
- Nov 30: Blockchain and bitcoin, Alekh Meka, Devansh Patel
- Dec 2: Consensus and cryptocurrencies, Polly Planinsek, Theodore Planinsek
- Dec 3 (Fri), 12 noon: Term paper PDF due in Gradescope
- Final as per university timetable,
Wednesday Dec 8, 3:30pm - 5:30pm, in Gradescope. Same modality as Midterm.
Syllabus: Midterm syllabus + Chapters 12.1-12.4
Face Masks:
Masks covering both the mouth and nose must be worn at all times by all students, faculty, and staff while inside any campus building regardless of vaccination status. If you do not wear a mask, you will be asked to leave the classroom and will not be allowed back in class unless or until you wear a mask.
If you have forgotten your mask, you may pick one up from one of the student information desks on campus during the first two weeks of campus. Students who do not comply with the mask-wearing policy will be reported to the Dean of Students. Eating and drinking are not allowed in classrooms.