CSE 223A: Principles of Distributed Computing
Winter 2004
Professor Keith Marzullo
Class meets MWF 2:00-2:50 in WLH 2206
You will be asked to do four
homeworks, a project and a final.
Together the homework is worth one third of your grade; the project
is worth a third and the final is worth a third.
Homework:
Homework 1, due 16 January 2004.
Homework 2, due 30 January 2004.
Homework 3, due 17 February 2004.
Homework 4, due 12 March 2004.
Project:
You will need to do a class
project which is due at the end of the tenth week. You can
either work by yourself or with one other person. I'll expect more from
a team project than an individual one. Once you decide what you'd like
to do, please let me know before you start; I'd like to have the chance
to help you refine your idea.
You project can be:
- A programming project.
We're covering a lot of protocols in this course. Try implementing one
of the protocols, like Paxos, snapshots, <>S consensus,
consenusus using Indistinguishable P, or quorum update. The goal of
such a project is both to understand the protocol better and to
understand the limitations (performance, implementability) in practice.
For example, if you were to implement <>S consensus, you'd need
to build a <>S failure detector. How do you do so?
- A survey paper. We're
covering a lot of ground in this course, and so are not going into any
topic very deep. Choose some topic that interested you and try going
deeper.
- A research paper. Several
questions will come up in lecture, and you may have others during this
course. You could refine one ofthe questions and investigate it as a
research topic. One general possibility is:
- Most fault-tolerance
techniques, especially those based on active replication, require
deterministic actions. Are there some nondeterministic actions that can
be allowed? Could one transform a nondeterministic service into one
that is deterministic? Could one use an optimistic approach to detect
when replicas diverge due to nondeterminism and use a separate recovery
protocol?
Papers:
For many of these, you will need to be using a campus-linked
computer to retrieve the paper.
Syllabus (subject to change):
1/5 Introduction: distributed
computing and
fault tolerance. (ppt, pdf)
1/7 Derivation of
snapshot protocol. (ppt, pdf)
1/9 Continuation of snapshots.
Vector
clocks. (ppt, pdf)
1/12 Vector
clocks; derivation of deadlock
detection protocol.
1/14 State
machine approach. (ppt, pdf)
1/16 Synchronous
systems. Consensus for crash
failures.
n > 3t for Byz. (ppt,
pdf)
1/19
Martin Luther King day
1/21 Early
stopping consensus. Consensus for
broadcast busses. (ppt, pdf)
1/23 (continuation
of last lecture)
1/26
ACP and
consensus. 2 phase commit. (ppt,
pdf)
1/28 Traditional
nonblocking commit protocols. (ppt,
pdf)
1/30 Homework
discussion.
2/2 Group
membership in a synchronous system (ppt,
pdf).
2/4 Nondeterminism:
primary-backup protocols
(ppt,
pdf).
2/6 Message
logging.
(ppt, pdf).
2/9
More on message logging.
2/11 Invited Lecture
2/13 No class
- professor at PC meeting
2/16
President's day
2/18 Communications
induced checkpointing..
2/20 FLP (ppt, pdf).
2/23
FLP
2/25 Quorums (ppt,
pdf)
2/27 Guest
lecturer Marc Shapiro, MSR Cambridge.
3/1
Byzantine Quorum Systems (ppt,
pdf)
3/3 Group
membership
(ppt, pdf)
3/5 <>S
Consensus
(ppt, pdf)
3/8
Paxos (ppt, pdf).
3/10 Byzantine Paxos
3/12 (continuation)