courses/dss/raft/README.md
This is a series of labs on a key/value storage system built with the Raft consensus algorithm. These labs are derived from the lab2:raft and lab3:kvraft from the famous MIT 6.824 course but rewritten in Rust. The following text material is also very influenced by the text material there.
In these labs you will first implement the raft consensus algorithm in the lab2:raft, and then build a key/value service in the lab3:kvraft.
Raft is a consensus algorithm that is designed to be easy to understand.You can read material about the Raft itself at the Raft site, including the extended Raft paper, an interactive visualization of the Raft, and other resource. Those material should be helpful for you to complete this lab.
First, please clone this repository with git to get the source code of the labs.
Then, make sure you have rustup installed. Also, to make things simpler you
should have make installed.
Now you can run make test_others to check that things are going right. You
should see all tests passed.
(If you are a Windows user, you may have to figure out how to use make on
Windows, or type the commands from the makefile in the console manually, or just
use the Windows Subsystem for Linux)
In this lab you will implement the Raft consensus algorithm. This lab has 3 parts named with 2A, 2B and 2C.
To run all the test in this lab, run make test_2. Please run the tests multiple
times to mak sure your are not passing the tests just by luck.
To run just a single test, run make cargo_test_<insert test name here>.
All your codes in this lab should be in the src/proto/mod.rs,
src/proto/raft.proto and src/raft/mod.rs.
The src/raft/mod.rs file should contains your main implementation of Raft.
The tester (and your key/value server in lab3) will call methods in this file
to use your Raft module.
A service calls Raft::new to create a Raft peer, and then calls Node::new to
start the Raft peer. The Node::get_state, Node::is_leader and Node::term
will be called to get the current term of the node and whether it thinks it is
leader.
The Node::start will be called when the server need to append the command into
the log. Node::start should returns immediately without waiting for the log
appends to complete. A channel (apply_ch) is passed into Raft::new and you
should send a ApplyMsg to the channel for each newly committed log entry.
Your implements should use the provided labrpc crate to exchange RPCs. The
labrpc use channels to simulate sockets internally. This makes it easy to test
your code under challenging network conditions. Your definition of the RPCs
should in src/proto/mod.rs and you should implement the RPC server in
impl RaftService for Node. A set of RPC clients (peers) is pass into
Raft::new for you to send RPCs to other nodes.
In this part you should implement the leader election and heartbeats
(AppendEntries RPCs without log entries). You should make a single leader to be
elected, make leader to remain leader when there are no failures, and have a new
leader when old leader fails or packets to/from old leader is lost.
To run all the test in this lab, run make test_2a.
Here are some hints on this part:
Raft struct.request_vote RPC is already defined, you just need to fill the
RequestVoteArgs and RequestVoteReply struct. The lab use labcodec crate to
encode and decode messages in RPC, which internally use the prost external
crate. See the prost document to know how to define structs that is used
as messages with #[Derive(Message)] and #[prost(...)].append_entries RPC by yourself. labrpc use a
labrpc::service! macro to define RPC service and generate server and client
traits from your definition. There is an example in labrpc/examples/echo.rs
which may help you to define new RPCs.futures external crate heavily like the channels
and the Future trait. Read things about futures here.std::thread::sleep (doc), but you
can also use the futures_timer::Delay and other utilities from the
futures-timer external crate (doc).rand external crate (doc).log crate (doc) to print logs in different levels. you can configure
the log level and scope by set LOG_LEVEL environment variable like
LOG_LEVEL=labs6824=debug make test_2a. This feature is provided by the
env_logger external crate (doc), read it's documentation for
syntax of the log level. Also, you can collect the output in a file by redirecting
the output to a file like make test_2a 2>test.logIn this part you should implement the log replication. You should implement the
Node::Start method, complete the rest fields in the append_entries RPC and
send them, and advance commit_index at leader.
To run all the test in this lab, run make test_2b. You can try to pass the
test_basic_agree_2b test first.
Here are some hints on this part:
apply_ch in the
correct order. apply_ch is a UnboundedSender which will buffering the messages
until out of memory, so it is not that easy to create deadlocks as the original
go version.test_count_2b requires your number of RPCs to be not too much when there
is no failure. So you should optimize the number of RPCs to the minimum.In this part you should implement persistence by first adding code that saves and
restores persistent state to Persister, like in Raft::persist and
Raft::restore using labcodec. You also need to determine what and when to
persist, and call Raft::restore in Raft::new.
To run all the test in this lab, run make test_2c. You can try to pass the
test_persist1_2c test first.
Here are some hints on this part:
labcodec is covered in the hints of part 2A.In this lab you will build a fault-tolerant key-value storage service using the Raft module in lab 2. This lab has 2 parts named with 3A and 3B.
To run all the test in this lab, run make test_3. Please run the tests multiple
times to mak sure your are not passing the tests just by luck.
To run just a single test, run make cargo_test_<insert test name here>.
All your codes in this lab should be in the src/proto/mod.rs, src/proto/kvraft.proto,
src/kvraft/server.rs and src/kvraft/client.rs. The file name explains what
they are. Also, you need to modify the files you touched in lab2:raft.
In this part you should first implement a solution that works when there are no
dropped messages, and no failed servers. Your service must ensure that get(...)
and put_append(...) are linearizable.
That means, completed application calls to the methods on the Clerk struct in
src/kvraft/client.rs must appear to all clients to have affected the service in
the same linear order, even in there are failures and leader changes. A
Clerk::Get that starts after a completed Clerk::Put or Clerk::Append should
see the value written by the most recent Clerk::Put or Clerk::Append in the
linear order. Completed calls should have exactly-once semantics.
A reasonable plan of implementation should be:
src/kvraft/client.rsraft::Node::start in the
RPC handler of KvServerAfter implement that you should pass the basic one client test, run
make cargo_test_basic_3a to check it.
Here are some hints on this part:
apply_ch at the same time you
receives RPC.Raft::start for a Clerk's RPC, but loses its leadership
before the request is committed to the log, you should make the client to re-send
the RPC request to other servers until it finds the new leader. You can detect
this by check things received from apply_ch.Clerk client should remember who is the last leader, and try the last
leader first. This will avoid wasting time searching for the leader on every RPC.get RPC if it is not part of a majority and
do not has up-to-date data. You can just put the get operation into the log, or
implement the optimization for read-only operations that is described in Section 8
in the Raft paper.Then, you should deal with duplicate client requests, including situations where the client sends a request to a server leader in one term, times out waiting for a reply, and re-sends the request to a new leader in another term. The request should always execute just once.
After this you should pass all tests in this part. To run all the test in
this lab, run make test_3a.
Here are some hints on this part:
In your current implementation, a rebooting server replays the complete Raft log in order to restore its state. However, it's not practical for a long-running server to remember the complete Raft log forever.
Instead, you'll modify Raft and kvserver to cooperate to save space: from time to time kvserver will persistently store a "snapshot" of its current state, and Raft will discard log entries that precede the snapshot. When a server restarts (or falls far behind the leader and must catch up), the server first installs a snapshot and then replays log entries from after the point at which the snapshot was created. Section 7 of the Raft paper outlines the scheme; you will have to design the details.
The tester passes a maxraftstate to the KvServer::new indicating the maximum
allowed size of your persistent Raft state in bytes (including the log, but not
including snapshots). You should check the size of Raft state and when Raft state
size is approaching this threshold, it should save a snapshot, and tell the Raft
library that it has snapshotted, so that Raft can discard old log entries.
The maxraftstate is a Option<usize> and you do not have to snapshot when it
is None.
First you should modify the Raft implement to accept a compaction request and discard entries before the given index, and continue operating while storing only log entries after that index. The tests from lab2:raft should still pass.
Then you should modify the kvserver so that it can hands snapshots to Raft and
request compaction when Raft state grows too large. The snapshots should be saved
in raft::Persister
Here are some hints on this part:
maxraftstate
to Some(1).raft::Persister.After that, you should define the install_snapshot RPC and the leader should
send this RPC when the leader has discarded the log entries the follower needs.
When a follower receives an install_snapshot RPC, it should send the snapshot to
kvserver (maybe in apply_ch).
After this you should pass all tests in this part. To run all the test in
this lab, run make test_3b.
Here are some hints on this part:
install_snapshot RPC and that should be enough for this lab.test_snapshot_rpc_3b first