
Instructor

  • who: Michael Swift
  • where: Room 7369
  • when: Wed. 2:30-3:30, Thu. 1:30-2:30
  • email: swift 'at' cs.wisc.edu

Lecture:

  • when: Tues./Thur. 11-12:15
  • where: Chemistry 1351
  • list: compsci739-1-s12 'at' lists.wisc.edu


Key-ValueStore

The goal for this project is to implement a reliable distributed system.

You should do this project in groups of 3 or 4.

The service is due on Thursday, March 24th, and test results are due on March 29th.

Updates

  • The first phase of the project is due at Midnight on March 24th.
  • The protocol will be changed to use fixed port numbers.
  • To turn in your project, please send the instructor email with a location (AFS path or URL) where your project can be downloaded, with scripts & writeup.
  • Please finalize changes to the specification by Friday, March 11th at 5 pm.
  • I've put a list of the tests we used last year under the "Testing" section.
  • Many programs last year failed under some of these tests, so they were neither reliable, available, nor consistent.
  • There are virtual machine images and scripts available in ~cs739-1/public/projects/p1

Service

The service to provide is a simple key-value store, where keys and values are strings. You will use the HTTP protocol for access to the service. Your service should be as consistent as possible, so that requesting a value returns the most recently set value as often as possible. Furthermore, your service should tolerate failures of a process, node, or network.

The service should run on between 1 and 4 nodes, and should continue to provide service when only some of the nodes are available. You can assume the complete set of nodes is provided when your service starts. A client should be able to request service from any machine running your service.

You can assume that at least one node will always be running, so you do not need to store data persistently (unless you want to). You can also assume that the total size of all the data will be small (less than 10 megabytes).
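As one illustration, a single-node, in-memory version of such a store can be sketched with Python's standard library. The actual HTTP interface is defined on the specification page; treating the URL path as the key, as done here, is just an assumption for this sketch.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# In-memory store; a real submission would replicate this across nodes.
store = {}

class KVHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        key = self.path.lstrip("/")
        if key in store:
            body = store[key].encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def do_PUT(self):
        key = self.path.lstrip("/")
        length = int(self.headers.get("Content-Length", 0))
        store[key] = self.rfile.read(length).decode()
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):
        pass  # suppress per-request logging

def run(port):
    """Start the server on a background thread and return it."""
    server = HTTPServer(("127.0.0.1", port), KVHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A start/configure script would launch one such process per node; the replication between nodes is the interesting part and is entirely up to your design.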

Fault tolerance

The goal of this work is to get experience implementing consistent, fault-tolerant services. Hence, the primary criterion for your work is whether your service can continue to provide consistent results in the presence of failures.

To implement this, you will need to implement some form of replication, which ensures that data is available even when one of the machines goes down.
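One possible replication design (an illustration, not a requirement) is quorum-based: each of N replicas stores a (version, value) pair per key, a write needs acknowledgments from W replicas, and a read queries R replicas and returns the highest-versioned value. Choosing R + W > N guarantees every read quorum overlaps the latest successful write. A minimal sketch of that logic:

```python
class Replica:
    """One replica's state; `alive` simulates node failure."""
    def __init__(self):
        self.data = {}          # key -> (version, value)
        self.alive = True

    def write(self, key, version, value):
        if not self.alive:
            raise ConnectionError("replica down")
        cur = self.data.get(key, (0, None))
        if version > cur[0]:    # keep only the newest version
            self.data[key] = (version, value)

    def read(self, key):
        if not self.alive:
            raise ConnectionError("replica down")
        return self.data.get(key, (0, None))

def quorum_put(replicas, key, version, value, W):
    acks = 0
    for r in replicas:
        try:
            r.write(key, version, value)
            acks += 1
        except ConnectionError:
            pass
    if acks < W:
        raise RuntimeError("write quorum not reached")

def quorum_get(replicas, key, R):
    results = []
    for r in replicas:
        try:
            results.append(r.read(key))
        except ConnectionError:
            pass
    if len(results) < R:
        raise RuntimeError("read quorum not reached")
    return max(results)[1]      # value with the highest version wins
```

With N=4, W=3, R=2, a read still returns the latest write after one node fails; tuning W and R is one way to trade off consistency against availability.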

Here is a survey paper on replication, and a PowerPoint presentation on replication in databases.

You may also want to implement some kind of failure detector, so your service knows when its peers are unavailable. Here is a bibliography of failure detectors.
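The simplest failure detector is heartbeat-based: each node periodically pings its peers, and a peer is suspected once no heartbeat has arrived within a timeout. A minimal sketch (the timeout value and peer names are placeholders):

```python
import time

class FailureDetector:
    """Suspect any peer whose last heartbeat is older than `timeout` seconds."""
    def __init__(self, peers, timeout=2.0):
        now = time.monotonic()
        self.last_seen = {p: now for p in peers}
        self.timeout = timeout

    def heartbeat(self, peer):
        # Called whenever a heartbeat message arrives from `peer`.
        self.last_seen[peer] = time.monotonic()

    def suspected(self):
        now = time.monotonic()
        return [p for p, t in self.last_seen.items()
                if now - t > self.timeout]
```

Note that a timeout-based detector cannot distinguish a crashed peer from a partitioned or merely slow one, which is exactly the tension the partition tests below will expose.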

You may find it helpful to read about Amazon's Dynamo system.

The four servers all run in distinct virtual machines. The client runs on a standard CSL workstation. The client and the four VMs are the only machines in the system.

Specification

The specification of the service is on the Project 1 Specification page. This will initially be a proposed specification, in that the class may edit it to converge on a better one.

You can assume that:

  • Keys are valid printable strings without special characters, at most 128 characters in length. They cannot include the "[" or "]" characters.
  • Values are valid printable strings without special characters (or UUencoded), at most 2048 characters in length. They cannot include the "[" or "]" characters.
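A server may still want to reject malformed input defensively. A sketch of validation matching the constraints above (interpreting "printable" as Python's `string.printable`; the exact rules may change as the class revises the specification):

```python
import string

def valid_key(key):
    """Key: printable, 1-128 characters, no '[' or ']'."""
    return (0 < len(key) <= 128
            and all(c in string.printable for c in key)
            and "[" not in key and "]" not in key)

def valid_value(value):
    """Value: printable, at most 2048 characters, no '[' or ']'."""
    return (len(value) <= 2048
            and all(c in string.printable for c in value)
            and "[" not in value and "]" not in value)
```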

Failures

Your service should tolerate:

  • System crash failures by stopping/unplugging the machine running your service
  • Process crash failures by killing the processes implementing your service
  • Network failures where the network connecting two machines is cut (virtually or physically).

Implementation

You may use any implementation language for the server and client, as long as it implements the required command-line interface and protocol. For example, you can use regular sockets, RPC, or Google's protocol buffers for communication between your servers, or use an existing HTTP implementation. We will provide a simple implementation of a web server from which you may start, if you wish.

You should write a script to start/configure your server.

Testing platform

You should test your code using VMware or QEMU. To simulate multiple physical machines, you can run multiple virtual machines simultaneously.

Hints

Here are some standard distributed system techniques you may want to look at:

Testing

In addition to developing the service, you will also develop tests that should work with any service that implements the specification. Each group will be asked to test the services of a few other groups.

Your test should exercise one (or more) of these properties of the service:

  • Latency - how fast can it reply to a single request
  • Throughput - how many requests per second can it serve
  • Consistency - is the data returned always the most up-to-date version of the data?
  • Availability - is the service always available or does it return failures?
  • Partition tolerance - if the nodes of the service are partitioned, does the service still provide consistency/availability?

Please create a test for at least one non-performance metric.

In testing, you should invoke your client program from a script. This program should try to expose the limits of the service. You can assume that some of the virtual machines/servers will be killed or partitioned during the test. It should print out information about availability (how long it takes to reply when there is a failure) and consistency (when results are not strictly consistent).
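One way to collect the availability numbers is to wrap every client request in a timer and record failures alongside latencies. A sketch (here `request_fn` is a placeholder for your client's GET/PUT call):

```python
import time

def timed_request(request_fn, *args):
    """Run one request; return (succeeded, result, latency_seconds)."""
    start = time.monotonic()
    try:
        result = request_fn(*args)
        ok = True
    except Exception:
        result, ok = None, False
    return ok, result, time.monotonic() - start

def availability_report(samples):
    """Summarize a list of (ok, result, latency) tuples."""
    latencies = [lat for ok, _, lat in samples if ok]
    failures = sum(1 for ok, _, _ in samples if not ok)
    return {
        "requests": len(samples),
        "failures": failures,
        "max_latency": max(latencies) if latencies else None,
    }
```

The max latency is the interesting number here: a service that blocks for many seconds during a failure is technically "available" but scores poorly on responsiveness.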

These are the tests that were used last year:

  • Throughput - Randomly read/write 100 keys to different servers 500-5000 times from 10 processes
  • Base consistency - Write 40 keys in order to different servers 500-1000 times. Results are consistent if there is a total order of requests: the chain of (old value, new value) pairs contains all values written.
  • Failure - Same as base consistency tests, but kill one node partway through. All requests are sent to the first two servers and the 4th one is killed.
  • Manual partition - Insert a key, verify it has the same value when read with 2 different servers provided at the command line. Then create a partition, and update the value at servers on either side of the partition. Check if reads to those servers return the same value or not. Unpartition the servers, and then read to see if the values have converged.
  • Stress partition - Base consistency test, but partition servers partway through and then unpartition. Clients can see all nodes but servers are partitioned.
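The base-consistency criterion above can be checked mechanically: each write records the (old value, new value) pair it observed, and the history is consistent if those pairs form a single chain covering every value written. A sketch of such a checker (an illustration; the actual test harness is up to you):

```python
def forms_total_order(pairs, initial=None):
    """Return True iff the (old, new) pairs chain into one total order
    starting from `initial` and covering every pair."""
    succ = {}
    for old, new in pairs:
        if old in succ:        # two writes observed the same old value
            return False
        succ[old] = new
    # Walk the chain from the initial value; it must visit every pair.
    seen, cur = 0, initial
    while cur in succ:
        cur = succ[cur]
        seen += 1
    return seen == len(pairs)
```

For example, observations (None, "a"), ("a", "b"), ("b", "c") chain cleanly, while two writes that both saw the old value None indicate a lost update.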

Scripts

  • Scripts for copying files up to the virtual machines and for launching servers are in the same directory and are named update.sh and launch.sh.
  • A script for partitioning the network is available in ~cs739-1/public/dopartition.sh. To use it, you'll have to edit the sudoers file by running sudo visudo and adding the line: %admin ALL=(ALL) NOPASSWD:/sbin/iptables at the end.

What to turn in

On the first turn-in day, turn in your service for other people to test:

  1. Your service, installed in a virtual machine (more details on how to do this later)
  2. A short (few pages) description of your system, explaining your design and how you trade off consistency, availability, partition tolerance, and performance.
  3. Your tests for testing other people's service, including a short README file explaining what it does.

On the second turn-in day, turn in the results of testing:

  1. Results of running your tests on the services of four other groups.

It may be that your tests don't work on other services, or that other groups' services don't work with your tests. You should cooperate with the other groups to make both your services and tests as useful as possible. When you turn in the results, please include a short description of changes you had to make to your tests and changes other groups had to make to their services.

Evaluation

Performance is not the primary concern of this project. We are instead interested in the ability to return consistent results under failure conditions.

  1. the availability of the service: whether your service can continue to provide consistent results in the presence of failures
  2. the consistency of your service, measured by how often it returns incorrect (out of date) values
  3. the quality of your design (did you make good choices?) and your writeup (do you explain and motivate your design choices?)

In addition, we will consider (as secondary considerations):

  1. the scalability of the service: the impact of having more copies on performance
  2. the efficiency of your service, measured by bytes/messages transmitted between servers

For the writeup, we will look for these things:

  1. Does the writeup adequately describe the project?
  2. Does the design make sense?
  3. Does the described test adequately test the guarantees of the design?
Page last modified on March 22, 2011, at 05:29 PM