Tuesday, September 20, 2011

Paxos Made Simple and Paxos Made Practical

Paxos is a distributed consensus algorithm formalized by Leslie Lamport.  The consensus problem is simple.  There are a group of processes which must agree on a single proposed value, which can be learned after the agreement.  For safety of the algorithm, only proposed valued can be chosen, only a single value can be chosen, and a learner cannot learn a value unless it was actually chosen.  For the basic paxos algorithm by Lamport, it is assumed that the set of processes do not change during the execution.  The protocol can be described as a 2 phase protocol.  It is different from 2 phase commit because 2PC requires all processes to respond, but paxos is based on majorities, which improves availability and response times.  In the first phase, proposers send PREPARE(n) messages to acceptors, and acceptors return promises not to accept anything less than n.  acceptors also return the value of the highest accepted proposal, with number less than n.  If the proposer gets promises from a majority of acceptors, it then sends ACCEPT(n, v) messages to acceptors, where v is the value of the highest proposal from acceptors, or a new value (if there was no accepted proposal yet).  This type of protocol ensures safety of the consensus, and with distinguished leader, can also ensure liveness.  Paxos can be used to implement the state machine distributed system model efficiently, because the leader can get phase 1 messages for an indefinite number of instances, so it only needs phase 2 messages to commit values.

"Paxos made practical" does not assume the acceptors are fixed and that they can fail or more can be added.  The set of cohorts in the group is maintained by paxos as well.  The view is the configuration of the cohorts, and a new view is agreed upon by using a paxos instance for a new view.  Using this protocol, new cohort configurations can be reliably modified to adapt to the changing environment.

Paxos is the standard distributed consensus algorithm for distributed systems today.  Paxos is used mostly for small amount of metadata or configuration information, but lately more systems have been using Paxos for more datastores to achieve consistency across several different nodes.  I think in the future, paxos will be used more often to achieve consistency across nodes, as well as a way to replicate data across nodes.  Paxos will be used to achieve different semantics along the consistency spectrum.  Since paxos is a quorum based protocol, unlike 2PC, better availability can be provided by the protocol.  By using paxos or some variation, different points in the CAP tradeoffs.  These different types of semantics will be useful for the different types of system requirements.  There may even be an asymmetric protocol where the different latencies between different nodes will be taken into account, in order to improve performance or provide certain guarantees.

No comments:

Post a Comment