Notes on Two Phase Locking and Commit Protocols

General Notes

Two phase locking is a technique used to gain ownership of shared resources without creating the possibility of deadlock. The technique is simple: it breaks the modification of shared data into "two phases", which is what gives the process its name.

There are actually three activities that take place in the "two phase" update algorithm:

  1. Lock Acquisition
  2. Modification of Data
  3. Release Locks

The modification of the data and the subsequent release of the locks that protected it are generally grouped together and called the second phase.

Two phase locking prevents deadlock from occurring in distributed systems by having a process release all the resources it has acquired whenever it cannot obtain every resource it requires without waiting for another process to finish using a lock. As a result, no process is ever holding some shared resources while waiting for another process to release a resource it requires, so deadlock cannot occur due to resource contention.

The resource (or lock) acquisition phase of a "two phase" shared data access protocol is usually implemented as a loop within which all the locks required to access the shared data are acquired one by one. If any lock is not acquired on the first attempt, the algorithm gives up all the locks it has already obtained and starts trying to acquire them all again.
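
As an illustration, the sketch below shows one way this acquisition loop might look, using Python threading locks to stand in for the distributed locks; the names acquire_all, update and modify_shared_data are illustrative and not part of any particular system.

    import random
    import threading
    import time

    def acquire_all(locks):
        # Phase one: try to take every lock without blocking; if any lock is
        # busy, give back everything already held and start again after a
        # short random pause (the "back-off and re-try" strategy).
        while True:
            held = []
            for lock in locks:
                if lock.acquire(blocking=False):
                    held.append(lock)
                else:
                    for h in held:
                        h.release()
                    time.sleep(random.uniform(0.001, 0.01))
                    break
            else:
                return held          # every lock was acquired

    def update(locks, modify_shared_data):
        held = acquire_all(locks)    # phase one: lock acquisition
        try:
            modify_shared_data()     # phase two: modify the shared data ...
        finally:
            for lock in held:
                lock.release()       # ... then release the locks

    if __name__ == "__main__":
        locks = [threading.Lock() for _ in range(3)]
        update(locks, lambda: print("shared data modified"))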

This "back-off and re-try" strategy can be a problem in distributed systems. It is not guaranteed to give access to the desired resources within a finite time. This can lead to process starvation, if a single process never acquires all the locks needed for it to continue exectuion. This is a problem for real-time systems. Consequently, two phase locking protocols cannot be used in hard realtime applications.

In addition, a more subtle problem, livelock, arises when two processes compete for locks in such a way that the pattern of back-off and re-try attempts repeatedly brings them into conflict, so that neither process ever obtains all the locks it needs to commit a transaction. One way to prevent this is to place the locks in a global order and require every process to acquire its locks in that order; this eliminates both livelock and deadlock from the algorithm.
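
A minimal sketch of ordered acquisition, again using Python threading locks; the lock names A, B and C are taken from the intersection example below, and the functions acquire_in_order and release_all are illustrative.

    import threading

    # A single global ordering of the locks, agreed by every process.
    # Here the names A, B and C come from the intersection example and the
    # order is simply alphabetical; any fixed ranking would do.
    LOCKS = {"A": threading.Lock(), "B": threading.Lock(), "C": threading.Lock()}

    def acquire_in_order(names):
        # Blocking acquisition is safe here: because every process takes its
        # locks in the same global order, no circular wait can ever form.
        for name in sorted(names):
            LOCKS[name].acquire()

    def release_all(names):
        for name in sorted(names, reverse=True):
            LOCKS[name].release()

    # The car from the South turning left would then correspond to:
    #   acquire_in_order(["A", "C"]); Turn_Left(); release_all(["A", "C"])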

Consider the example where processes gain access to the shared space in a traffic intersection using semaphores that represent ownership of spaces in the intersection. Each car has two possible direction choices as it enters the intersection, and the algorithms executed by cars entering the intersection from different directions are shown in the tables below. Refer to the picture for an image of the intersection and the allocation of compass points and semaphores to areas.


[Figure: A sample intersection]

A set of access algorithms that uses incremental locking and allows deadlock.

Car from South:

    Go_East:
        Wait(C)
        Turn_Right
        Signal(C)

    Go_West:
        Wait(C)
        Wait(A)
        Turn_Left
        Signal(A)
        Signal(C)

Car from West:

    Go_East:
        Wait(B)
        Wait(C)
        Drive_Straight
        Signal(C)
        Signal(B)

    Go_South:
        Wait(B)
        Turn_Right
        Signal(B)

Car from East:

    Go_West:
        Wait(A)
        Drive_Straight
        Signal(A)

    Go_South:
        Wait(A)
        Wait(B)
        Turn_Left
        Signal(B)
        Signal(A)

A set of access algorithms that uses incremental locking and prevents deadlock.

Car from South:

    Go_East:
        Wait(C)
        Turn_Right
        Signal(C)

    Go_West:
        Wait(A)
        Wait(C)
        Turn_Left
        Signal(A)
        Signal(C)

Car from West:

    Go_East:
        Wait(B)
        Wait(C)
        Drive_Straight
        Signal(B)
        Signal(C)

    Go_South:
        Wait(B)
        Turn_Right
        Signal(B)

Car from East:

    Go_West:
        Wait(A)
        Drive_Straight
        Signal(A)

    Go_South:
        Wait(A)
        Wait(B)
        Turn_Left
        Signal(B)
        Signal(A)

Two Phase Commit

The intention of all locking and transaction protocols is to produce an atomic update to distributed shared data, or to replicas of a shared data item. In the case of two phase commit algorithms for transactions on distributed databases, the intention is to prevent an update from being carried out on only some of the replicas, since this would make the replicas inconsistent with each other. The operation is therefore either "successful on all replicas" or "aborted".

The only way that a two phase commit can be implemented safely is for each of the replicas to have some knowledge of where all the participating entities are in the process of updating the shared information. If each entity keeps a diary (a log) of what it has been doing, then it can crash and re-start without becoming confused or allowing inconsistent data states to develop. Again, two phases are used: a voting phase, in which the coordinator asks every participant whether it can commit the transaction, and a completion phase, in which the coordinator tells every participant to commit or abort.
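
As a rough sketch of the participant's side, assuming a simple line-oriented log file; the message and record names (VOTE_COMMIT, PREPARED, and so on) are illustrative, not part of any particular middleware.

    def handle_vote_request(txn_id, log_path):
        # Record how far this replica has got *before* replying, so that a
        # crash and re-start can consult the diary instead of becoming confused.
        with open(log_path, "a") as log:
            log.write(f"PREPARED {txn_id}\n")
        return "VOTE_COMMIT"

    def handle_decision(txn_id, decision, log_path):
        # decision is the coordinator's final word, e.g. "COMMIT" or "ABORT".
        with open(log_path, "a") as log:
            log.write(f"{decision} {txn_id}\n")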

When collecting responses from the other participants, the originator of a transaction cannot wait forever for all the other databases to respond, so a timeout must be implemented in the section of the protocol that initiates a commit request.
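
A possible shape for that vote-collection step, assuming the participants are reached over already-connected sockets and a simple line-based message format; the message strings and the two second timeout are illustrative choices.

    import socket

    def collect_votes(replicas, txn_id, timeout=2.0):
        # replicas: sockets already connected to the other database servers.
        decision = "COMMIT"
        for sock in replicas:
            sock.sendall(f"VOTE_REQUEST {txn_id}\n".encode())
        for sock in replicas:
            sock.settimeout(timeout)        # do not wait forever for a reply
            try:
                reply = sock.recv(1024).decode().strip()
            except socket.timeout:
                reply = "VOTE_ABORT"        # treat a silent replica as a "no"
            if reply != "VOTE_COMMIT":
                decision = "ABORT"
        return decision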

Hints

To implement a two phase protocol of this type you need to register all the entities with one another, so that all the participants in the protocol are known. Since the practical exercise does not require the middleware to extend the protocol during execution, you should be able to set up a registry of the known database copies at startup. This will probably be done by recording the socket ids that are used to communicate with the other database servers.
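
For example, a startup registry might look something like the sketch below, assuming the addresses of the other database servers are known in advance; the host names and port numbers are placeholders.

    import socket

    # Addresses of the other database servers, fixed at startup; the host
    # names and ports below are placeholders for your own configuration.
    REPLICA_ADDRESSES = [("db-copy-1", 9001), ("db-copy-2", 9002)]

    def build_registry():
        registry = {}
        for host, port in REPLICA_ADDRESSES:
            sock = socket.create_connection((host, port))
            registry[(host, port)] = sock   # the socket used to reach this participant
        return registry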

Your initial test of a commit process can be conducted using simple processes which implement only the basic algorithm and only update the log files with appropriate messages. This means that you can test the protocol initially without having to worry about the updates on the back-end databases. Once you have a working protocol, you can add the operations that change the data in the back-end database servers.
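
A stub participant for this kind of early test might be as small as the following sketch; the message names and log file name are illustrative.

    def stub_participant(txn_id, message, log_path="stub.log"):
        # No back-end database update yet: only record the protocol traffic.
        with open(log_path, "a") as log:
            log.write(f"transaction {txn_id}: received {message}\n")
        if message == "VOTE_REQUEST":
            return "VOTE_COMMIT"            # always vote yes while testing
        return "ACK"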

Important References