Consistency Models and the WWW

Arnold Pears
arnoldp@DoCS.uu.se
Uppsala University
SWEDEN


  1. Introduction
    The basic memory or data consistency problem arises in parallel computer architecture design. The two common architectures used for such machines are shown in Figures 1 and 2. There are several interesting issues, including how to program such computers, and what implications such designs have for maintaining "consistent" data in the caches and main memories of all the participating computers.

    The problem occurs when the processing elements (PEs) of the machine are able to communicate with a shared logical or physical memory space independently of each other. Because a PE cannot observe the memory transactions of the other PEs, its cache can hold a value that was loaded from memory but has since been changed by another PE. Consistency protocols are sets of hardware rules and operating system algorithms that try to minimise occurrences of the situations that cause cache data to become inconsistent with the data in other caches and in the main memory store.
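
    The situation can be made concrete with a small sketch. The following Python fragment is purely illustrative (the class and address names are invented for this example): it models two processing elements with private caches and no consistency protocol at all, so a cached copy simply goes stale when the other PE writes.

    # Minimal sketch (not any real protocol): two processing elements with private
    # caches over a shared memory, showing how a cached copy goes stale when no
    # consistency protocol is in place. All names are illustrative.

    class Memory:
        def __init__(self):
            self.cells = {}

        def load(self, addr):
            return self.cells.get(addr, 0)

        def store(self, addr, value):
            self.cells[addr] = value

    class PE:
        """A processing element with a private cache and no coherence support."""
        def __init__(self, name, memory):
            self.name = name
            self.memory = memory
            self.cache = {}          # addr -> value, filled on first load

        def read(self, addr):
            if addr not in self.cache:           # cache miss: fetch from memory
                self.cache[addr] = self.memory.load(addr)
            return self.cache[addr]              # a cache hit may return a stale value

        def write(self, addr, value):
            self.cache[addr] = value             # write to own cache ...
            self.memory.store(addr, value)       # ... and through to memory

    mem = Memory()
    p0, p1 = PE("PE0", mem), PE("PE1", mem)

    p0.read(0x10)        # PE0 caches the initial value 0
    p1.write(0x10, 42)   # PE1 updates the location
    print(p0.read(0x10)) # PE0 still sees 0: its cached copy is now inconsistent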

    Figure 1 shows the standard design of a shared memory parallel computer, in which all the processing elements have access to a single shared data space that is used to store the program and the data (program variables and dynamic data structures). Such machines have been popular since the early 1980s, and the problems of creating scalable designs have been widely studied.

    A large number of models for how the caches and the memory should be kept the same (consistent) have been proposed. These protocols have been discussed in a number of IEEE and other computer science research journals and are too complicated to present in full here. The important thing to realise about them is that they all aim to make the cache and memory contents the same for all relevant addresses whenever this is necessary for program correctness. The difference between the protocols lies in the cost of creating that consistency: some schemes enforce consistency all the time, while others (more efficient ones) operate only when absolutely required.


    Figure 1: Shared Data Space Parallel Architecture

    The structure shown in Figure 2 is also a popular computer design. Here, however, the memory attached to each processing element is private and can only be accessed by the attached processor. Other processors must communicate with the owner of a memory address to access its contents, which means that local copies can be requested and stored by any of the nodes. This is shown in the diagram, where all the caches store a value obtained from a single node's memory. In this type of computer the consistency of the data is often the responsibility of the program.

    One variant of this second design implements a logical shared address space on top of the physically distributed hardware that is depicted. In this type of design, called Distributed Shared Memory (DSM), the memory subsystem, the operating system and the program cooperate to make sure that data is consistent at the times when consistency is necessary for program correctness.
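
    As a rough illustration of this second organisation, the sketch below (again purely illustrative, with an assumed "ownership by address interleaving" rule that is not taken from the text) models nodes that own disjoint parts of the address space and cache copies of remote addresses on demand; keeping those copies consistent is exactly the job the DSM layer has to do.

    # Illustrative sketch only: a handful of nodes, each owning part of the address
    # space. A read of a remote address is satisfied by asking the owning node and
    # caching the reply locally. Names and the ownership rule are assumptions.

    class Node:
        def __init__(self, node_id, nodes):
            self.node_id = node_id
            self.nodes = nodes        # shared list of all nodes in the machine
            self.local_mem = {}       # the portion of the address space this node owns
            self.cache = {}           # cached copies of remote addresses

        def owner(self, addr):
            return self.nodes[addr % len(self.nodes)]   # assumed interleaved ownership

        def read(self, addr):
            home = self.owner(addr)
            if home is self:
                return self.local_mem.get(addr, 0)
            if addr not in self.cache:                  # first remote read: fetch a copy
                self.cache[addr] = home.local_mem.get(addr, 0)
            return self.cache[addr]

        def write(self, addr, value):
            home = self.owner(addr)
            if home is self:
                self.local_mem[addr] = value            # other nodes' cached copies go stale here
            else:
                home.write(addr, value)

    nodes = []
    for i in range(4):
        nodes.append(Node(i, nodes))

    nodes[0].write(8, 7)       # address 8 is owned by node 0 (8 % 4 == 0)
    print(nodes[3].read(8))    # node 3 fetches and caches a copy: 7
    nodes[0].write(8, 9)
    print(nodes[3].read(8))    # still 7: keeping this copy consistent is the DSM system's job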


    Figure 2: Private Data Space Parallel Architecture

  2. Consistency
    One major problem in implementing protocols that ensure data consistency in any architecture is keeping track of where all the cached copies of a memory location might be. This problem is illustrated in Figure 1, where the yellow memory location is duplicated in two caches. If either of the processors in the figure updates that variable, the values stored in the yellow locations will no longer be the same. The caches and memory can be made the same again, but doing so takes time and requires information about where the copies are located.

    There are two major methods for achieving consistency between the caches and the memory: either the caches can be updated so that they hold the new value of the memory location (update consistency), or the value of the address held in the cache can be discarded (invalidation consistency). Each of these relies on being able to locate all the places where a copy of the address is stored.
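
    The two options can be sketched with a simple directory that records which caches hold a copy of each address. The code below is an illustration of the idea only, not a description of any particular machine's protocol; all names are invented for the example.

    # Hedged sketch of update vs. invalidation consistency, using a directory that
    # tracks which caches hold a copy of each address.

    class DirectoryMemory:
        def __init__(self):
            self.cells = {}
            self.sharers = {}                 # addr -> set of caches holding a copy

        def load(self, addr, cache):
            self.sharers.setdefault(addr, set()).add(cache)
            return self.cells.get(addr, 0)

        def store(self, addr, value, writer, policy="invalidate"):
            self.cells[addr] = value
            for cache in self.sharers.get(addr, set()):
                if cache is writer:
                    continue
                if policy == "update":        # update consistency: push the new value
                    cache.lines[addr] = value
                else:                         # invalidation consistency: discard the copy
                    cache.lines.pop(addr, None)
            if policy == "invalidate":
                self.sharers[addr] = {writer}

    class Cache:
        def __init__(self, memory):
            self.memory = memory
            self.lines = {}

        def read(self, addr):
            if addr not in self.lines:
                self.lines[addr] = self.memory.load(addr, self)
            return self.lines[addr]

        def write(self, addr, value, policy="invalidate"):
            self.lines[addr] = value
            self.memory.store(addr, value, writer=self, policy=policy)

    mem = DirectoryMemory()
    c0, c1 = Cache(mem), Cache(mem)
    c0.read(0x10); c1.read(0x10)              # both caches now share address 0x10
    c0.write(0x10, 5, policy="invalidate")    # c1's copy is discarded
    print(c1.read(0x10))                      # miss, re-fetched from memory: 5
    c0.write(0x10, 6, policy="update")        # c1's copy is refreshed in place
    print(c1.read(0x10))                      # hit, already holds the updated value: 6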

    Another major problem is the question: "what can the computer do while the caches and memory are being brought into a consistent state?". In some cases nothing; in other cases the computer can continue to execute the program, knowing that the use of possibly incorrect values will not cause a problem. This situation can be exploited to increase the efficiency of program execution.

    An Overview of Consistency Models

    To obtain performance improvements, relaxed consistency protocols try to place fewer constraints on the allowable interleavings of data operations. In other words, they enforce consistency throughout the memory system only when required. Figure 3 shows the execution associated with three write operations, a barrier and a read operation under the SC, PC, WC and FC protocols. SC stalls the architecture to prevent processors from reading stale values. Adding write pipelines in PC allows processing elements to overlap execution with consistency activity; this overlap between consistency hardware activity and program execution is shown using two layers in the diagram. In WC three layers of concurrent activity are possible: the top layer is program execution, the second is processor consistency hardware, and the third is network protocol activity.

    Figure 3: Efficiency of Consistency Schemes

    Processor Consistency (PC)
    Of the three major approaches to relaxed cache consistency, processor consistency is the most restrictive. Processor consistency uses write pipelining to overlap writes to memory with processor execution, which is particularly beneficial when memory latency is high. As defined by Goodman [Good89], the writes issued by a single processor are performed in order (as in sequential consistency) and are observed in that order by all processors; however, the observed interleaving of write accesses issued by two different processors may differ from processor to processor.
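
    A minimal sketch of the write pipelining idea, assuming a simple FIFO write buffer per processor, is given below. It is not a full PC implementation; it only shows that the processor continues without stalling while its writes drain to memory strictly in issue order.

    # Rough illustration of write pipelining: the processor keeps executing while
    # buffered writes drain to memory in the order they were issued.

    from collections import deque

    class PipelinedWriter:
        def __init__(self, memory):
            self.memory = memory
            self.write_buffer = deque()     # FIFO preserves the processor's issue order

        def write(self, addr, value):
            self.write_buffer.append((addr, value))   # issue and continue, no stall

        def drain_one(self):
            """Performed later by the memory system, one write per step, in order."""
            if self.write_buffer:
                addr, value = self.write_buffer.popleft()
                self.memory[addr] = value

    memory = {}
    p = PipelinedWriter(memory)
    p.write("x", 1)
    p.write("y", 2)      # issued immediately after x; the processor never stalled
    p.drain_one()
    print(memory)        # {'x': 1}: other processors see x before y, never y before x
    p.drain_one()
    print(memory)        # {'x': 1, 'y': 2}
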
    Weak Consistency (WC)
    Weak consistency recognizes that program correctness only depends on the temporal order of critical data accesses. As originally defined by Dubois et al. [Dubo86], weak consistency enforces the following restrictions:
    1. accesses to synchronization variables are sequentially consistent,
    2. no access to a synchronization variable is issued by a processor before all previous data accesses have been performed, and
    3. no access is issued by a processor before a previous access to a synchronization variable has been performed.
    A program executing on a weakly consistent memory model will appear sequentially consistent (SC) if
    1. there are no competing accesses;
    2. synchronization is visible to the memory system hardware.
    Weak consistency is less restrictive than processor consistency because it recognizes that a parallel program can define its own consistency requirements using synchronization primitives. Weak consistency also takes advantage of write pipelining, and processors only need to stall when a synchronization point is reached.
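
    The three rules above can be illustrated with a toy model in which ordinary data accesses may be left outstanding, while a synchronization access acts as a fence that first completes everything issued before it. This is a sketch under that simplified model, not real consistency hardware; all names are illustrative.

    # Toy model of the weak consistency rules listed above.

    from collections import deque

    class WeaklyConsistentProcessor:
        def __init__(self, memory):
            self.memory = memory
            self.outstanding = deque()        # data accesses issued but not yet performed

        def data_write(self, addr, value):
            # Ordinary data accesses can be left outstanding (not yet performed).
            self.outstanding.append((addr, value))

        def synchronize(self, lock_name):
            # Rule 2: the sync access waits until all previously issued data
            # accesses have been performed, so the queue is drained first.
            while self.outstanding:
                addr, value = self.outstanding.popleft()
                self.memory[addr] = value
            # Rule 1: accesses to synchronization variables themselves are kept
            # sequentially consistent (modelled here as an immediate update).
            # Rule 3 holds trivially: synchronize() completes before returning,
            # so no later access is issued while a sync access is pending.
            self.memory[lock_name] = "released"

    memory = {}
    p = WeaklyConsistentProcessor(memory)
    p.data_write("a", 1)
    p.data_write("b", 2)
    print(memory)            # {}: ordinary writes may still be in flight
    p.synchronize("lock0")
    print(memory)            # {'a': 1, 'b': 2, 'lock0': 'released'}: all visible at the sync point
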
    Release Consistency (RC)
    Release consistency, defined by Gharachorloo et al. [Ghar90], builds on weak consistency in that it uses knowledge about data operations to maintain coherence only when necessary. RC categorises each program access as an Acquire, a Release, or a Non-Synchronizing access. An Acquire delays only future accesses, while a Release delays until all previous accesses have been performed. This allows limited overlap of execution with critical sections, which was not possible under the more restrictive WC model.
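
    The acquire/release distinction can be sketched using the same simplified "outstanding write queue" model as in the weak consistency example: only a release has to wait for earlier accesses, while an acquire merely gates the accesses that follow it. Names are illustrative.

    # Sketch of the acquire/release distinction under the simplified queue model.

    from collections import deque

    class ReleaseConsistentProcessor:
        def __init__(self, memory):
            self.memory = memory
            self.outstanding = deque()

        def data_write(self, addr, value):
            self.outstanding.append((addr, value))

        def acquire(self, lock_name):
            # An acquire only delays *future* accesses: it does not have to wait
            # for earlier writes, so the outstanding queue is left untouched.
            self.memory[lock_name] = "held"

        def release(self, lock_name):
            # A release must wait until all previous accesses have been performed.
            while self.outstanding:
                addr, value = self.outstanding.popleft()
                self.memory[addr] = value
            self.memory[lock_name] = "free"

    memory = {}
    p = ReleaseConsistentProcessor(memory)
    p.data_write("before", 1)
    p.acquire("lock0")          # earlier write may still be outstanding: overlap is allowed
    p.data_write("inside", 2)
    p.release("lock0")          # everything issued before the release is now visible
    print(memory)               # {'lock0': 'free', 'before': 1, 'inside': 2}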

  3. Consistency for the WWW
    Why does the WWW need consistency?

    In many situations WWW information is of a transient and time-critical nature: the relevance of stock market data, weather reports or news presentations on the WWW is short-lived. In addition, users want access to the most up-to-date versions of WWW content. Current approaches to reducing bandwidth and access delay for Internet content are based on duplicating content in proxies and browser caches, and a large portion of this activity is both invisible to and uncontrollable by the end user. This presents a significant problem for users who rely on the Web for time-critical data.

    Applying Consistency to the WWW

    How are traditional parallel computer consistency protocols relevant to developing consistency approaches for the WWW?

    Consider how caches are used, and how consistency protocols (algorithms) work to make the contents of the caches and the original item consistent with one another. If we draw on the ideas of updating and invalidating copies that have been developed over the last 20 years for multiprocessor computers, can this knowledge help us solve the problems that arise when we try to distribute and cache WWW information using proxies and browser caches?

    To do this we have to look at how the existing algorithms might be applied to maintaining and replicating information in an address space that is constructed in a more "ad hoc" manner. We can view URLs as addresses and use the caches in proxies and browsers to replicate these items for local use.
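
    A hedged sketch of this analogy is given below: URLs play the role of addresses and a proxy cache holds local copies. Because an origin server cannot, in general, reach out and invalidate or update every proxy and browser copy, the sketch falls back on a per-entry time-to-live combined with a version check against a simulated origin (loosely analogous to HTTP revalidation). It is an illustration of the idea, not an implementation of the actual HTTP caching rules.

    # Toy proxy cache: URLs as addresses, copies kept fresh by expiry plus a
    # version check against a simulated origin. Everything here is illustrative.

    import time

    class Origin:
        """Stands in for an origin web server holding the authoritative copies."""
        def __init__(self):
            self.documents = {}          # url -> (version, body)

        def publish(self, url, body):
            version, _ = self.documents.get(url, (0, None))
            self.documents[url] = (version + 1, body)

        def fetch(self, url, cached_version=None):
            version, body = self.documents[url]
            if cached_version == version:
                return version, None     # like "not modified": the cached copy is still valid
            return version, body

    class ProxyCache:
        def __init__(self, origin, ttl_seconds=60.0):
            self.origin = origin
            self.ttl = ttl_seconds
            self.entries = {}            # url -> (version, body, expires_at)

        def get(self, url):
            now = time.time()
            entry = self.entries.get(url)
            if entry and now < entry[2]:
                return entry[1]                                   # fresh copy: serve locally
            cached_version = entry[0] if entry else None
            version, body = self.origin.fetch(url, cached_version)
            if body is None:
                body = entry[1]                                   # revalidated: reuse stored body
            self.entries[url] = (version, body, now + self.ttl)
            return body

    origin = Origin()
    origin.publish("/news", "headline v1")
    proxy = ProxyCache(origin, ttl_seconds=0.0)   # ttl of zero forces revalidation on each request
    print(proxy.get("/news"))                      # headline v1
    origin.publish("/news", "headline v2")
    print(proxy.get("/news"))                      # revalidation detects the change: headline v2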

    However, the architecture that we are confronted with is not completely analogous to either of the two hardware models shown in Figures 1 and 2. In fact the architecture is rather more complex, since several layers of caching introduce the notion of a cache hierarchy in the WWW. Other problems arise as well.

    An overview of the WWW architecture is shown in Figure 4, and the cache hierarchy used in the remaining discussion is presented in Figure 5. The combination of these two allows us to discuss the problem of developing consistency in the WWW.

    Figure 4: WWW Architecture

    Traditional caching approaches attempt to gain performance improvements based on two observations about program behaviour: temporal locality (a recently accessed item is likely to be accessed again soon) and spatial locality (items near a recently accessed item are likely to be accessed next).


    Figure 5: WWW Cache Hierarchy

  4. References
    [Dubo86] M. Dubois, C. Scheurich, and F. A. Briggs, "Memory access buffering in multiprocessors", in 13th International Symposium on Computer Architecture, pp. 434-442, June 1986.

    [Ghar90] K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. Hennessy, "Memory consistency and event ordering in scalable shared-memory multiprocessors", Computer Architecture News, 18(2A), pp. 15-26, June 1990.

    [Good89] J. R. Goodman, "Cache consistency and sequential consistency", Technical Report 61, SCI Committee, March 1989.

    [Pear00] A. Pears, "The Forgetful Cache Consistency Protocol", Proc.