Adaptive Replication

The Internet and the World-Wide-Web are rapidly moving us towards a distributed, wholly interconnected information environment. In this environment an object will be accessed, i.e. read and written, from many locations that may be geographically distributed world-wide. In such an environment performance and cost will be of crucial importance, and replication is an approach for addressing both issues. For example, the Internet news is a gigabyte database highly replicated service, and it responds to queries in seconds; in contrast, many nonreplicated servers are much slower. However, massive replication can become very expensive when a large number of database updates occur, since these are usually propagated to many replicas. Caching is often used to address this problem of replication being sometimes advantageous and sometimes detrimental. Caching creates/deletes replicas at a client, as need dictates. Caching is a particular case of our proposed concept of adaptive replication.

Adaptive replication means that the number of copies of an object, the location of these copies, and which updates are propagated to each copy, change dynamically depending on a given cost function. Caching is a particular case of adaptive replication in the sense that it is usually analyzed in a client/server architecture, in a given consistency model, in order to improve performance at a client. Adaptive replication is a more general concept, that can be applied to richer architectures, for multiple consistency models, and for optimizing different cost functions. The precise interpretation of adaptive replication depends on the objective cost function. For example, it can improve performance in a distributed system for one objective function, and it can reduce access cost in a client-server system for another function.

The project has three main objectives. First, establish the fundamental principles of adaptive replication in distributed systems. Second, develop, implement and analyze a system for adaptive replication of objects. Ideally, by parameterizing the Adaptive Replication (AR) system with various cost functions, and availability and consistency requirements, we will produce different adaptive replication algorithms. In practice, we expect that deriving an algorithm for a particular environment will require more than specifying parameters to the AR system, but we will strive to minimize the specialized code. The third objective of the project is to build a simulation testbed in which the user defines the network, selects a replication algorithm, selects a cost model, and an access pattern at each computer. The system will enable comparison among the various adaptive and static algorithms. The project is directed by Ouri Wolfson.



Recent Papers