Adaptive Replication
The Internet and the World-Wide Web are
rapidly moving us toward a distributed, wholly interconnected
information environment. In this environment
an object will be accessed, i.e., read and written,
from many locations that may be geographically distributed
world-wide.
In such an environment performance and cost will be of crucial
importance, and replication is an approach
for addressing both issues.
For example, Internet news is a highly replicated,
gigabyte-scale database service, and it responds to queries in seconds;
in contrast, many nonreplicated servers are much slower.
However, massive replication can become very expensive
when a large number of database updates occur, since each update is
usually propagated to many replicas. Caching is often used
to address this tradeoff of replication being sometimes
advantageous and sometimes detrimental: it creates and deletes
replicas at a client as need dictates.
Caching is a particular case of our proposed concept of
adaptive replication.
Adaptive replication means that the number of copies of an
object, the location of these copies, and which updates are
propagated to each copy, change dynamically
depending on a given cost function.
Caching is a particular case of adaptive replication in the
sense that it is usually analyzed in a client/server
architecture,
under a given consistency model, in order to improve performance at
a client.
Adaptive replication is a more
general concept that can be applied to richer architectures,
to multiple consistency models, and to optimizing different
cost functions.
The precise interpretation of adaptive replication depends on
the objective cost function. For example, it can improve
performance in
a distributed system for one objective function, and
it can reduce access cost in a client-server system for another
function.
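To make the cost-driven decision concrete, the following sketch (with hypothetical per-access costs, not the project's actual algorithm) shows how a single site might decide whether holding a replica lowers its cost: a local copy saves remote reads, but every update performed elsewhere must then be propagated to it.

```python
# Illustrative sketch only: the cost constants and the decision rule
# are assumptions for this example, not the AR system's interface.

READ_LOCAL = 0          # assumed cost of reading a local replica
READ_REMOTE = 1         # assumed cost of reading from a remote replica
WRITE_PROPAGATION = 1   # assumed cost of receiving one propagated update

def keep_replica(reads: int, writes_elsewhere: int) -> bool:
    """Keep a local copy only while the reads it serves cheaply
    outweigh the updates that must be propagated to it."""
    cost_with_replica = (reads * READ_LOCAL
                         + writes_elsewhere * WRITE_PROPAGATION)
    cost_without_replica = reads * READ_REMOTE
    return cost_with_replica < cost_without_replica

# A read-dominated site keeps its replica...
assert keep_replica(reads=100, writes_elsewhere=10)
# ...while an update-dominated site drops it.
assert not keep_replica(reads=5, writes_elsewhere=50)
```

Re-evaluating this inequality as the observed read/write mix changes is what makes the replication scheme adaptive rather than static.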
The project has three main objectives.
The first is to establish the fundamental principles
of adaptive replication in distributed systems.
The second is to develop, implement, and analyze a
system for adaptive replication of objects.
Ideally, by parameterizing the Adaptive Replication (AR)
system with various cost functions
and with availability and consistency requirements,
we will produce different adaptive replication algorithms.
In practice, we expect that deriving an algorithm for a
particular environment will require
more than specifying parameters to the AR system,
but we will strive to minimize the specialized code.
The third objective of the project is to
build a simulation testbed in which the user defines the
network, selects a replication algorithm, selects a cost model,
and specifies an access pattern at each computer. The system will
enable comparison among the various adaptive and static
algorithms.
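A toy version of such a comparison might look as follows; the interface, the unit costs, and the crude adaptive rule are all hypothetical, chosen only to illustrate the kind of experiment the testbed would support, not its actual design.

```python
import random

def simulate(schedule, initial_replicas, adapt=False):
    """Replay a schedule of (site, op) accesses and accumulate cost.
    A remote read costs 1, and each update propagation costs 1.
    With adapt=True, a site caches a copy after a remote read, and a
    writer drops all other copies (a deliberately crude adaptive rule,
    for illustration only)."""
    replicas = set(initial_replicas)
    cost = 0
    for site, op in schedule:
        if op == "read":
            if site not in replicas:
                cost += 1                    # fetch from a remote replica
                if adapt:
                    replicas.add(site)       # keep a local copy
        else:  # write
            cost += len(replicas - {site})   # propagate to other copies
            if adapt:
                replicas = {site}            # drop now-stale remote copies
    return cost

random.seed(0)
# A read-heavy workload: reads concentrated at sites 0-2, writes at site 3.
schedule = ([(random.randrange(3), "read") for _ in range(90)]
            + [(3, "write") for _ in range(10)])
random.shuffle(schedule)

static_cost = simulate(schedule, {3})               # one fixed copy at site 3
adaptive_cost = simulate(schedule, {3}, adapt=True) # copies migrate to readers
assert adaptive_cost < static_cost
```

Even this crude rule beats the static single-copy scheme on a read-heavy workload; the testbed's purpose is to run exactly this kind of head-to-head comparison over user-defined networks, cost models, and access patterns.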
The project is directed by
Ouri Wolfson.
Recent Papers
- "An Adaptive Data Replication Algorithm",
ACM Transactions on Database Systems (TODS),
Vol. 22(2), June 1997, pp. 255-314.
- "Infrastructure and Cost Models for Digital Libraries",
ACM Computing Surveys (electronic version),
Vol. 28A, Dec. 1996.
- "Strategic Directions in Electronic Commerce and
Digital Libraries: Towards a Digital Agora",
ACM Computing Surveys,
Vol. 28(4), Dec. 1996, pp. 818-835.
- "Towards a Theory of Cost Management for Digital Libraries",
ACM Transactions on Database Systems (TODS),
Vol. 23(4), Dec. 1998, pp. 411-452.
- "Semantic Multicast: Intelligently Sharing Collaborative Sessions",
ACM Computing Surveys, Vol. 31(2es), 1999.
- "An Architecture for Consumer-Oriented Online Database Services",
Proceedings of RIDE-NDS'96, the 6th International
Workshop on Research Issues in Data Engineering:
Interoperability of Nontraditional Database Systems,
New Orleans, LA, Feb. 1996.
- "An Algorithm for Dynamic Data Allocation in Distributed Systems",
Information Processing Letters (IPL),
Vol. 53, 1995, pp. 113-119.
- "A Competitive Dynamic Data Replication Algorithm",
Proceedings of the 9th International Conference on Data
Engineering (ICDE), Vienna, Austria, April 1993, pp. 310-317.
- "Competitive Analysis of Caching in Distributed Databases",
IEEE Transactions on Parallel and Distributed Systems,
Vol. 9(4), Apr. 1998, pp. 391-409.
- "Divergence Caching in Client-Server Architectures",
Proceedings of the 3rd International Conference
on Parallel and Distributed Information Systems (PDIS),
Austin, TX, Sept. 1994, pp. 131-139.
- "An Algorithm for Dynamic Data Distribution",
Proceedings of the 2nd Workshop on the Management of
Replicated Data (WMRD-II), Monterey, CA, Nov. 1992.
- "Distributed Algorithms for Adaptive Replication of Data",
Proceedings of the 11th ACM SIGACT-SIGMOD-SIGART Symposium on
Principles of Database Systems (PODS), San Diego, CA, June 1992.
Contact me at
my last name@cs.uic.edu