[Historical Document]

About Mariposa

The Mariposa distributed database management system is an ongoing research project at the University of California at Berkeley. Mariposa addresses fundamental problems in the standard approach to distributed data management. We believe that the underlying assumptions traditionally made while designing distributed data managers do not apply to today's wide-area network (WAN) environments.

To date, distributed database management systems have been designed for local-area networks (LANs) with few servers operating within one administrative domain, such as one company or one department within a company. Furthermore, these systems assume uniformity of all processors and network connections within the system. Data movement in these systems is a very heavyweight operation and is performed manually by a database administrator. The explosive growth of distributed computing, illustrated by the World Wide Web, dictates an entirely different set of assumptions.

Mariposa allows DBMSs that are far apart and under different administrative domains to work together to process queries. Furthermore, we have introduced an economic paradigm in which processing sites buy and sell data and query processing services. Not only does this approach reflect the emerging reality of a commercialized Internet, but it has also allowed us to address many of the problems inherent in designing a wide-area distributed DBMS. Mariposa has been designed with the following principles in mind:

  • Scalability to a large number of cooperating sites. In a WAN environment, there may be a large number of sites. Our goal is to scale to 10,000 servers.
  • Local autonomy. Each site must have control over its resources. This includes which objects to store and which queries to run. Query and data allocation cannot be done by a central, authoritarian query optimizer.
  • Data mobility. It should be easy and efficient to change the home of an object. Preferably, the object should remain available during movement.
  • No global synchronization. Updates and schema changes should not force a site to synchronize with all other sites. Otherwise, many common operations will have exceptionally poor response time.
  • Easily configurable policies. It should be easy for a local database administrator to change the behavior of a Mariposa site. A Mariposa system should respond gracefully to changes in user activity and data access patterns to maintain low response time and high system throughput.
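
To make the economic paradigm and the local-autonomy principle more concrete, here is a minimal sketch of sites bidding on a query. It is an illustration only, not Mariposa's actual protocol: the Site and Bid classes, the choose_bid function, the site names, the per-table pricing, and the notion of a price budget are all hypothetical names and assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bid:
    site: str
    price: float  # hypothetical cost units
    delay: float  # hypothetical estimated response time in seconds


class Site:
    """A hypothetical processing site that sets its own pricing policy."""

    def __init__(self, name, stored_tables, price_per_table, delay):
        self.name = name
        self.stored_tables = set(stored_tables)
        self.price_per_table = price_per_table
        self.delay = delay

    def bid(self, tables) -> Optional[Bid]:
        # Local autonomy: each site decides for itself whether to offer
        # a bid. Here it declines any query over tables it does not store.
        if not set(tables) <= self.stored_tables:
            return None
        return Bid(self.name, self.price_per_table * len(tables), self.delay)


def choose_bid(sites, tables, budget):
    """Collect bids and return the fastest one within budget, or None."""
    bids = [b for s in sites if (b := s.bid(tables)) is not None]
    affordable = [b for b in bids if b.price <= budget]
    # Prefer the lowest delay; break ties on the lower price.
    return min(affordable, key=lambda b: (b.delay, b.price), default=None)


sites = [
    Site("berkeley", {"emp", "dept"}, price_per_table=5.0, delay=2.0),
    Site("santa_cruz", {"emp", "dept"}, price_per_table=2.0, delay=6.0),
    Site("san_diego", {"emp"}, price_per_table=1.0, delay=1.0),
]
winner = choose_bid(sites, ["emp", "dept"], budget=8.0)
```

In this sketch no central optimizer assigns work: the buyer only compares the bids that sites volunteer, and a local administrator could change a site's behavior simply by adjusting its price or the tables it chooses to store.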

Modified: $Date: 1999/02/03 01:49:30 $ by $Author: aoki $