The problem of query containment is fundamental to many aspects of database systems, including query optimization, determining independence of queries from updates, and rewriting queries using views. In the data-integration framework, however, the standard notion of query containment does not suffic
Recursive query plans for data integration
Scribed by Oliver M. Duschka; Michael R. Genesereth; Alon Y. Levy
- Publisher
- Elsevier Science
- Year
- 2000
- Language
- English
- File size
- 204 KB
- Volume
- 43
- Category
- Article
- ISSN
- 0743-1066
Synopsis
Generating query-answering plans for data integration systems requires translating a user query, formulated in terms of a mediated schema, into a query that uses relations actually stored in data sources. Previous solutions to the translation problem produced sets of conjunctive plans, and were therefore limited in their ability to handle recursive queries and to exploit data sources with binding-pattern limitations and functional dependencies that are known to hold in the mediated schema. As a result, these plans were incomplete with respect to sources encountered in practice (i.e., they produced only a subset of the possible answers). We describe the novel class of recursive query-answering plans, which enables us to settle three open problems. First, we describe an algorithm for finding a query plan that produces the maximal set of answers from the sources for arbitrary recursive queries. Second, we extend this algorithm to use the presence of functional and full dependencies in the mediated schema. Third, we describe an algorithm for finding the maximal query plan in the presence of binding-pattern restrictions in the sources. In all three cases, recursive plans are necessary in order to obtain a maximal query plan.
SIMILAR VOLUMES
In this paper we present the design of two essential components for the spatial querying system we have been developing. The overall system architecture utilizes multiple levels of agents to process external sources of spatial data. Upon a user query, agents are spawned to mine various web sources,
In distributed environments, replication of data provides improved availability, isolation between workloads with different characteristics, and improved performance through local access to data. The "real data" is server resident and by "local data" we refer to cached client data. We examine which