Recent work on semi-structured data has revitalized the interest in path queries, i.e., queries that ask for all pairs of objects in the database that are connected by a path conforming to a certain specification, in particular to a regular expression. Also, in semi-structured data, as well as in da
Algebraic rewritings for optimizing regular path queries
✍ Scribed by Gösta Grahne; Alex Thomo
- Publisher
- Elsevier Science
- Year
- 2003
- Tongue
- English
- Weight
- 189 KB
- Volume
- 296
- Category
- Article
- ISSN
- 0304-3975
No coin nor oath required. For personal study only.
✦ Synopsis
Rewriting queries using views is a powerful technique that has applications in query optimization, data integration, data warehousing, etc. Query rewriting in relational databases is by now rather well investigated. However, in the framework of semistructured data the problem of rewriting has received much less attention. In this paper we focus on extracting as much information as possible from algebraic rewritings for the purpose of optimizing regular path queries. The cases when we can ÿnd a complete exact rewriting of a query using a set a views are very "ideal". However, there is always information available in the views, even if this information is only partial. We introduce "lower" and "possibility" partial rewritings and provide algorithms for computing them. These rewritings are algebraic in their nature, i.e. we use only the algebraic view deÿnitions for computing the rewritings. We do not use any pairs (tuples) of objects for computing the rewritings. This fact makes them a main memory product, which can be used for reducing secondary memory and remote access. After the main memory algebraic computation of the rewritings there is a second phase, with secondary memory access, for deriving the pairs of objects in the query answer. We give two algorithms for utilizing the partial lower and partial possibility rewritings to decrease the number of secondary memory accesses.
📜 SIMILAR VOLUMES
There is a significant amount of interest in combining and extending database and information retrieval technologies to manage textual data. The challenge is becoming more relevant due to increased availability of documents in digital form. Document data has a natural hierarchical structure, which m