Efficient query evaluation on probabilistic databases
โ Scribed by Nilesh Dalvi; Dan Suciu
- Book ID
- 106234658
- Publisher
- Springer-Verlag
- Year
- 2006
- Tongue
- English
- Weight
- 684 KB
- Volume
- 16
- Category
- Article
- ISSN
- 1066-8888
No coin nor oath required. For personal study only.
โฆ Synopsis
We describe a framework for supporting arbitrarily complex SQL queries with "uncertain" predicates. The query semantics is based on a probabilistic model and the results are ranked, much like in Information Retrieval. Our main focus is query evaluation. We describe an optimization algorithm that can compute efficiently most queries. We show, however, that the data complexity of some queries is #P-complete, which implies that these queries do not admit any efficient evaluation methods. For these queries we describe both an approximation algorithm and a Monte-Carlo simulation algorithm.
Introduction
Databases and Information Retrieval [3] have taken two philosophically different approaches to queries. In databases SQL queries have a rich structure and a precise semantics. This makes it possible for users to formulate complex queries and for systems to apply complex optimizations, but users need to have a pretty detailed knowledge of the database in order to formulate queries. For example, a single misspelling of a constant in the WHERE clause leads to an empty set of answers, frustrating casual users. By contrast, a query in Information Retrieval (IR) is just a set of keywords and is easy for casual users to formulate. IR queries offer two important features that are missing in databases: the results are ranked and the matches may be uncertain, i.e. the answer may include documents that do not match all the keywords in the query. 1 While several proposals exist for extending SQL with uncertain matches and ranked results [1,20,15], they are either restricted to a single table, or, when they handle join queries, adopt an ad-hoc semantics.
๐ SIMILAR VOLUMES