Probabilistic databases are databases where the value of some attributes or the presence of some records is uncertain and known only with some probability. Applications in many areas such as information extraction, RFID and scientific data management, data cleaning, data integration, and financial
Probabilistic Ranking Techniques in Relational Databases
Scribed by Ihab F. Ilyas, Mohamed A. Soliman
- Publisher: Morgan & Claypool
- Year: 2011
- Tongue: English
- Leaves: 81
- Series: Synthesis Lectures on Data Management
- Edition: 1
- Category: Library
Synopsis
Ranking queries are widely used in data exploration, data analysis, and decision-making scenarios. While most currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challenges in query semantics and processing, making conventional methods inapplicable. Furthermore, the interplay between ranking and uncertainty models introduces new dimensions for ordering query results that do not exist in traditional settings. This lecture describes new formulations and processing techniques for ranking queries on uncertain data. The formulations are based on a marriage of traditional ranking semantics with possible worlds semantics under widely adopted uncertainty models. In particular, we focus on the impact of tuple-level and attribute-level uncertainty on the semantics and processing techniques of ranking queries. Under the tuple-level uncertainty model, we describe new processing techniques that leverage the capabilities of relational database systems to recognize and handle data uncertainty in score-based ranking. Under the attribute-level uncertainty model, we describe new probabilistic ranking models and a set of query evaluation algorithms, including sampling-based techniques. We also discuss supporting rank join queries on uncertain data, and we show how to extend current rank join methods to handle uncertainty in scoring attributes. Table of Contents: Introduction / Uncertainty Models / Query Semantics / Methodologies / Uncertain Rank Join / Conclusion
Table of Contents
Introduction
Tuple Level Uncertainty
Attribute Level Uncertainty
Challenges
State-of-the-art
Tuple Uncertainty Models
Attribute Uncertainty Models
Discrete Uncertain Scores
Continuous Uncertain Scores
Mode-based Semantics
Aggregation-based Semantics
Applications
UTop-Prefix Under Tuple Uncertainty
UTop-Prefix Under Attribute Uncertainty
Monte-Carlo Simulation
Computing UTop-Rank Query
Computing UTop-Prefix and UTop-Set Queries
Dynamic Programming
UTop-Rank Query under Independence
Generating Functions
Probabilistic Threshold
Typical Top-k Answers
Expected Ranks
Uncertain Rank Aggregation
Uncertain Rank Join Problem
Computing the Top-k Join Results
Join-aware Sampling
Incremental Ranking
MashRank Architecture
Information Extraction
Mashup Planning
Conclusion
Bibliography
Authors' Biographies
SIMILAR VOLUMES
In recent years, there has been an upsurge of interest in using techniques drawn from probability to tackle problems in analysis. These applications arise in subjects such as potential theory, harmonic analysis, singular integrals, and the study of analytic functions. This book presents a modern sur
This book covers a fast-growing topic in great depth and focuses on the technologies and applications of probabilistic data management. It aims to provide a single account of current studies in probabilistic data management. The objective of the book is to provide the state of the art informat