𝔖 Scriptorium
✦   LIBER   ✦

📁

Probabilistic Databases

✍ Scribed by Dan Suciu, Dan Olteanu, Christopher Ré, Christoph Koch


Publisher
Morgan & Claypool Publishers
Year
2011
Tongue
English
Leaves
182
Series
Synthesis Lectures on Data Management
Edition
1
Category
Library

⬇  Acquire This Volume

No coin nor oath required. For personal study only.

✦ Synopsis


Probabilistic databases are databases where the value of some attributes or the presence of some records are uncertain and known only with some probability. Applications in many areas such as information extraction, RFID and scientific data management, data cleaning, data integration, and financial risk assessment produce large volumes of uncertain data, which are best modeled and processed by a probabilistic database. This book presents the state of the art in representation formalisms and query processing techniques for probabilistic data. It starts by discussing the basic principles for representing large probabilistic databases, by decomposing them into tuple-independent tables, block-independent-disjoint tables, or U-databases. Then it discusses two classes of techniques for query evaluation on probabilistic databases. In extensional query evaluation, the entire probabilistic inference can be pushed into the database engine and, therefore, processed as effectively as the evaluation of standard SQL queries. The relational queries that can be evaluated this way are called safe queries. In intensional query evaluation, the probabilistic inference is performed over a propositional formula called lineage expression: every relational query can be evaluated this way, but the data complexity dramatically depends on the query being evaluated, and can be #P-hard. The book also discusses some advanced topics in probabilistic data management such as top-k query processing, sequential probabilistic databases, indexing and materialized views, and Monte Carlo databases. Table of Contents: Overview / Data and Query Model / The Query Evaluation Problem / Extensional Query Evaluation / Intensional Query Evaluation / Advanced Techniques

✦ Table of Contents


Preface: A Great Promise......Page 13
Acknowledgments......Page 17
Two Examples......Page 19
Possible Worlds Semantics......Page 23
Query Semantics......Page 24
Lineage......Page 25
Probabilistic Databases v.s. Graphical Models......Page 26
Safe Queries, Safe Query Plans, and the Dichotomy......Page 27
Applications of Probabilistic Databases......Page 28
Bibliographic and Historical Notes......Page 31
Background of the Relational Data Model......Page 35
The Probabilistic Data Model......Page 37
Query Semantics......Page 39
Queries: Possible Answers Semantics......Page 40
C-Tables and PC-Tables......Page 41
Lineage......Page 45
Properties of a Representation System......Page 47
Simple Probabilistic Database Design......Page 48
Tuple-independent Databases......Page 49
BID Databases......Page 53
U-Databases......Page 55
Bibliographic and Historical Notes......Page 59
The Complexity of P()......Page 63
The Complexity of P(Q)......Page 66
Bibliographic and Historical Notes......Page 69
Extensional Query Evaluation......Page 71
Query Independence......Page 73
Six Simple Rules for P(Q)......Page 74
Examples of Unsafe (Intractable) Queries......Page 79
Examples of Safe (Tractable) Queries......Page 80
The Möbius Function......Page 83
Completeness......Page 87
Extensional Operators......Page 93
An Algorithm for Safe Plans......Page 98
Extensional Plans for Unsafe Queries......Page 99
BID Tables......Page 102
Deterministic Tables......Page 104
Bibliographic and Historical Notes......Page 105
Intensional Query Evaluation......Page 109
Five Simple Rules for P()......Page 110
An Algorithm for P()......Page 114
Read-Once Formulas......Page 116
Compiling P()......Page 117
d-DNNF......Page 118
OBDD......Page 119
A deterministic approximation algorithm......Page 120
Monte Carlo Approximation......Page 122
Query Compilation......Page 126
Conjunctive Queries without Self-Joins......Page 127
Unions of Conjunctive Queries......Page 128
Discussion......Page 137
Bibliographic and Historical Notes......Page 138
Top-k Query Answering......Page 141
Computing the Set Topk......Page 142
Sequential Probabilistic Databases......Page 147
The MCDB Data Model......Page 152
Query Evaluation in MCDB......Page 153
Indexes for Probabilistic data......Page 155
Materialized Views for Relational Probabilistic Databases......Page 158
Conclusion......Page 161
Bibliography......Page 163
Authors' Biographies......Page 181


📜 SIMILAR VOLUMES


Probabilistic Ranking Techniques in Rela
✍ Ihab F. Ilyas, Mohamed A. Soliman 📂 Library 📅 2011 🏛 Morgan & Claypool 🌐 English

Ranking queries are widely used in data exploration, data analysis and decision making scenarios. While most of the currently proposed ranking techniques focus on deterministic data, several emerging applications involve data that are imprecise or uncertain. Ranking uncertain data raises new challen

Advances in Probabilistic Databases for
✍ John Grant, Francesco Parisi (auth.), Zongmin Ma, Li Yan (eds.) 📂 Library 📅 2013 🏛 Springer-Verlag Berlin Heidelberg 🌐 English

<p><p>This book covers a fast-growing topic in great depth and focuses on the technologies and applications of probabilistic data management. It aims to provide a single account of current studies in probabilistic data management. The objective of the book is to provide the state of the art informat

Refactoring Databases: Evolutionary Data
✍ Scott W. Ambler, Pramodkumar J. Sadalage 📂 Library 📅 2006 🏛 Addison-Wesley Professional 🌐 English

Refactoring has proven its value in a wide range of development projects-helping software professionals improve system designs, maintainability, extensibility, and performance. Now, for the first time, leading agile methodologist Scott Ambler and renowned consultant Pramodkumar Sadalage introduce po