𝔖 Bobbio Scriptorium
✦   LIBER   ✦

A note on approximation measures for multi-valued dependencies in relational databases

✍ Scribed by Chris Giannella; Edward Robertson


Publisher
Elsevier Science
Year
2003
Tongue
English
Weight
90 KB
Volume
85
Category
Article
ISSN
0020-0190

No coin nor oath required. For personal study only.

✦ Synopsis


We consider the problem of defining a normalized approximation measure for multi-valued dependencies in relational database theory. An approximation measure is a function mapping relation instances to real numbers. The number to which an instance is mapped, intuitively, describes the strength of the dependency in that instance. A normalized approximation measure for functional dependencies has been proposed previously: the minimum number of tuples that need be removed for the functional dependency to hold divided by the total number of tuples. This leads naturally to a normalized measure for multivalued dependencies: the minimum number of tuples that need be removed for the multi-valued dependency to hold divided by the total number of tuples.

The measure for functional dependencies can be computed efficiently, O(|r| log(|r|)) where |r| is the relation instance. However, we show that an efficient algorithm for computing the analogous measure for multi-valued dependencies is not likely to exist. A polynomial time algorithm for computing the measure would lead to a polynomial time algorithm for an NP-complete problem (proven by a reduction from the maximum edge biclique problem in graph theory). Hence, we argue that it is not a good measure. We propose an alternate measure based on the lossless join characterization of multi-valued dependencies. This measure is efficiently computable, O(|r| 2 ).