✦ LIBER ✦

Optimal aggregation algorithms for middleware

✍ Scribed by Ronald Fagin; Amnon Lotem; Moni Naor

Publisher: Elsevier Science
Year: 2003
Tongue: English
Weight: 427 KB
Volume: 66
Category: Article
ISSN: 0022-0000
DOI: 10.1016/s0022-0000(03)00026-6

✦ Synopsis

Assume that each object in a database has m grades, or scores, one for each of m attributes. For example, an object can have a color grade that tells how red it is and a shape grade that tells how round it is. For each attribute, there is a sorted list, which lists each object and its grade under that attribute, sorted by grade (highest grade first). Each object is assigned an overall grade that is obtained by combining the attribute grades using a fixed monotone aggregation function, or combining rule, such as min or average. To determine the top k objects, that is, k objects with the highest overall grades, the naive algorithm must access every object in the database, to find its grade under each attribute. Fagin has given an algorithm ("Fagin's Algorithm", or FA) that is much more efficient. For some monotone aggregation functions, FA is optimal with high probability in the worst case. We analyze an elegant and remarkably simple algorithm ("the threshold algorithm", or TA) that is optimal in a much stronger sense than FA. We show that TA is essentially optimal, not just for some monotone aggregation functions, but for all of them, and not just in a high-probability worst-case sense, but over every database. Unlike FA, which requires large buffers (whose size may grow unboundedly as the database size grows), TA requires only a small, constant-size buffer. TA allows early stopping, which yields, in a precise sense, an approximate version of the top k answers. We distinguish two types of access: sorted access (where the middleware system obtains the grade of an object in some sorted list by proceeding through the list sequentially from the top), and random access (where the middleware system requests the grade of an object in a list, and obtains it in one step). We consider the scenarios where random access is either impossible, or expensive relative to sorted access, and provide algorithms that are essentially optimal for these cases as well.
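
The synopsis names the threshold algorithm without listing its steps. The following is a minimal Python sketch of TA as it is commonly described: read the sorted lists in parallel, look up each newly seen object's remaining grades by random access, and stop once k objects score at least the threshold, which is the aggregate of the last grades seen under sorted access. The function name `threshold_top_k`, the parameters `lists` and `agg`, and the use of in-memory dictionaries in place of a real middleware interface are all assumptions made for illustration. The sketch also assumes every object appears in every list.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple

Grades = Dict[str, float]  # hypothetical stand-in: object id -> grade for one attribute


def threshold_top_k(
    lists: Sequence[Grades],
    agg: Callable[[Sequence[float]], float],
    k: int,
) -> List[Tuple[float, str]]:
    """Return k (overall grade, object) pairs, best first.

    `agg` must be monotone (e.g. min or an average) for the stopping
    rule to be valid.
    """
    # Simulate sorted access: each list ordered by grade, highest first.
    sorted_lists = [sorted(g.items(), key=lambda kv: -kv[1]) for g in lists]
    top: List[Tuple[float, str]] = []  # min-heap, never holds more than k entries

    for pos in range(len(sorted_lists[0])):
        last_seen = []
        for entries in sorted_lists:
            obj, grade = entries[pos]  # one sorted access
            last_seen.append(grade)
            if any(o == obj for _, o in top):
                continue  # already among the current best k
            # Random access to every list to get the object's full set of grades.
            overall = agg([g[obj] for g in lists])
            if len(top) < k:
                heapq.heappush(top, (overall, obj))
            elif overall > top[0][0]:
                heapq.heapreplace(top, (overall, obj))
        # Threshold: the best overall grade any unseen object could still have.
        threshold = agg(last_seen)
        if len(top) == k and top[0][0] >= threshold:
            break  # no unseen object can beat the current top k
    return sorted(top, reverse=True)


# Made-up grades for the color/shape example, combined with min.
color = {"a": 0.9, "b": 0.8, "c": 0.3, "d": 0.1}
shape = {"a": 0.2, "b": 0.7, "c": 0.9, "d": 0.4}
print(threshold_top_k([color, shape], min, 2))
```

Only the k best objects found so far are kept in memory, in keeping with the small-buffer property stated in the synopsis. An object that was rejected earlier may be looked up again, which costs extra random accesses but does not change the result.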

📜 SIMILAR VOLUMES

Optimal choice of modes for aggregation

✍ C. Commault 📂 Article 📅 1981 🏛 Elsevier Science 🌐 English ⚖ 238 KB

Optimal Algorithms for Constrained Reconfigurable Meshes

✍ Bryan Beresford-Smith; Oliver Diessel; Hossam ElGindy 📂 Article 📅 1996 🏛 Elsevier Science 🌐 English ⚖ 220 KB

model the propagation delay on a bus-unit 1 by a constant, and to only permit the class of algorithms, denoted by \(A_k\), which configure bus components bound in size to at most k bus-units to run on the model. We give a detailed description of our reconfigurable mesh model in the following section.

Optimal Parallel Algorithms for Quadtree Problems

✍ S. Kasif 📂 Article 📅 1994 🏛 Elsevier Science ⚖ 449 KB

In this paper we describe optimal processor-time parallel algorithms for set operations such as union, intersection, and comparison on quadtrees. The algorithms presented in this paper run in \(O(\log N)\) time using \(N / \log N\) processors on a shared memory model of computation that allows conc

Optimal expected-time algorithms for merging

✍ Mai Thanh; V.S. Alagar; T.D. Bui 📂 Article 📅 1986 🏛 Elsevier Science 🌐 English ⚖ 732 KB

Optimal mutation probability for genetic algorithms

✍ R.N. Greenwell; J.E. Angus; M. Finck 📂 Article 📅 1995 🏛 Elsevier Science 🌐 English ⚖ 804 KB

Optimal schedules for monitoring anytime algorithms

✍ Lev Finkelstein; Shaul Markovitch 📂 Article 📅 2001 🏛 Elsevier Science 🌐 English ⚖ 378 KB

Monitoring anytime algorithms can significantly improve their performance. This work deals with the problem of off-line construction of monitoring schedules. We study a model where queries are submitted to the monitored process in order to detect satisfaction of a given goal predicate. The queries c