๐”– Bobbio Scriptorium
โœฆ   LIBER   โœฆ

Finding frequent items in data streams

โœ Scribed by Moses Charikar; Kevin Chen; Martin Farach-Colton


Publisher
Elsevier Science
Year
2004
Tongue
English
Weight
232 KB
Volume
312
Category
Article
ISSN
0304-3975

No coin nor oath required. For personal study only.

โœฆ Synopsis


We present a 1-pass algorithm for estimating the most frequent items in a data stream using limited storage space. Our method relies on a data structure called a COUNT SKETCH, which allows us to reliably estimate the frequencies of frequent items in the stream. Our algorithm achieves better space bounds than the previously known best algorithms for this problem for several natural distributions on the item frequencies. In addition, our algorithm leads directly to a 2-pass algorithm for the problem of estimating the items with the largest (absolute) change in frequency between two data streams. To our knowledge, this latter problem has not been previously studied in the literature.


๐Ÿ“œ SIMILAR VOLUMES


Finding frequent items in parallel
โœ Massimo Cafaro; Piergiulio Tempesta ๐Ÿ“‚ Article ๐Ÿ“… 2011 ๐Ÿ› John Wiley and Sons ๐ŸŒ English โš– 150 KB
Mining evolving data streams for frequen
โœ Pierre-Alain Laur; Richard Nock; Jean-Emile Symphor; Pascal Poncelet ๐Ÿ“‚ Article ๐Ÿ“… 2007 ๐Ÿ› Elsevier Science ๐ŸŒ English โš– 331 KB