Data Stream Management
Authored by Lukasz Golab, M. Tamer Özsu
- Publisher
- Morgan & Claypool
- Year
- 2010
- Language
- English
- Pages
- 73
- Series
- Synthesis Lectures on Data Management
- Category
- Library
Synopsis
Many applications process high volumes of streaming data, among them Internet traffic analysis, financial tickers, and transaction log mining. In general, a data stream is an unbounded data set that is produced incrementally over time, rather than being available in full before its processing begins. In this lecture, we give an overview of recent research in stream processing, ranging from answering simple queries on high-speed streams to loading real-time data feeds into a streaming warehouse for off-line analysis. We discuss two types of systems for end-to-end stream processing: Data Stream Management Systems (DSMSs) and Streaming Data Warehouses (SDWs). A traditional database management system typically processes a stream of ad-hoc queries over relatively static data. In contrast, a DSMS evaluates static (long-running) queries on streaming data, making a single pass over the data and using limited working memory. In the first part of this lecture, we discuss research problems in DSMSs, such as continuous query languages, non-blocking query operators that continually react to new data, and continuous query optimization. The second part covers SDWs, which combine the real-time response of a DSMS, achieved by loading new data as soon as they arrive, with a data warehouse's ability to manage terabytes of historical data on secondary storage. Table of Contents: Introduction / Data Stream Management Systems / Streaming Data Warehouses / Conclusions
Table of Contents
Overview of Data Stream Management
Organization
Stream Models
Stream Windows
Semantics and Algebras
Operators
Continuous Queries as Views
Semantics of Relations in Continuous Queries
Streams, Relations and Windows
User-Defined Functions
Summary
Scheduling
Heartbeats and Punctuations
Processing Queries-As-Views and Negative Tuples
Static Analysis and Query Rewriting
Operator Optimization - Join
Operator Optimization - Aggregation
Multi-Query Optimization
Load Shedding and Approximation
Adaptive Query Optimization
Distributed Query Optimization
Data Extraction, Transformation and Loading
Update Propagation
Data Expiration
Update Scheduling
Querying a Streaming Data Warehouse
Conclusions
Bibliography
Authors' Biographies
SIMILAR VOLUMES
Researchers in data management have recently recognized the importance of a new class of data-intensive applications that requires managing data streams, i.e., data composed of continuous, real-time sequence of items. Streaming applications pose new and interesting challenges for data management sys
This volume focuses on the theory and practice of data stream management, and the novel challenges this emerging domain poses for data-management algorithms, systems, and applications. The collection of chapters, contributed by authorities in the field, offers a comprehensive introducti
Researchers in data management have recently recognized the importance of a new class of data-intensive applications that requires managing data streams, i.e., data composed of continuous, real-time sequence of items. Streaming applications pose new and interesting challenges for data manag
While traditional databases excel at complex queries over historical data, they are inherently pull-based and therefore ill-equipped to push new information to clients. Systems for data stream management and processing, on the other hand, are natively push-oriented and thus facilitate reactive be
This book aims to provide some insights into recently developed bio-inspired algorithms within recent emerging trends of fog computing, sentiment analysis, and data streaming as well as to provide a more comprehensive approach to the big data management from pre-processing to analytics to visu