Real-time analytics: techniques to analyze and visualize streaming data
Scribed by Ellis, Byron
- Publisher: Wiley
- Year: 2014
- Tongue: English
- Leaves: 435
- Category: Library
Synopsis
Construct a robust end-to-end solution for analyzing and visualizing streaming data
Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts the technologies needed to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpace traditional batch-based analysis platforms.
The author is among the very few leading experts in the field. He has a prestigious background in research, development, analytics, real-time visualization, and Big Data streaming and is uniquely qualified to help you explore this revolutionary field. Moving from a description of the overall analytic architecture of real-time analytics to using specific tools to obtain targeted results, Real-Time Analytics leverages open source and modern commercial tools to construct robust, efficient systems that can provide real-time analysis in a cost-effective manner. The book includes:
- A deep discussion of streaming data systems and architectures
- Instructions for analyzing, storing, and delivering streaming data
- Tips on aggregating data and working with sets
- Information on data warehousing options and techniques
Real-Time Analytics includes in-depth case studies for website analytics, Big Data, visualizing streaming and mobile data, and mining and visualizing operational data flows. The book's "recipe" layout lets readers quickly learn and implement different techniques. All of the code examples presented in the book, along with their related data sets, are available on the companion website.
Table of Contents
Cover......Page 1
Title Page......Page 3
Copyright......Page 4
Contents......Page 11
Chapter 1 Introduction to Streaming Data......Page 23
Sources of Streaming Data......Page 24
Web Analytics......Page 25
Online Advertising......Page 26
Mobile Data and the Internet of Things......Page 27
Always On, Always Flowing......Page 29
Loosely Structured......Page 30
High-Cardinality Storage......Page 31
Conclusion......Page 32
Part I Streaming Analytics Architecture......Page 35
Chapter 2 Designing Real-Time Streaming Architectures......Page 37
Collection......Page 38
Data Flow......Page 39
Processing......Page 41
Storage......Page 42
Delivery......Page 44
High Availability......Page 46
Low Latency......Page 47
Horizontal Scalability......Page 48
Java......Page 49
Scala and Clojure......Page 50
JavaScript......Page 51
A Real-Time Architecture Checklist......Page 52
Data Flow......Page 53
Storage......Page 54
Delivery......Page 55
Conclusion......Page 56
Chapter 3 Service Configuration and Coordination......Page 57
Unreliable Network Connections......Page 58
Clock Synchronization......Page 59
Consensus in an Unreliable World......Page 60
The znode......Page 61
Maintaining Consistency......Page 63
Creating a ZooKeeper Cluster......Page 64
ZooKeeper's Native Java Client......Page 69
The Curator Client......Page 78
Curator Recipes......Page 85
Conclusion......Page 92
Chapter 4 Data-Flow Management in Streaming Analysis......Page 93
At Least Once Delivery......Page 94
The "n+1" Problem......Page 95
Design and Implementation......Page 96
Configuring a Kafka Environment......Page 102
Interacting with Kafka Brokers......Page 111
The Flume Agent......Page 114
Configuring the Agent......Page 116
Channel Selectors......Page 117
Flume Sources......Page 120
Flume Sinks......Page 129
Flume Channels......Page 132
Flume Interceptors......Page 134
Running Flume Agents......Page 136
Conclusion......Page 137
Chapter 5 Processing Streaming Data......Page 139
Coordination......Page 140
Processing Data with Storm......Page 141
Components of a Storm Cluster......Page 142
Configuring a Storm Cluster......Page 144
Distributed Clusters......Page 145
Local Clusters......Page 148
Storm Topologies......Page 149
Implementing Bolts......Page 152
Implementing and Using Spouts......Page 158
Distributed Remote Procedure Calls......Page 164
Trident: The Storm DSL......Page 166
Apache YARN......Page 173
Getting Started with YARN and Samza......Page 175
Samza Jobs......Page 179
Conclusion......Page 188
Chapter 6 Storing Streaming Data......Page 189
Consistent Hashing......Page 190
"NoSQL" Storage Systems......Page 191
Redis......Page 192
MongoDB......Page 202
Cassandra......Page 225
Choosing a Technology......Page 237
Distributed Hash Table Stores......Page 238
Warehousing......Page 239
Hadoop as ETL and Warehouse......Page 240
Lambda Architectures......Page 245
Conclusion......Page 246
Part II Analysis and Visualization......Page 247
Chapter 7 Delivering Streaming Metrics......Page 249
Streaming Web Applications......Page 250
Working with Node......Page 251
Managing a Node Project with NPM......Page 253
Developing Node Web Applications......Page 257
A Basic Streaming Dashboard......Page 260
Adding Streaming to Web Applications......Page 264
HTML5 Canvas and Inline SVG......Page 276
Data-Driven Documents: D3.js......Page 284
High-Level Tools......Page 294
Mobile Streaming Applications......Page 299
Conclusion......Page 301
Chapter 8 Exact Aggregation and Delivery......Page 303
Timed Counting and Summation......Page 307
Counting in Bolts......Page 308
Counting with Trident......Page 310
Counting in Samza......Page 311
Quantization Framework......Page 312
Stochastic Optimization......Page 318
Delivering Time-Series Data......Page 319
Strip Charts with D3.js......Page 320
High-Speed Canvas Charts......Page 321
Horizon Charts......Page 323
Conclusion......Page 325
Chapter 9 Statistical Approximation of Streaming Data......Page 327
Numerical Libraries......Page 328
Probabilities and Distributions......Page 329
Expectation and Variance......Page 331
Discrete Distributions......Page 332
Continuous Distributions......Page 334
Joint Distributions......Page 337
Inferring Parameters......Page 338
The Delta Method......Page 339
Random Number Generation......Page 341
Generating Specific Distributions......Page 343
Sampling Procedures......Page 346
Sampling from a Fixed Population......Page 347
Sampling from a Streaming Population......Page 348
Biased Streaming Sampling......Page 349
Conclusion......Page 351
Chapter 10 Approximating Streaming Data with Sketching......Page 353
Hash Functions......Page 354
Working with Sets......Page 358
The Algorithm......Page 360
Choosing a Filter Size......Page 362
Unions and Intersections......Page 363
Cardinality Estimation......Page 364
Interesting Variations......Page 366
Distinct Value Sketches......Page 369
The Min-Count Algorithm......Page 370
The HyperLogLog Algorithm......Page 373
Point Queries......Page 378
Count-Min Sketch Implementation......Page 379
Top-K and "Heavy Hitters"......Page 380
Range and Quantile Queries......Page 382
Conclusion......Page 386
Chapter 11 Beyond Aggregation......Page 389
Models for Real-Time Data......Page 390
Simple Time-Series Models......Page 391
Linear Models......Page 395
Logistic Regression......Page 400
Neural Network Models......Page 402
Forecasting with Models......Page 411
Exponential Smoothing Methods......Page 412
Regression Methods......Page 415
Neural Network Methods......Page 416
Monitoring......Page 418
Outlier Detection......Page 419
Change Detection......Page 421
Real-Time Optimization......Page 422
Conclusion......Page 424
Index......Page 425
EULA......Page 435
Subjects
Computer Science;Programming;Nonfiction
SIMILAR VOLUMES
Construct a robust end-to-end solution for analyzing and visualizing streaming data. Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts technologies to b
Construct a robust end-to-end solution for analyzing and visualizing streaming data. Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts technologies to build an effecti
Construct a robust end-to-end solution for analyzing and visualizing streaming data. Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts technologies to build an effecti
Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics - expert Byron Ellis teaches data analysts technologies to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpa