Practical Real-time Data Processing and Analytics: Distributed Computing and Event Processing using Apache Spark, Flink, Storm, and Kafka

โœ Scribed by Shilpi Saxena, Saurabh Gupta


Publisher: Packt Publishing - ebooks Account
Year: 2017
Tongue: English
Leaves: 422
Category: Library

✦ Synopsis


A practical guide to help you tackle different real-time data processing and analytics problems using the best tools for each scenario

About This Book

  • Learn about the various challenges in real-time data processing and use the right tools to overcome them
  • This book covers popular tools and frameworks such as Spark, Flink, and Apache Storm to solve all your distributed processing problems
  • A practical guide filled with examples, tips, and tricks to help you perform efficient Big Data processing in real-time

Who This Book Is For

If you are a Java developer who would like to be equipped with all the tools required to devise an end-to-end practical solution on real-time data streaming, then this book is for you. Basic knowledge of real-time processing would be helpful, and knowing the fundamentals of Maven, Shell, and Eclipse would be great.

What You Will Learn

  • Get an introduction to the established real-time stack
  • Understand the key integration of all the components
  • Get a thorough understanding of the basic building blocks for real-time solution designing
  • Add search and visualization capabilities to your real-time solution
  • Get conceptually and practically acquainted with real-time analytics
  • Be well equipped to apply the knowledge and create your own solutions

In Detail

With the rise of Big Data, there is an increasing need to process large amounts of data continuously, with a shorter turnaround time. Real-time data processing involves continuous input, processing and output of data, with the condition that the time required for processing is as short as possible.

This book covers the majority of the existing and evolving open source technology stack for real-time processing and analytics. You will get to know about all the real-time solution aspects, from the source to the presentation to persistence. Through this practical book, you'll be equipped with a clear understanding of how to solve challenges on your own.

We'll cover topics such as how to set up components, basic executions, integrations, advanced use cases, alerts, and monitoring. You'll be exposed to the popular tools used in real-time processing today such as Apache Spark, Apache Flink, and Storm. Finally, you will put your knowledge to practical use by implementing all of the techniques in the form of a practical, real-world use case.

By the end of this book, you will have a solid understanding of all the aspects of real-time data processing and analytics, and will know how to deploy the solutions in production environments in the best possible manner.

Style and Approach

In this practical guide to real-time analytics, each chapter begins with a basic high-level concept of the topic, followed by a practical, hands-on implementation of each concept, so you can see how it works and executes. The book is written in a DIY style, with plenty of practical use cases, well-explained code examples, and relevant screenshots and diagrams.

✦ Subjects


Data Modeling & Design; Databases & Big Data; Computers & Technology; Data Mining; Databases & Big Data; Computers & Technology; Data Processing; Databases & Big Data; Computers & Technology


📜 SIMILAR VOLUMES


Apache Spark 2: Data Processing and Real
โœ Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, Meenakshi Raj ๐Ÿ“‚ Library ๐Ÿ“… 2018 ๐Ÿ› Packt Publishing ๐ŸŒ English

Build efficient data flow and machine learning programs with this flexible, multi-functional open-source cluster-computing framework. Key Features: Master the art of real-time big data processing and machine learning; Explore a wide range of use-cases to analyze

Storm Blueprints: Patterns for Distribut
โœ P. Taylor Goetz, Brian ๐Ÿ“‚ Library ๐Ÿ“… 2014 ๐Ÿ› Packt Publishing ๐ŸŒ English

Storm is the most popular framework for real-time stream processing. Storm provides the fundamental primitives and guarantees required for fault-tolerant distributed computing in high-volume, mission critical applications. It is both an integration technology as well as a data flow and control mecha

Fast Data Processing with Spark, 2nd Edi
โœ Krishna Sankar, Holden Karau ๐Ÿ“‚ Library ๐Ÿ“… 2015 ๐Ÿ› Packt Publishing ๐ŸŒ English

Spark is a framework used for writing fast, distributed programs. Spark solves similar problems as Hadoop MapReduce does, but with a fast in-memory approach and a clean functional style API. With its ability to integrate with Hadoop and built-in tools for interactive query analysis (Spark SQL), larg

Apache Kafka Quick Start Guide: Leverage
โœ Raul Estrada ๐Ÿ“‚ Library ๐Ÿ“… 2018 ๐Ÿ› Packt Publishing ๐ŸŒ English

Process large volumes of data in real-time while building high performance and robust data stream processing pipeline using the latest Apache Kafka 2.0. Key Features: Solve practical large data and processing challenges with Kafka; Tackle data processing challenges like late

Kafka: Real-Time Data and Stream Process
โœ Narkhede, Neha;Palino, Todd;Shapira, Gwen ๐Ÿ“‚ Library ๐Ÿ“… 2017 ๐Ÿ› O'Reilly Media, Incorporated ๐ŸŒ English

Table of Contents; Foreword; Preface; Who Should Read This Book; Conventions Used in This Book; Using Code Examples; O'Reilly Safari; How to Contact Us; Acknowledgments; Chapter 1. Meet Kafka; Publish/Subscribe Messaging; How It Starts; Individual Queue Systems; Enter Kafka; Messages and Batches; Sc