Big Data Analytics: A Hands-On Approach

โœ Scribed by Arshdeep Bahga, Vijay Madisetti


Publisher: Arshdeep Bahga & Vijay Madisetti
Year: 2019
Tongue: English
Leaves: 542
Category: Library


✦ Synopsis


The book is organized into three main parts, comprising a total of twelve chapters. Part I provides an introduction to big data, applications of big data, and big data science and analytics patterns and architectures. A novel data science and analytics application system design methodology is proposed, and its realization through the use of open-source big data frameworks is described. This methodology describes big data analytics applications as realizations of the proposed Alpha, Beta, Gamma and Delta models, which comprise tools and frameworks for collecting and ingesting data from various sources into the big data analytics infrastructure, distributed filesystems and non-relational (NoSQL) databases for data storage, processing frameworks for batch and real-time analytics, serving databases, and web and visualization frameworks. This new methodology forms the pedagogical foundation of this book.
Part II introduces the reader to various tools and frameworks for big data analytics, and to the architectural and programming aspects of these frameworks as used in the proposed design methodology. We chose Python as the primary programming language for this book, although other languages may also be easily used within the big data stack described here. We describe tools and frameworks for data acquisition, including publish-subscribe messaging frameworks such as Apache Kafka and Amazon Kinesis, source-sink connectors such as Apache Flume, database connectors such as Apache Sqoop, messaging queues such as RabbitMQ, ZeroMQ, RestMQ and Amazon SQS, and custom REST-based and WebSocket-based connectors. The reader is introduced to the Hadoop Distributed File System (HDFS) and the HBase non-relational database. The batch analysis chapter provides an in-depth study of frameworks such as Hadoop-MapReduce, Pig, Oozie, Spark and Solr. The real-time analysis chapter focuses on the Apache Storm and Spark Streaming frameworks. In the chapter on interactive querying, we describe, with the help of examples, the use of frameworks and services such as Spark SQL, Hive, Amazon Redshift and Google BigQuery. The chapter on serving databases and web frameworks provides an introduction to popular relational and non-relational databases (such as MySQL, Amazon DynamoDB, Cassandra, and MongoDB) and the Django Python web framework.
Part III focuses on advanced topics in big data, including analytics algorithms and data visualization tools. The chapter on analytics algorithms introduces the reader to machine learning algorithms for clustering, classification, regression and recommendation systems, with examples using the Spark MLlib and H2O frameworks. The chapter on data visualization describes examples of creating various types of visualizations using frameworks such as Lightning, pygal and Seaborn.

✦ Table of Contents


1 Introduction to Big Data
2 Setting up Big Data Stack
3 Big Data Patterns
4 NoSQL
5 Data Acquisition
6 Big Data Storage
7 Batch Analysis
8 Real-time Analysis
9 Interactive Querying
10 Serving Databases & Web Frameworks
11 Analytics Algorithms
12 Data Visualization

✦ Subjects


Big Data


📜 SIMILAR VOLUMES


Big Data Analytics: A Social Network App
โœ Mrutyunjaya Panda, Aboul-Ella Hassanien, Ajith Abraham ๐Ÿ“‚ Library ๐Ÿ“… 2018 ๐Ÿ› CRC Press ๐ŸŒ English

Social networking has increased drastically in recent years, resulting in an increased amount of data being created daily. Furthermore, diversity of issues and complexity of the social networks pose a challenge in social network mining. Traditional algorithm software cannot deal with such complex an

Healthcare Big Data Analytics: Computati
โœ Akash Kumar Bhoi (editor), Ranjit Panigrahi (editor), Victor Hugo C. de Albuque ๐Ÿ“‚ Library ๐Ÿ“… 2024 ๐Ÿ› Walter de Gruyter ๐ŸŒ English

<p><span>This book highlights how optimized big data applications can be used for patient monitoring and clinical diagnosis. In fact, IoT-based applications are data-driven and mostly employ modern optimization techniques. The book also explores challenges, opportunities, and future research directi

Handbook On Big Data Analytics
โœ Tajunisha N., Sruthika P. ๐Ÿ“‚ Library ๐ŸŒ English

Amazon Digital Services LLC, 2016. โ€” 54 p. โ€” ASIN: B01DE10HAO<div class="bb-sep"></div>Today we live in the world of internet of things. With increased digitization there has been an unprecedented increase in the quantity and variety of data generated worldwide. The enterprise does not know what to

Handbook On Big Data Analytics
โœ Tajunisha N., Sruthika P. ๐Ÿ“‚ Library ๐ŸŒ English

Amazon Digital Services LLC, 2016. โ€” 61 p. โ€” ASIN: B01DE10HAO<div class="bb-sep"></div>Today we live in the world of internet of things. With increased digitization there has been an unprecedented increase in the quantity and variety of data generated worldwide. The enterprise does not know what to

Smart Grid using Big Data Analytics. A
โœ Robert C. Qiu, Paul Antonik ๐Ÿ“‚ Library ๐Ÿ“… 2017 ๐Ÿ› Wiley ๐ŸŒ English

This book is aimed at students in communications and signal processing who want to extend their skills in the energy area. It describes power systems and why these backgrounds are so useful to smart grid, wireless communications being very different to traditional wireline communications.

Data Storytelling and Visualization with
โœ Prachi Manoj Joshi, Parikshit Narendra Mahalle ๐Ÿ“‚ Library ๐Ÿ“… 2022 ๐Ÿ› CRC Press ๐ŸŒ English

<p><span>With the tremendous growth and availability of the data, this book covers understanding the data, while telling a story with visualization including basic concepts about the data, the relationship and the visualizations. All the technical details that include installation and building the d