Data Analytics with Spark Using Python

✍ Scribed by Jeffrey Aven


Publisher: Addison-Wesley
Year: 2018
Tongue: English
Series: Addison-Wesley Data & Analytics Series
Category: Library


✦ Synopsis


Solve Data Analytics Problems with Spark, PySpark, and Related Open Source Tools

Spark is at the heart of today’s Big Data revolution, helping data professionals supercharge efficiency and performance in a wide range of data processing and analytics tasks. In this guide, Big Data expert Jeffrey Aven covers all you need to know to leverage Spark, together with its extensions, subprojects, and wider ecosystem.

Aven combines a language-agnostic introduction to foundational Spark concepts with extensive programming examples utilizing the popular and intuitive PySpark development environment. This guide’s focus on Python makes it widely accessible to large audiences of data professionals, analysts, and developers, even those with little Hadoop or Spark experience.

Aven’s broad coverage ranges from basic to advanced Spark programming, and Spark SQL to machine learning. You’ll learn how to efficiently manage all forms of data with Spark: streaming, structured, semi-structured, and unstructured. Throughout, concise topic overviews quickly get you up to speed, and extensive hands-on exercises prepare you to solve real problems.

Coverage includes:

  • Understand Spark’s evolving role in the Big Data and Hadoop ecosystems
  • Create Spark clusters using various deployment modes
  • Control and optimize the operation of Spark clusters and applications
  • Master Spark Core RDD API programming techniques
  • Extend, accelerate, and optimize Spark routines with advanced API platform constructs, including shared variables, RDD storage, and partitioning
  • Efficiently integrate Spark with both SQL and nonrelational data stores
  • Perform stream processing and messaging with Spark Streaming and Apache Kafka
  • Implement predictive modeling with SparkR and Spark MLlib

✦ Subjects


Computers; Databases; Data Mining; Web; General; Mathematics


📜 SIMILAR VOLUMES


Big Data Analytics with Spark: A Practit
✍ Mohammed Guller 📂 Library 📅 2015 🏛 Apress 🌐 English

Big Data Analytics with Spark is a step-by-step guide for learning Spark, a fast, general-purpose open-source cluster computing framework for large-scale data analysis. You will learn how to use Spark for different types of big data analytics projects, including batch, inter

Big Data Analytics with Spark: A Practit
✍ Mohammed Guller 📂 Library 📅 2016 🏛 Apress 🌐 English

This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX,

Advanced Analytics with PySpark: Pattern
✍ Akash Tandon, Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills 📂 Library 📅 2022 🏛 O'Reilly Media 🌐 English

The amount of data being generated today is staggering and growing. Apache Spark has emerged as the de facto tool to analyze big data and is now a critical part of the data science toolbox. Updated for Spark 3.0, this practical guide brings together Spark, statistical methods, and real-world dataset

IoT Data Analytics using Python: Learn h
✍ M. S. Hariharan 📂 Library 📅 2023 🏛 BPB Online 🌐 English

Python is a popular programming language for data analytics, and it is also well-suited for IoT Data Analytics. By leveraging Python's versatility and its rich ecosystem of libraries and tools, Data Analytics for IoT can unlock valuable insights, enable predictive capabilities, and optimize decision