𝔖 Scriptorium
✦   LIBER   ✦


Hands-On Big Data Analytics with PySpark: Analyze large datasets and discover techniques for testing, immunizing, and parallelizing Spark jobs

✍ Scribed by Lai, Rudy; Potaczek, Bartlomiej


Publisher: Packt Publishing
Year: 2019
Tongue: English
Leaves: 182
Category: Library


✦ Synopsis


Use PySpark to easily crush messy data at-scale and discover proven techniques to create testable, immutable, and easily parallelizable Spark jobs

Key Features

  • Work with large amounts of agile data using distributed datasets and in-memory caching
  • Source data from all popular data hosting platforms, such as HDFS, Hive, JSON, and S3
  • Employ the easy-to-use PySpark API to deploy big data analytics for production

    Book Description

    Apache Spark is an open source parallel-processing framework that has been around for quite some time now. One of the many uses of Apache Spark is for data analytics applications across clustered computers. In this book, you will not only learn how to use Spark and the Python API to create high-performance analytics with big data, but also discover techniques for testing, immunizing, and parallelizing Spark jobs.

    You will learn how to source data from all popular data hosting platforms, including HDFS, Hive, JSON, and...

✦ Table of Contents

  • Installing Pyspark and Setting up Your Development Environment
  • Getting Your Big Data into the Spark Environment Using RDDs
  • Big Data Cleaning and Wrangling with Spark Notebooks
  • Aggregating and Summarizing Data into Useful Reports
  • Powerful Exploratory Data Analysis with MLlib
  • Putting Structure on Your Big Data with SparkSQL
  • Transformations and Actions
  • Immutable Design
  • Avoiding Shuffle and Reducing Operational Expenses
  • Saving Data in the Correct Format
  • Working with the Spark Key/Value API
  • Testing Apache Spark Jobs
  • Leveraging the Spark GraphX API


    📜 SIMILAR VOLUMES


    Big Data Analytics with Spark: A Practit
    ✍ Mohammed Guller 📂 Library 📅 2015 🏛 Apress 🌐 English

    Big Data Analytics with Spark is a step-by-step guide for learning Spark, which is an open-source fast and general-purpose cluster computing framework for large-scale data analysis. You will learn how to use Spark for different types of big data analytics projects, including batch, inter

    Big Data Analytics with Spark: A Practit
    ✍ Mohammed Guller 📂 Library 📅 2016 🏛 Apress 🌐 English

    This book is a step-by-step guide for learning how to use Spark for different types of big-data analytics projects, including batch, interactive, graph, and stream data analysis as well as machine learning. It covers Spark core and its add-on libraries, including Spark SQL, Spark Streaming, GraphX,

    Data Science and Big Data Analytics: Dis
    ✍ EMC Education Services 📂 Library 📅 2015 🏛 John Wiley & Sons 🌐 English

    Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities and methods and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to

    Data Science and Big Data Analytics: Dis
    ✍ EMC Education Services 📂 Library 📅 2015 🏛 Wiley 🌐 English

    Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities and methods and tools that Data Scientists use. The content focuses on concepts, principles and practical applications that are applicable to any industry and

    Practical Big Data Analytics: Hands-on t
    ✍ DasGupta, Nataraj 📂 Library 📅 2018 🏛 Packt Publishing Limited 🌐 English

    Get command of your organizational Big Data using the power of data science and analytics. Key Features: A perfect companion to boost your Big Data storing, processing, and analyzing skills to help you take informed business decisions. Work with the best tools such as Apache Hadoop, R, Python, and Spark for
