Azure Databricks Cookbook: Accelerate and scale real-time analytics solutions using the Apache Spark-based analytics service

Scribed by Phani Raj, Vinod Jaiswal


Publisher: Packt Publishing
Tongue: English
Leaves: 452
Category: Library

Synopsis


Get to grips with building and productionizing end-to-end big data solutions in Azure and learn best practices for working with large datasets

Key Features

  • Integrate with Azure Synapse Analytics, Cosmos DB, and Azure HDInsight Kafka Cluster to scale and analyze your projects and build pipelines
  • Use Databricks SQL to run ad hoc queries on your data lake and create dashboards
  • Productionize a solution using CI/CD for deploying notebooks and Azure Databricks Service to various environments

Book Description

Azure Databricks is a unified collaborative platform for performing scalable analytics in an interactive environment. The Azure Databricks Cookbook provides recipes to get hands-on with the analytics process, including ingesting data from various batch and streaming sources and building a modern data warehouse.

The book starts by teaching you how to create an Azure Databricks instance using the Azure portal, the Azure CLI, and ARM templates. You'll work through clusters in Databricks and explore recipes for ingesting data from sources including files, databases, and streaming sources such as Apache Kafka and Azure Event Hubs. The book will help you explore the features Azure Databricks supports for building powerful end-to-end data pipelines. You'll also find out how to build a modern data warehouse by using Delta tables and Azure Synapse Analytics. Later, you'll learn how to write ad hoc queries and extract meaningful insights from the data lake by creating visualizations and dashboards with Databricks SQL. Finally, you'll deploy and productionize a data pipeline, and deploy notebooks and the Azure Databricks service using continuous integration and continuous delivery (CI/CD).

By the end of this Azure book, you'll be able to use Azure Databricks to streamline different processes involved in building data-driven apps.

What you will learn

  • Read and write data from and to various Azure resources and file formats
  • Build a modern data warehouse with Delta Tables and Azure Synapse Analytics
  • Explore jobs, stages, and tasks and see how Spark lazy evaluation works
  • Handle concurrent transactions and learn performance optimization in Delta tables
  • Learn Databricks SQL and use it to create real-time dashboards
  • Integrate Azure DevOps for version control, deploying, and productionizing solutions with CI/CD pipelines
  • Discover how to use RBAC and ACLs to restrict data access
  • Build an end-to-end data processing pipeline for near real-time data analytics

Who this book is for

This recipe-based book is for data scientists, data engineers, big data professionals, and machine learning engineers who want to perform data analytics on their applications. Prior experience of working with Apache Spark and Azure is necessary to get the most out of this book.

Table of Contents

  1. Creating an Azure Databricks Service
  2. Reading and Writing Data from and to Various Azure Services and File Formats
  3. Understanding Spark Query Execution
  4. Working with Streaming Data
  5. Integrating with Azure Key Vault, App Configuration, and Log Analytics
  6. Exploring Delta Lake in Azure Databricks
  7. Implementing Near-Real-Time Analytics and Building a Modern Data Warehouse
  8. Databricks SQL
  9. DevOps Integrations and Implementing CI/CD for Azure Databricks
  10. Understanding Security and Monitoring in Azure Databricks

SIMILAR VOLUMES


Azure Databricks Cookbook: Accelerate an
Phani Raj, Vinod Jaiswal | Library | 2021 | Packt Publishing | English

Get to grips with building and productionizing end-to-end big data solutions in Azure and learn best practices for working with large datasets
Key Features
  • Integrate with Azure Synapse Analytics, Cosmos DB, and Azure HDInsight Kafka Cluster to scale and analyze your proj

Beginning Apache Spark Using Azure Datab
Robert Ilijason | Library | 2020 | Apress | English

Analyze vast amounts of data in record time using Apache Spark with Databricks in the Cloud. Learn the fundamentals, and more, of running analytics on large clusters in Azure and AWS, using Apache Spark with Databricks on top. Discover how to squeeze the most value out of your data at a mere frac

Data Engineering with Databricks Cookboo
Pulkit Chadha | Library | 2024 | Packt Publishing | English

Work through 70 recipes for implementing reliable data pipelines with Apache Spark, optimally store and process structured and unstructured data in Delta Lake, and use Databricks to orchestrate and govern your data
Key Features
  • Learn data ingestion, data transformation, and data management techniqu

Pro Spark Streaming: The Zen of Real-Tim
Zubair Nabi | Library | 2016 | Apress | English

Learn the right cutting-edge skills and knowledge to leverage Spark Streaming to implement a wide array of real-time, streaming applications. Pro Spark Streaming walks you through end-to-end real-time application development using real-world applications, data, and code. Taking an application-first