Simplifying Data Engineering and Analytics with Delta: Create analytics-ready data that fuels artificial intelligence and business intelligence

โœ Scribed by Anindita Mahapatra


Publisher: Packt Publishing
Year: 2022
Tongue: English
Leaves: 334
Category: Library

Synopsis


Explore how Delta brings reliability, performance, and governance to your data lake and all the AI and BI use cases built on top of it

Key Features

  • Learn Delta's core concepts and features as well as what makes it a perfect match for data engineering and analysis
  • Solve business challenges of different industry verticals using a scenario-based approach
  • Make optimal choices by understanding the various tradeoffs provided by Delta

Book Description

Delta helps you generate reliable insights at scale and simplifies architecture around data pipelines, allowing you to focus primarily on refining the use cases being worked on. This is especially important when you consider that existing architecture is frequently reused for new use cases.

In this book, you'll learn about the principles of distributed computing, data modeling techniques, and big data design patterns and templates that help solve end-to-end data flow problems for common scenarios and are reusable across use cases and industry verticals. You'll also learn how to recover from errors and the best practices around handling structured, semi-structured, and unstructured data using Delta. After that, you'll get to grips with features such as ACID transactions on big data, disciplined schema evolution, time travel to help rewind a dataset to a different time or version, and unified batch and streaming capabilities that will help you build agile and robust data products.

By the end of this Delta book, you'll be able to use Delta as the foundational block for creating analytics-ready data that fuels all AI/BI use cases.

What you will learn

  • Explore the key challenges of traditional data lakes
  • Appreciate the unique features of Delta that come out of the box
  • Address reliability, performance, and governance concerns using Delta
  • Analyze the open data format for an extensible and pluggable architecture
  • Handle multiple use cases to support BI, AI, streaming, and data discovery
  • Discover how common data and machine learning design patterns are executed on Delta
  • Build and deploy data and machine learning pipelines at scale using Delta

Who this book is for

Data engineers, data scientists, ML practitioners, BI analysts, or anyone in the data domain working with big data will be able to put their knowledge to work with this practical guide to executing pipelines and supporting diverse use cases using the Delta protocol. Basic knowledge of SQL, Python programming, and Spark is required to get the most out of this book.

Table of Contents

  1. An Introduction to Data Engineering
  2. Data Modeling and ETL
  3. Delta – The Foundation Block for Big Data
  4. Unifying Batch and Streaming with Delta
  5. Data Consolidation in Delta Lake
  6. Solving Common Data Pattern Scenarios with Delta
  7. Delta for Data Warehouse Use Cases
  8. Handling Atypical Data Scenarios with Delta
  9. Delta for Reproducible Machine Learning Pipelines
  10. Delta for Data Products and Services
  11. Operationalizing Data and ML Pipelines
  12. Optimizing Cost and Performance with Delta
  13. Managing Your Data Journey

SIMILAR VOLUMES


Data Analytics: Models and Algorithms fo
โœ Thomas A. Runkler (auth.) ๐Ÿ“‚ Library ๐Ÿ“… 2012 ๐Ÿ› Vieweg+Teubner Verlag ๐ŸŒ English

<p>This book is a comprehensive introduction to the methods and algorithms and approaches of modern data analytics. It covers data preprocessing, visualization, correlation, regression, forecasting, classification, and clustering. It provides a sound mathematical basis, discusses advantages and draw

Data Analytics: Models and Algorithms fo
โœ Thomas A. Runkler ๐Ÿ“‚ Library ๐Ÿ“… 2016 ๐Ÿ› Springer Vieweg ๐ŸŒ English

This book is a comprehensive introduction to the methods and algorithms of modern data analytics. It provides a sound mathematical basis, discusses advantages and drawbacks of different approaches, and enables the reader to design and implement data analytics solutions for real-world applications. T

Data Analytics: Models and Algorithms fo
โœ Thomas A. Runkler ๐Ÿ“‚ Library ๐Ÿ“… 2020 ๐Ÿ› Vieweg + Teubner Verlag ๐ŸŒ English

This book is a comprehensive introduction to the methods and algorithms of modern data analytics. It provides a sound mathematical basis, discusses advantages and drawbacks of different approaches, and enables the reader to design and implement data analytics solutions for real-world applications. T