Data Lakehouse in Action: Architecting a modern and scalable data analytics platform

Scribed by Pradeep Menon


Publisher: Packt Publishing
Year: 2022
Tongue: English
Leaves: 206
Category: Library

Synopsis


Propose a new scalable data architecture paradigm, Data Lakehouse, that addresses the limitations of current data architecture patterns

Key Features

  • Understand how data is ingested, stored, served, governed, and secured for enabling data analytics
  • Explore a practical way to implement Data Lakehouse using cloud computing platforms like Azure
  • Combine multiple architectural patterns based on an organization's needs and maturity level

Book Description

The Data Lakehouse architecture is a new paradigm that enables large-scale analytics. This book will guide you in developing data architecture in the right way to ensure your organization's success.

The first part of the book discusses the data architecture patterns used in the past, the drivers of change, and the resulting need for a new architectural paradigm. It covers the principles that govern the target architecture, the components that form the Data Lakehouse architecture, and the rationale for those components. The second part dives deep into the layers of the Data Lakehouse. It covers scenarios and components for data ingestion, storage, data processing, data serving, analytics, governance, and data security. The third part focuses on the practical implementation of the Data Lakehouse architecture on a cloud computing platform. It also covers ways to combine the Data Lakehouse pattern into macro-patterns, such as Data Mesh and Data Hub-Spoke, based on the organization's needs and maturity level. The frameworks introduced are practical, and organizations can readily benefit from applying them.

By the end of this book, you'll clearly understand how to implement the Data Lakehouse architecture pattern in a scalable, agile, and cost-effective manner.

What you will learn

  • Understand the evolution of the Data Architecture patterns for analytics
  • Become well versed in the Data Lakehouse pattern and how it enables data analytics
  • Focus on methods to ingest, process, store, and govern data in a Data Lakehouse architecture
  • Learn techniques to serve data and perform analytics in a Data Lakehouse architecture
  • Cover methods to secure the data in a Data Lakehouse architecture
  • Implement Data Lakehouse in a cloud computing platform such as Azure
  • Combine Data Lakehouse in a macro-architecture pattern such as Data Mesh

Who this book is for

This book is for data architects, big data engineers, data strategists and practitioners, data stewards, and cloud computing practitioners who want to become well versed in modern data architecture patterns that enable large-scale analytics. Basic knowledge of data architecture and familiarity with data warehousing concepts are required.

Table of Contents

  1. Introducing the Evolution of Data Analytics Patterns
  2. The Data Lakehouse Architecture Overview
  3. Ingesting and Processing Data in a Data Lakehouse
  4. Storing and Serving Data in a Data Lakehouse
  5. Deriving Insights from a Data Lakehouse
  6. Applying Data Governance in a Data Lakehouse
  7. Applying Data Security in a Data Lakehouse
  8. Implementing a Data Lakehouse on Microsoft Azure
  9. Scaling the Data Lakehouse Architecture

Table of Contents


Cover
Title Page
Copyright
Dedication
Contributors
Table of Contents
Preface
PART 1: Architectural Patterns for Analytics
Chapter 1: Introducing the Evolution of Data Analytics Patterns
Discovering the enterprise data warehouse era
Exploring the five factors of change
The exponential growth of data
The increase in compute
The decrease in storage cost
The rise of artificial intelligence
The advancement of cloud computing
Investigating the data lake era
Introducing the data lakehouse paradigm
Summary
Further reading
Chapter 2: The Data Lakehouse Architecture Overview
Developing a system context for a data lakehouse
Data providers
Data consumers
Developing a logical data lakehouse architecture
Data ingestion layer
Data lake layer
Data processing layer
Data serving layer
Data analytics layer
Data governance layer
Data security layer
Developing architecture principles
Disciplined at the core, flexible at the edges
Decouple compute and storage
Focus on functionality rather than technology
Create a modular architecture
Perform active cataloging
Summary
Further reading
PART 2: Data Lakehouse Component Deep Dive
Chapter 3: Ingesting and Processing Data in a Data Lakehouse
Ingesting and processing batch data
Differences between the ETL and ELT patterns
Batch data processing in a data lakehouse
Ingesting and processing streaming data
Streaming data sources
Extraction-load
Transform-load
Bringing it all together
The batch layer
The speed layer
The serving layer
Summary
Further reading
Chapter 4: Storing and Serving Data in a Data Lakehouse
Storing data in the data lake layer
Data lake layer
Common data formats
Storing data in the data serving layer
SQL-based serving
NoSQL-based serving
Data-sharing technology
Summary
Further reading
Chapter 5: Deriving Insights from the Data Lakehouse
Discussing the themes of analytics capabilities
Descriptive analytics
Advanced analytics
Enabling analytics capabilities in a data lakehouse
The analytics sandbox service
The business intelligence service
The AI service
Summary
Further reading
Chapter 6: Applying Data Governance in the Data Lakehouse
The 3-3-3 framework for data governance
The three objectives of data governance
The three pillars of data governance
The three components of the data governance layer
Implementing data governance policy management
Implementing the data catalog
Implementing data quality
Summary
Further reading
Chapter 7: Applying Data Security in a Data Lakehouse
Realizing the data security components in a data lakehouse
Using IAM in a data lakehouse
Methods of data encryption in a data lakehouse
Methods of data masking in a data lakehouse
Methods of implementing network security in a data lakehouse
Summary
Further reading
PART 3: Implementing and Governing a Data Lakehouse
Chapter 8: Implementing a Data Lakehouse on Microsoft Azure
Why is cloud computing apt for implementing a data lakehouse?
The rapid advancements in cloud computing facilitate data analytics
Architectural flexibility is native to the cloud
Cloud computing enables tailored cost control
Implementing a data lakehouse on Microsoft Azure
The data ingestion layer on Microsoft Azure
The data processing layer on Microsoft Azure
Summary
Further reading
Chapter 9: Scaling the Data Lakehouse Architecture
The need for a macro-architectural pattern for analytics
Implementing a data lakehouse in a macro-architectural pattern
The hub-spoke pattern
The data mesh pattern
Choosing between hub-spoke and data mesh
Summary
Further reading
Index
About Packt
Other Books You May Enjoy


SIMILAR VOLUMES


Deciphering Data Architectures: Choosing
James Serra · Library · 2023 · O'Reilly Media · English

Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of each architecture to h

Deciphering Data Architectures: Choosing
James Serra · Library · 2024 · O'Reilly Media · English

Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of each architecture to help

Practical Lakehouse Architecture: Design
Gaurav Ashok Thalpati · Library · 2024 · O'Reilly Media · English

This concise yet comprehensive guide explains how to adopt a data lakehouse architecture to implement modern data platforms. It reviews the design considerations, challenges, and best practices for implementing a lakehouse and provides key insights into the ways that using a lakehouse can i

Practical Lakehouse Architecture: Design
Gaurav Ashok Thalpati · Library · 2024 · O'Reilly Media · English

This concise yet comprehensive guide explains how to adopt a data lakehouse architecture to implement modern data platforms. It reviews the design considerations, challenges, and best practices for implementing a lakehouse and provides key insights into the ways that using a lakehouse can impact