The Cloud Data Lake: A Guide to Building Robust Cloud Data Architecture
Scribed by Rukmani Gopalan
- Publisher: O'Reilly Media
- Year: 2023
- Tongue: English
- Leaves: 116
- Edition: 1
- Category: Library
Synopsis
More organizations than ever understand the importance of data lake architectures for deriving value from their data. Building a robust, scalable, and performant data lake remains a complex proposition, however, with a buffet of tools and options that need to work together to provide a seamless end-to-end pipeline from data to insights.
This book provides a concise yet comprehensive overview of the setup, management, and governance of a cloud data lake. Author Rukmani Gopalan, a product management leader and data enthusiast, guides data architects and engineers through the major aspects of working with a cloud data lake, from design considerations and best practices to data format optimizations, performance optimization, cost management, and governance.
- Learn the benefits of a cloud-based big data strategy for your organization
- Get guidance and best practices for designing performant and scalable data lakes
- Examine architecture and design choices, and data governance principles and strategies
- Build a data strategy that scales as your organizational and business needs increase
- Implement a scalable data lake in the cloud
- Use cloud-based advanced analytics to gain more value from your data
Table of Contents
- Big Data - Beyond the Buzz
1.1 What is Big Data?
1.2 Elastic Data Infrastructure - The Challenge
1.3 Cloud Computing Fundamentals
1.3.1 Value Proposition of the Cloud
1.4 Cloud Data Lake - Value Proposition
1.4.1 Limitations of on-premises data warehouse solutions
1.4.2 Big Data Processing on the Cloud
1.4.3 Benefits of a Cloud Data Lake Architecture
1.5 Defining your Cloud Data Lake Journey
Summary
- Big Data Architectures on the Cloud
2.1 Why Klodars Corporation moves to the cloud
2.2 Fundamentals of Cloud Data Lake Architectures
2.2.1 A Word on Variety of Data
2.2.2 Cloud Data Lake Storage
2.2.3 Big Data Analytics Engines
2.2.4 Cloud Data Warehouses
2.3 Modern Data Warehouse Architecture
2.3.1 Reference Architecture
2.3.2 Sample Use case for a Modern Data Warehouse Architecture
2.3.3 Benefits and Challenges of Modern Data Warehouse Architecture
2.4 Data Lakehouse Architecture
2.4.1 Reference architecture for Data Lakehouse
2.4.2 Sample Use case for Data Lakehouse Architecture
2.4.3 Benefits and Challenges of Data Lakehouse Architecture
2.4.4 Data warehouses and unstructured data
2.5 Data Mesh
2.5.1 Reference architecture
2.5.2 Sample Use Case for a Data Mesh Architecture
2.5.3 Challenges and Benefits of a Data Mesh Architecture
2.6 What is the right architecture for me?
2.6.1 Know your customers
2.6.2 Know your business drivers
2.6.3 Consider your growth and future scenarios
2.6.4 Design considerations
2.6.5 Hybrid approaches
Summary
- Design Considerations for Your Data Lake
3.1 Setting Up the Cloud Data Lake Infrastructure
3.1.1 Identify your goals
3.1.2 Plan your architecture and deliverables
3.1.3 Implement the cloud data lake
3.1.4 Release and operationalize
3.2 Organizing data in your data lake
3.2.1 A day in the life of data
3.2.2 Data Lake Zones
3.2.3 Organization mechanisms
3.3 Introduction to data governance
3.3.1 Actors involved in data governance
3.3.2 Data Classification
3.3.3 Metadata management, Data catalog, and Data sharing
3.3.4 Data Access Management
3.3.5 Data Quality and observability
3.3.6 Data Governance at Klodars Corporation
3.3.7 Data governance wrap up
3.4 Manage data lake costs
3.4.1 Demystifying data lake costs on the cloud
3.4.2 Data Lake Cost Strategy
Summary
SIMILAR VOLUMES
Learn using Cloud data technologies for improving data analytics and decision-making capabilities for your organization. Description Cloud data architectures are a valuable tool for organizations that want to use data to make better decisions. By understanding the different components of Cloud d
Cloud Data Centers and Cost Modeling establishes a framework for strategic decision-makers to facilitate the development of cloud data centers. Just as building a house requires a clear understanding of the blueprints, architecture, and costs of the project; building a cloud-based data cen