MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203

โœ Scribed by Benjamin Perkins


Publisher: Sybex
Year: 2023
Tongue: English
Leaves: 1011
Edition: 1
Category: Library


✦ Synopsis


Prepare for the Azure Data Engineering certification, and an exciting new career in analytics, with this must-have study aid

In the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203, accomplished data engineer and tech educator Benjamin Perkins delivers a hands-on, practical guide to preparing for the challenging Azure Data Engineer certification and for a new career in an exciting and growing field of tech.

In the book, you'll explore all the objectives covered on the DP-203 exam while learning the job roles and responsibilities of a newly minted Azure data engineer. As you learn to integrate, transform, and consolidate data from various structured and unstructured data systems into a structure suitable for building analytics solutions, you'll get up to speed quickly and efficiently with Sybex's easy-to-use study aids and tools.

This Study Guide also offers:

  • Career-ready advice for anyone hoping to ace their first data engineering job interview and excel on their first day in the field
  • Indispensable tips and tricks to familiarize yourself with the DP-203 exam structure and help reduce test anxiety
  • Complimentary access to Sybex's expansive online study tools, accessible across multiple devices and offering hundreds of bonus practice questions, electronic flashcards, and a searchable digital glossary of key terms

A one-of-a-kind study aid designed to help you get straight to the crucial material you need to succeed on the exam and on the job, the MCA Microsoft Certified Associate Azure Data Engineer Study Guide: Exam DP-203 belongs on the bookshelves of anyone hoping to increase their data analytics skills, advance their data engineering career with an in-demand certification, or make a career change into a popular new area of tech.

✦ Table of Contents


Cover Page
Title Page
Copyright Page
Acknowledgments
About the Author
About the Technical Editor
Contents at a Glance
Contents
Table of Exercises
Introduction
Part I Azure Data Engineer Certification and Azure Products
Chapter 1 Gaining the Azure Data Engineer Associate Certification
The Journey to Certification
How to Pass Exam DP-203
Understanding the Exam Expectations and Requirements
Use Azure Daily
Read Azure Articles to Stay Current
Have an Understanding of All Azure Products
Azure Product Name Recognition
Azure Data Analytics
Azure Synapse Analytics
Azure Databricks
Azure HDInsight
Azure Analysis Services
Azure Data Factory
Azure Event Hubs
Azure Stream Analytics
Other Products
Azure Storage Products
Azure Data Lake Storage
Azure Storage
Other Products
Azure Databases
Azure Cosmos DB
Azure SQL Server Products
Additional Azure Databases
Other Products
Azure Security
Azure Active Directory
Role-Based Access Control
Attribute-Based Access Control
Azure Key Vault
Other Products
Azure Networking
Virtual Networks
Other Products
Azure Compute
Azure Virtual Machines
Azure Virtual Machine Scale Sets
Azure App Service Web Apps
Azure Functions
Azure Batch
Azure Management and Governance
Azure Monitor
Azure Purview
Azure Policy
Azure Blueprints (Preview)
Azure Lighthouse
Azure Cost Management and Billing
Other Products
Summary
Exam Essentials
Review Questions
Chapter 2 CREATE DATABASE dbName
The Brainjammer
A Historical Look at Data
Variety
Velocity
Volume
Data Locations
Data File Formats
Data Structures, Types, and Concepts
Data Structures
Data Types and Management
Data Concepts
Data Programming and Querying for Data Engineers
Data Programming
Querying Data
Understanding Big Data Processing
Big Data Stages
ETL, ELT, ELTL
Analytics Types
Big Data Layers
Summary
Exam Essentials
Review Questions
Part II Design and Implement Data Storage
Chapter 3 Data Sources and Ingestion
Where Does Data Come From?
Design a Data Storage Structure
Design an Azure Data Lake Solution
Recommended File Types for Storage
Recommended File Types for Analytical Queries
Design for Efficient Querying
Design for Data Pruning
Design a Folder Structure That Represents the Levels of Data Transformation
Design a Distribution Strategy
Design a Data Archiving Solution
Design a Partition Strategy
Design a Partition Strategy for Files
Design a Partition Strategy for Analytical Workloads
Design a Partition Strategy for Efficiency and Performance
Design a Partition Strategy for Azure Synapse Analytics
Identify When Partitioning Is Needed in Azure Data Lake Storage Gen2
Design the Serving/Data Exploration Layer
Design Star Schemas
Design Slowly Changing Dimensions
Design a Dimensional Hierarchy
Design a Solution for Temporal Data
Design for Incremental Loading
Design Analytical Stores
Design Metastores in Azure Synapse Analytics and Azure Databricks
The Ingestion of Data into a Pipeline
Azure Synapse Analytics
Azure Data Factory
Azure Databricks
Event Hubs and IoT Hub
Azure Stream Analytics
Apache Kafka for HDInsight
Migrating and Moving Data
Summary
Exam Essentials
Review Questions
Chapter 4 The Storage of Data
Implement Physical Data Storage Structures
Implement Compression
Implement Partitioning
Implement Sharding
Implement Different Table Geometries with Azure Synapse Analytics Pools
Implement Data Redundancy
Implement Distributions
Implement Data Archiving
Azure Synapse Analytics Develop Hub
Implement Logical Data Structures
Build a Temporal Data Solution
Build a Slowly Changing Dimension
Build a Logical Folder Structure
Build External Tables
Implement File and Folder Structures for Efficient Querying and Data Pruning
Implement a Partition Strategy
Implement a Partition Strategy for Files
Implement a Partition Strategy for Analytical Workloads
Implement a Partition Strategy for Streaming Workloads
Implement a Partition Strategy for Azure Synapse Analytics
Design and Implement the Data Exploration Layer
Deliver Data in a Relational Star Schema
Deliver Data in Parquet Files
Maintain Metadata
Implement a Dimensional Hierarchy
Create and Execute Queries by Using a Compute Solution That Leverages SQL Serverless and Spark Cluster
Recommend Azure Synapse Analytics Database Templates
Implement Azure Synapse Analytics Database Templates
Additional Data Storage Topics
Storing Raw Data in Azure Databricks for Transformation
Storing Data Using Azure HDInsight
Storing Prepared, Trained, and Modeled Data
Summary
Exam Essentials
Review Questions
Part III Develop Data Processing
Chapter 5 Transform, Manage, and Prepare Data
Ingest and Transform Data
Transform Data Using Azure Synapse Pipelines
Transform Data Using Azure Data Factory
Transform Data Using Apache Spark
Transform Data Using Transact-SQL
Transform Data Using Stream Analytics
Cleanse Data
Split Data
Shred JSON
Encode and Decode Data
Configure Error Handling for the Transformation
Normalize and Denormalize Values
Transform Data by Using Scala
Perform Exploratory Data Analysis
Transformation and Data Management Concepts
Transformation
Data Management
Azure Databricks
Data Modeling and Usage
Data Modeling with Machine Learning
Usage
Summary
Exam Essentials
Review Questions
Chapter 6 Create and Manage Batch Processing and Pipelines
Design and Develop a Batch Processing Solution
Design a Batch Processing Solution
Develop Batch Processing Solutions
Create Data Pipelines
Handle Duplicate Data
Handle Missing Data
Handle Late-Arriving Data
Upsert Data
Configure the Batch Size
Configure Batch Retention
Design and Develop Slowly Changing Dimensions
Design and Implement Incremental Data Loads
Integrate Jupyter/IPython Notebooks into a Data Pipeline
Revert Data to a Previous State
Handle Security and Compliance Requirements
Design and Create Tests for Data Pipelines
Scale Resources
Design and Configure Exception Handling
Debug Spark Jobs Using the Spark UI
Implement Azure Synapse Link and Query the Replicated Data
Use PolyBase to Load Data to a SQL Pool
Read from and Write to a Delta Table
Manage Batches and Pipelines
Trigger Batches
Schedule Data Pipelines
Validate Batch Loads
Implement Version Control for Pipeline Artifacts
Manage Data Pipelines
Manage Spark Jobs in a Pipeline
Handle Failed Batch Loads
Summary
Exam Essentials
Review Questions
Chapter 7 Design and Implement a Data Stream Processing Solution
Develop a Stream Processing Solution
Design a Stream Processing Solution
Create a Stream Processing Solution
Process Time Series Data
Design and Create Windowed Aggregates
Process Data Within One Partition
Process Data Across Partitions
Upsert Data
Handle Schema Drift
Configure Checkpoints/Watermarking During Processing
Replay Archived Stream Data
Design and Create Tests for Data Pipelines
Monitor for Performance and Functional Regressions
Optimize Pipelines for Analytical or Transactional Purposes
Scale Resources
Design and Configure Exception Handling
Handle Interruptions
Ingest and Transform Data
Transform Data Using Azure Stream Analytics
Monitor Data Storage and Data Processing
Monitor Stream Processing
Summary
Exam Essentials
Review Questions
Part IV Secure, Monitor, and Optimize Data Storage and Data Processing
Chapter 8 Keeping Data Safe and Secure
Design Security for Data Policies and Standards
Design a Data Auditing Strategy
Design a Data Retention Policy
Design for Data Privacy
Design to Purge Data Based on Business Requirements
Design Data Encryption for Data at Rest and in Transit
Design Row-Level and Column-Level Security
Design a Data Masking Strategy
Design Access Control for Azure Data Lake Storage Gen2
Implement Data Security
Implement a Data Auditing Strategy
Manage Sensitive Information
Implement a Data Retention Policy
Encrypt Data at Rest and in Motion
Implement Row-Level and Column-Level Security
Implement Data Masking
Manage Identities, Keys, and Secrets Across Different Data Platform Technologies
Implement Access Control for Azure Data Lake Storage Gen2
Implement Secure Endpoints (Private and Public)
Implement Resource Tokens in Azure Databricks
Load a DataFrame with Sensitive Information
Write Encrypted Data to Tables or Parquet Files
Develop a Batch Processing Solution
Handle Security and Compliance Requirements
Design and Implement the Data Exploration Layer
Browse and Search Metadata in Microsoft Purview Data Catalog
Push New or Updated Data Lineage to Microsoft Purview
Summary
Exam Essentials
Review Questions
Chapter 9 Monitoring Azure Data Storage and Processing
Monitoring Data Storage and Data Processing
Implement Logging Used by Azure Monitor
Configure Monitoring Services
Understand Custom Logging Options
Measure Query Performance
Monitor Data Pipeline Performance
Monitor Cluster Performance
Measure Performance of Data Movement
Interpret Azure Monitor Metrics and Logs
Monitor and Update Statistics about Data Across a System
Schedule and Monitor Pipeline Tests
Interpret a Spark Directed Acyclic Graph
Monitor Stream Processing
Implement a Pipeline Alert Strategy
Develop a Batch Processing Solution
Design and Create Tests for Data Pipelines
Develop a Stream Processing Solution
Monitor for Performance and Functional Regressions
Design and Create Tests for Data Pipelines
Azure Monitoring Overview
Azure Batch
Azure Key Vault
Azure SQL
Summary
Exam Essentials
Review Questions
Chapter 10 Troubleshoot Data Storage Processing
Optimize and Troubleshoot Data Storage and Data Processing
Optimize Resource Management
Compact Small Files
Handle Skew in Data
Handle Data Spill
Find Shuffling in a Pipeline
Tune Shuffle Partitions
Tune Queries by Using Indexers
Tune Queries by Using Cache
Optimize Pipelines for Analytical or Transactional Purposes
Optimize Pipeline for Descriptive versus Analytical Workloads
Troubleshoot a Failed Spark Job
Troubleshoot a Failed Pipeline Run
Rewrite User-Defined Functions
Design and Develop a Batch Processing Solution
Design and Configure Exception Handling
Debug Spark Jobs by Using the Spark UI
Scale Resources
Monitor Batches and Pipelines
Handle Failed Batch Loads
Design and Develop a Stream Processing Solution
Optimize Pipelines for Analytical or Transactional Purposes
Handle Interruptions
Scale Resources
Summary
Exam Essentials
Review Questions
Appendix Answers to Review Questions
Chapter 1: Gaining the Azure Data Engineer Associate Certification
Chapter 2: CREATE DATABASE dbName
Chapter 3: Data Sources and Ingestion
Chapter 4: The Storage of Data
Chapter 5: Transform, Manage, and Prepare Data
Chapter 6: Create and Manage Batch Processing and Pipelines
Chapter 7: Design and Implement a Data Stream Processing Solution
Chapter 8: Keeping Data Safe and Secure
Chapter 9: Monitoring Azure Data Storage and Processing
Chapter 10: Troubleshoot Data Storage Processing
Index
EULA

