𝔖 Scriptorium
✦   LIBER   ✦

Implementing a Modern Data Catalog to Power Data Intelligence: Make Trustworthy Data Central to Your Organization

✍ Scribed by Fadi Maali, Jason Lim


Publisher
O’Reilly Media, Inc.
Year
2023
Tongue
English
Leaves
38
Category
Library

✦ Synopsis


Are you looking to use data as a strategic asset in your organization, so that more people can make better, data-driven decisions and accelerate time to value? This report explains how. Whether you're working on self-service analytics, data governance, or cloud data migration, authors Fadi Maali, an experienced data engineer and the lead editor of the DCAT Specification, and Jason Lim, director of product and cloud marketing at Alation, show you why a data catalog is the starting point and center of all of it.

Modern data catalogs are collections of metadata describing data assets and their usage. They provide relevant functionality to support metadata management, enrichment, and search. Not only do these catalogs help you find relevant data, they also guide you through the data's proper use. This report shows you how a data catalog can help you easily find and then use the data you need.

A data catalog is a collection of metadata describing data assets and their usage. Modern data catalogs provide relevant functionality to support metadata management, enrichment, and search. They not only help users find relevant data but guide them on proper use of that data. Data catalogs help answer the questions:

• How can I find relevant data?
• Once I find data, can I use it?
• Should I use it?
• How should I use it?

Cataloging and managing metadata in enterprises is not a new practice. Metadata repositories have existed since the 1970s and relational databases have had metadata catalogs since their early days. However, in the years since, the technology surrounding data and the role of data in the enterprise have both changed substantially.

Enterprise data landscapes have grown more sophisticated: the “3 Vs” of big data (volume, velocity, and variety) are widely known. And the legislative environment mandating compliant data usage continues to grow in complexity as more people (and AI-powered programs) access and use data in new ways. Moreover, the growing adoption of cloud computing and SaaS results in more data residing outside the enterprise infrastructure and control. As a result, collecting, managing, and using comprehensive and accurate metadata has become paramount, and modern data catalogs are the tools that enable best practices.

Modern data catalogs have grown in maturity and sophistication to address new and increasingly complex challenges. They now provide a comprehensive set of functionalities to integrate with other enterprise data tools and to support automatic collection and enrichment of metadata, using advanced techniques such as machine learning, natural language processing, and crowdsourcing.

✦ Table of Contents


  1. Data Catalogs
    What Is in a Data Catalog?
    Data Catalog Features and Example Applications
    A Framework to Characterize Data Catalogs
    Summary
  2. Types of Data Catalogs
    Tool-Adjunct Data Catalogs
      Broad Connectivity
      Intelligence
      Active Governance
    Domain-Specific Catalogs
      Broad Connectivity
      Intelligence
      Active Governance
    Data Catalog Platforms
      Broad Connectivity
      Intelligence
      Active Governance
    Summary
  3. Implementing a Data Catalog
    Data Catalog in an Enterprise Data Stack
      Enterprise Data Lakes
      The Modern Data Stack
      Data Mesh
      Data Fabric
    Successful Implementation of Data Catalogs
      Accommodate Existing Workflows for Data Users
      Focus on People
      Focus on Business and Technical Metadata
      Have an Adoption Plan
      Measure Adoption and Impact of the Data Catalog
    Summary
  4. Enterprise Data Catalog Business Impact
    Catalog Business Impact
    Catalog Use Cases
      Self-Service Business Intelligence
      Data Governance and Guided Data Usage
      Data Operations
      Cloud and Multicloud Migration
    Summary
  5. Conclusion
    About the Authors

📜 SIMILAR VOLUMES


Fundamentals of Data Observability: Impl
✍ Andy Petrella 📂 Library 📅 2023 🏛 O'Reilly Media 🌐 English

Quickly detect, troubleshoot, and prevent a wide range of data issues through data observability, a set of best practices that enables data teams to gain greater visibility of data and its usage. If you're a data engineer, data architect, or machine learning engineer who depends on the quality of yo

Guide to Intelligent Data Analysis: How
✍ Michael R. Berthold, Christian Borgelt, Frank Höppner, Frank Klawonn 📂 Library 📅 2010 🏛 Springer 🌐 English

Each passing year bears witness to the development of ever more powerful computers, increasingly fast and cheap storage media, and even higher bandwidth data connections. This makes it easy to believe that we can now – at least in principle - solve any problem we are faced with so long as we o

Guide to Intelligent Data Analysis: How
✍ Berthold, Michael R.; Borgelt, Christian; Höppner, Frank et al. 📂 Library 📅 2010 🏛 Springer London : Imprint : Springer 🌐 English

Each passing year bears witness to the development of ever more powerful computers, increasingly fast and cheap storage media, and even higher bandwidth data connections. This makes it easy to believe that we can now - at least in principle - solve any problem we are faced with so long as we only ha