Practical Data Quality: Learn practical, real-world strategies to transform the quality of data in your organization
Scribed by Robert Hawker
- Publisher: Packt Publishing
- Year: 2023
- Tongue: English
- Leaves: 318
- Category: Library
Synopsis
Identify data quality issues, leverage real-world examples and templates to drive change, and unlock the benefits of improved data in processes and decision-making
Key Features
- Get a practical explanation of data quality concepts and the imperative for change when data is poor
- Gain insights into linking business objectives and data to drive the right data quality priorities
- Explore the data quality lifecycle and accelerate improvement with the help of real-world examples
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description
Poor data quality can lead to increased costs, hinder revenue growth, compromise decision-making, and introduce risk into organizations. This leads to employees, customers, and suppliers finding every interaction with the organization frustrating.
Practical Data Quality provides a comprehensive view of managing data quality within your organization, covering everything from business cases through to embedding improvements that you make to the organization permanently. Each chapter explains a key element of data quality management, from linking strategy and data together to profiling and designing business rules which reveal bad data. The book outlines a suite of tried-and-tested reports that highlight bad data and allow you to develop a plan to make corrections. Throughout the book, you'll work with real-world examples and utilize re-usable templates to accelerate your initiatives.
By the end of this book, you'll have gained a clear understanding of every stage of a data quality initiative and be able to drive tangible results for your organization at pace.
What you will learn
- Explore data quality and see how it fits within a data management programme
- Differentiate your organization from its peers through data quality improvement
- Create a business case and get support for your data quality initiative
- Find out how business strategy can be linked to processes, analytics, and data to derive only the most important data quality rules
- Monitor data through engaging, business-friendly data quality dashboards
- Integrate data quality into everyday business activities to help achieve goals
- Avoid common mistakes when implementing data quality practices
Who this book is for
This book is for data analysts, data engineers, and chief data officers looking to understand data quality practices and their implementation in their organization. This book will also be helpful for business leaders who see data adversely affecting their success and data teams that want to optimize their data quality approach. No prior knowledge of data quality basics is required.
Table of Contents
- The Impact of Data Quality on Organizations
- The Basics of Data Quality
- The Business Case for Data Quality
- Data Quality Roles and Their Challenges
- Data Discovery
- Data Quality Rules
- Monitoring Data Against Rules
- Data Quality Remediation
- Embedding Data Quality into Organizations
- Best Practices and Common Mistakes
Table of Contents
Cover
Title Page
Copyright and Credits
Dedication
Foreword
Contributors
Table of Contents
Preface
Part 1 – Getting Started
Chapter 1: The Impact of Data Quality on Organizations
The value of this book
Importance of executive support
Detailed definition of bad data
Bad data versus perfect data
Impact of bad data quality
Quantification of the impact of bad data
Impacts of bad data in depth
Process and efficiency impacts
Reporting and analytics impacts
Compliance impacts
Data differentiation impacts
Causes of bad data
Lack of a data culture
Prioritizing process speed over data governance
Mergers and acquisitions
Summary
References
Chapter 2: The Principles of Data Quality
Data quality in the wider context of data governance
Data governance as a discipline
Data governance tools and MDM
How data quality fits into data governance and MDM
Generally accepted principles and terminology of data quality
The basic terms of data quality defined
Data quality dimensions
Stakeholders in data quality initiatives
Different stakeholder types and their roles
The data quality improvement cycle
Business case
Data discovery
Rule development
Monitoring
Remediation
Embedding into BAU
Summary
References
Chapter 3: The Business Case for Data Quality
Activities, components, and costs
Activities in a data quality initiative
Early phases
Planning and business case phase
Developing quantitative benefit estimates
Example – the difficulty of calculating quantitative benefits
Strategies for quantification
Developing qualitative benefits
Surveys and focus groups
Outlining data quality qualitative risks in depth
Anticipating leadership challenges
The "Excel will do the job" challenge
Ownership of ongoing costs challenge
The excessive cost challenge
The "Why do we need a data quality tool?" challenge
Summary
Chapter 4: Getting Started with a Data Quality Initiative
The first few weeks after budget approval
Key activities in those early weeks
Understanding data quality workstreams
Workstreams required early on
Identifying the right people for your team
Mapping resources to the workstreams
Summary
Part 2 – Understanding and Monitoring the Data That Matters
Chapter 5: Data Discovery
An overview of the data discovery process
Understanding business strategy, objectives, and challenges
Approaches to stakeholder identification
Content of stakeholder conversations
The hierarchy of strategy, objectives, processes, analytics, and data
Prioritizing using strategy
Linking challenges to processes, data, and reporting
Basics of data profiling
Typical tool data profiling capabilities
Using these capabilities
Connecting to data
Summary
Chapter 6: Data Quality Rules
An introduction to data quality rules
Rule scope
The key features of data quality rules
Rule weightings
Rule dimensions
Rule priorities
Rule thresholds
Cost per failure
Implementing data quality rules
Designing rules
Building data quality rules
Testing data quality rules
Summary
Chapter 7: Monitoring Data Against Rules
Introduction to data quality reporting
Different levels of reporting
Data security considerations
Designing a high-level data quality dashboard
Dimensions and filters
Designing a Rule Results Report
Typical features of the Rule Results Report
Designing Failed Data Reports
Typical features of the Failed Data Reports
Re-using Failed Data Reports
Multiple Failed Data Reports
Exporting Failed Data Reports
Managing inactive and duplicate data
Managing inactive data
Managing duplicate data
Detecting duplicates
Presenting findings to stakeholders
Launching data quality reporting successfully
Embedding reports into governance
Summary
Part 3 – Improving Data Quality for the Long Term
Chapter 8: Data Quality Remediation
Overall remediation process
Prioritizing remediation activities
Revisiting benefits
Approach to determining priorities
Identifying the approach to remediation
Typical remediation approaches
Matching issues to the correct approach
Moving remediation to business as usual
Understanding the effort and cost
Types of cost in remediation
Governing remediation activities
Key governance activities
Tracking benefits
Quantitative example
Qualitative benefit tracking
Summary
Chapter 9: Embedding Data Quality in Organizations
Preventing issue re-occurrence
Methods to prevent re-occurrence
The ongoing impact of human error
Short-horizon reporting
Ongoing data quality rule improvement
Strategies to identify rule changes
Updating data quality rules
Transitioning to day-to-day remediation
Requirements for success
Planning for a successful transition
Indications that the transition has been successful
Continuing the data quality journey
Roadmap of data quality initiatives
Identifying the next initiative
Obtaining support
What if no further initiative is sanctioned?
Summary
Chapter 10: Best Practices and Common Mistakes
Best practices
Selecting the best practices
Manage data quality primarily at the source
Implementing supporting governance meetings
Including data quality in an organization-wide education program
Leveraging the data steward and producer relationship
Best practices throughout this book
Common mistakes
Failure to implement best practices
A lack of practicality
Technically driven data quality rules
One-off remediation activity
Restricting access to data quality results
Avoid silos in data quality work
The future of data quality work
LLMs
Greater emphasis on high-quality data in organizations
Summary
Index
About Packt
Other Books You May Enjoy
SIMILAR VOLUMES
Written by an experienced researcher in the field of qualitative methods, this dynamic new book provides a definitive introduction to analysing qualitative data. It is a clear, accessible and practical guide
There are awesome discoveries to be made and valuable stories to be told in datasets, and this book will help you uncover them. Whether you already work with data or just want to understand its possibilities, the techniques and advice in this practical book will help you learn how to better
The issue of data quality is as old as data itself. However, the proliferation of diverse, large-scale and often publicly available data on the Web has increased the risk of poor data quality and misleading data interpretations. On the other hand, data is now exposed at a much more strategic