Data Quality and Record Linkage Techniques
Scribed by Thomas N. Herzog, Fritz J. Scheuren, William E. Winkler
- Publisher
- Springer-Verlag New York
- Year
- 2007
- Tongue
- English
- Leaves
- 224
- Edition
- 1
- Category
- Library
Synopsis
This book helps practitioners gain a deeper understanding, at an applied level, of the issues involved in improving data quality through editing, imputation, and record linkage. The first part of the book deals with methods and models. Here, we focus on the Fellegi-Holt edit-imputation model, the Little-Rubin multiple-imputation scheme, and the Fellegi-Sunter record linkage model. Brief examples are included to show how these techniques work.
In the second part of the book, the authors present real-world case studies in which one or more of these techniques are used. They cover a wide variety of application areas. These include mortgage guarantee insurance, medical, biomedical, highway safety, and social insurance as well as the construction of list frames and administrative lists.
Readers will find this book a mixture of practical advice, mathematical rigor, management insight, and philosophy. The long list of references at the end of the book enables readers to delve more deeply into the subjects discussed here. The authors also discuss the software that has been developed to apply the techniques described in the text.
Thomas N. Herzog, Ph.D., ASA is the Chief Actuary at the U.S. Department of Housing and Urban Development. He holds a Ph.D. in mathematics from the University of Maryland and is also an Associate of the Society of Actuaries. He is the author or co-author of books on Credibility Theory, Monte Carlo Methods, and Models for Quantifying Risk.
Fritz J. Scheuren, Ph.D., is a Vice President for Statistics with the National Opinion Research Center at the University of Chicago. He has a Ph.D. in statistics from the George Washington University and has published more than 300 papers and monographs. He is the 100th President of the American Statistical Association and a Fellow of both the American Statistical Association and the American Association for the Advancement of Science.
William E. Winkler, Ph.D., is Principal Researcher at the U.S. Census Bureau. He holds a Ph.D. in probability theory from Ohio State University and is a Fellow of the American Statistical Association. He has more than 130 papers in areas such as automated record linkage and data quality. He is the author or co-author of eight generalized software systems, some of which are used for production in the largest survey and administrative-list situations.
Table of Contents
Front Matter....Pages 1-1
Introduction....Pages 1-3
Front Matter....Pages 5-5
What is Data Quality and Why Should We Care?....Pages 7-15
Examples of Entities Using Data to their Advantage/Disadvantage....Pages 17-27
Properties of Data Quality and Metrics for Measuring It....Pages 29-35
Basic Data Quality Tools....Pages 37-48
Front Matter....Pages 49-49
Mathematical Preliminaries for Specialized Data Quality Techniques....Pages 51-60
Automatic Editing and Imputation of Sample Survey Data....Pages 61-80
Record Linkage - Methodology....Pages 81-92
Estimating the Parameters of the Fellegi-Sunter Record Linkage Model....Pages 93-106
Standardization and Parsing....Pages 107-114
Phonetic Coding Systems for Names....Pages 115-121
Blocking....Pages 123-130
String Comparator Metrics for Typographical Error....Pages 131-135
Front Matter....Pages 137-137
Duplicate FHA Single-Family Mortgage Records....Pages 139-149
Record Linkage Case Studies in the Medical, Biomedical, and Highway Safety Areas....Pages 151-158
Constructing List Frames and Administrative Lists....Pages 159-168
Social Security and Related Topics....Pages 169-177
Front Matter....Pages 179-179
Confidentiality: Maximizing Access to Micro-data while Protecting Privacy....Pages 181-199
Review of Record Linkage Software....Pages 201-207
Summary Chapter....Pages 209-210
Subjects
Database Management; Statistical Theory and Methods; Statistics for Social Science, Behavioral Science, Education, Public Policy, and Law
SIMILAR VOLUMES
This book helps practitioners gain a working understanding of issues involved in improving data quality through editing, imputation, and record linkage. The first part of the book deals with methods and models, focusing on the Fellegi-Holt edit-imputation model, the Little-Rubin multiple-im...
Data matching (also known as record or data linkage, entity resolution, object identification, or field matching) is the task of identifying, matching and merging records that correspond to the same entities from several databases or even within one database. Based on research in various domai...
Poor data quality can seriously hinder or damage the efficiency and effectiveness of organizations and businesses. The growing awareness of such repercussions has led to major public initiatives like the "Data Quality Act" in the USA and the "European 2003/98" directive of the European Parliam...