Data confidentiality
Scribed by Jerome P. Reiter
- Book ID
- 104602992
- Publisher
- Wiley (John Wiley & Sons)
- Year
- 2011
- Language
- English
- File size
- 132 KB
- Volume
- 3
- Category
- Article
- ISSN
- 0163-1829
- DOI
- 10.1002/wics.174
Synopsis
When releasing data to the public, data disseminators are typically required to protect the confidentiality of survey respondents' identities and attribute values. Removing direct identifiers such as names and addresses is generally not sufficient to eliminate disclosure risks, so the data must be altered before release to limit the risk of unintended disclosures. When intense data alteration is needed to ensure protection, the quality of the released data can be seriously degraded. This article reviews a disclosure limitation approach called synthetic data, in which values of confidential data are replaced with simulations from statistical models. Theoretical and empirical investigations have shown that synthetic data approaches have the potential to yield higher data quality than other disclosure limitation procedures, particularly when intense data alteration is necessary. The article discusses the two main variants of synthetic data approaches, full synthesis and partial synthesis, and covers synthetic data generation and disclosure risk assessment.
SIMILAR VOLUMES
The conventional approach to preserving the confidentiality of health records aggregates all records within a geographical area that has a population large enough to ensure prevention of disclosure. Though this approach normally protects the privacy of individuals, the use of such aggregated data li