
Journal of Biomedical Informatics

Volume 50, August 2014, Pages 205-212
Privacy-preserving record linkage on large real world datasets

https://doi.org/10.1016/j.jbi.2013.12.003
Under an Elsevier user license
open archive

Highlights

  • Presents a probabilistic linkage framework for Schnell’s Bloom filter method.

  • Largest evaluation of the Bloom filter method for privacy-preserving record linkage.

  • Shows that the Bloom filter method is suitable for large-scale linkage.

  • No difference in linkage quality between encrypted and unencrypted linkage.

  • Explores practical issues in conducting encrypted record linkage.

Abstract

Record linkage typically relies on dedicated linkage units, which are supplied with personally identifying information in order to identify records belonging to the same individual within and across datasets. Data custodians separate this personally identifying information from clinical information before release. While this substantially reduces the risk of disclosing sensitive information, some residual risk remains and is a concern for some custodians. In this paper we trial, on large real-world administrative data, a method of record linkage that reduces privacy risk still further. The method uses encrypted personally identifying information (Bloom filters) in a probability-based linkage framework. The privacy-preserving linkage method was tested on ten years of New South Wales (NSW) and Western Australian (WA) hospital admissions data, comprising over 26 million records in total. No difference in linkage quality was found when the results were compared with traditional probabilistic methods using full unencrypted personal identifiers. The method therefore offers a possible means of reducing the privacy risks of record linkage in population-level research studies. It is hoped that adaptations of this or similar privacy-preserving methods can reduce the risks of information disclosure so that the benefits of linked research can be fully realised.
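
As a rough illustration of the general idea behind Bloom filter encoding of identifiers, the following Python sketch hashes the character bigrams of a field into a fixed-length bit array using a secret key, then compares two encodings with the Dice coefficient. It is a minimal sketch under stated assumptions, not the implementation evaluated in the paper: the function names, the filter length (1000 bits), the number of hash functions (30), the choice of HMAC-SHA1 and HMAC-SHA256, and the double-hashing scheme are all illustrative choices that the abstract does not specify.

```python
import hashlib
import hmac


def bigrams(value):
    """Return the set of overlapping 2-character tokens of a padded, lower-cased string."""
    padded = f" {value.strip().lower()} "
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, secret_key, size=1000, num_hashes=30):
    """Return the set of bit positions switched on for the bigrams of `value`.

    `size` and `num_hashes` are assumed values for illustration only.
    Two keyed hashes are combined by double hashing to simulate
    `num_hashes` independent hash functions.
    """
    bits = set()
    for gram in bigrams(value):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(secret_key, data, hashlib.sha1).digest(), "big")
        h2 = int.from_bytes(hmac.new(secret_key, data, hashlib.sha256).digest(), "big")
        for i in range(num_hashes):
            bits.add((h1 + i * h2) % size)
    return bits


def dice(a, b):
    """Dice coefficient between two sets of bit positions (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    key = b"shared-secret"  # hypothetical key agreed between data custodians
    smith = bloom_encode("Smith", key)
    smyth = bloom_encode("Smyth", key)
    jones = bloom_encode("Jones", key)
    print("Smith vs Smyth:", round(dice(smith, smyth), 3))
    print("Smith vs Jones:", round(dice(smith, jones), 3))
```

Because similar strings share most of their bigrams, their encodings share most of their set bits, so a linkage unit can score approximate agreement on a field without seeing the underlying value. In a probability-based framework, a similarity score of this kind would stand in for the field comparison made on unencrypted identifiers.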

Keywords

Record linkage
Privacy preserving record linkage
Data integration
Bloom filters
Privacy preserving protocols
Population based research
