Two de-identification methods, k-anonymization and adding a "fuzzy factor," significantly reduced the risk of re-identification of patients in a dataset of 5 million patient records from a large cervical cancer screening program in Norway, according to results published in Cancer Epidemiology, Biomarkers & Prevention, a journal of the American Association for Cancer Research.
"Researchers typically get access to de-identified data, that is, data without any personal identifying information, such as names, addresses, and Social Security numbers. However, this may not be sufficient to protect the privacy of individuals participating in a research study," said Giske Ursin, MD, PhD, director of the Cancer Registry of Norway, Institute of Population-based Research.
Patient datasets often contain sensitive data, such as information about a person's health and disease diagnoses, that an individual may not want to share publicly, and data custodians are responsible for safeguarding such information, Ursin said. "People who have the permission to access such datasets have to abide by the laws and ethical guidelines, but there is always this concern that the data might fall into the wrong hands and be misused," she added. "As a data custodian, that's my worst nightmare."
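For readers unfamiliar with the two methods named in the summary, here is a minimal Python sketch of the general ideas. It is not the procedure used in the study: the field names (birth_year, region, screen_month), the decade-wide generalization, and the +/- 3 month shift are all assumptions made up for illustration. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records; a "fuzzy factor" is taken here to mean a small random offset added to a date-like value.

```python
import random
from collections import Counter

def smallest_group(records, quasi_identifiers):
    """Size of the rarest combination of quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k

def generalize_birth_year(record, width=10):
    """Coarsen birth_year to the start of its `width`-year bucket (hypothetical choice)."""
    out = dict(record)
    out["birth_year"] = (record["birth_year"] // width) * width
    return out

def fuzz_month(record, field, max_shift, rng):
    """Shift an integer month index by a random offset in [-max_shift, max_shift]."""
    out = dict(record)
    out[field] = record[field] + rng.randint(-max_shift, max_shift)
    return out

# Toy records; screen_month is a month index counted from an arbitrary start.
records = [
    {"birth_year": 1961, "region": "A", "screen_month": 100},
    {"birth_year": 1964, "region": "A", "screen_month": 130},
    {"birth_year": 1972, "region": "B", "screen_month": 90},
    {"birth_year": 1978, "region": "B", "screen_month": 95},
]
quasi = ["birth_year", "region"]

print(is_k_anonymous(records, quasi, 2))      # every raw record is unique
generalized = [generalize_birth_year(r) for r in records]
print(is_k_anonymous(generalized, quasi, 2))  # decades group records in pairs

rng = random.Random(0)  # seeded only so the sketch is repeatable
fuzzed = [fuzz_month(r, "screen_month", 3, rng) for r in generalized]
```

In the toy data, each raw record has a unique birth year and region, so the smallest group has size 1; after coarsening birth years to decades, each group has two members. The fuzzing step trades a small amount of date precision for a lower chance of matching a record against outside information.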
http://www.aacr.org/Newsroom/Pages/News-Release-Detail.aspx?ItemID=1074
Journal reference:
Giske Ursin, Sagar Sen, Jean-Marie Mottu and Mari Nygård, Protecting Privacy in Large Datasets—First We Assess the Risk; Then We Fuzzy the Data, Cancer Epidemiology, Biomarkers & Prevention, http://dx.doi.org/10.1158/1055-9965.EPI-17-0172
-- submitted from IRC
(Score: 1, Touché) by Anonymous Coward on Tuesday August 01 2017, @09:06PM
Woo-hoo! Gonna identity theft your horse!