
SoylentNews is people

posted by Fnord666 on Tuesday August 01 2017, @07:11PM
from the who's-on-first dept.

Two de-identification methods, k-anonymization and adding a "fuzzy factor," significantly reduced the risk of re-identification of patients in a dataset of 5 million patient records from a large cervical cancer screening program in Norway, according to results published in Cancer Epidemiology, Biomarkers & Prevention, a journal of the American Association for Cancer Research.
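For readers unfamiliar with the two terms, here is a minimal sketch of the general ideas, not the study's actual method. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The "fuzzy factor" is assumed here to mean a small random shift applied to a date field; the summary above does not say how the authors defined it. The field names (birth_year, municipality), the month-index representation, and the shift range are all hypothetical.

```python
import random
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest set of records sharing the same quasi-identifier values."""
    if not records:
        return 0
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k


def fuzz_month(month_index, max_shift, rng):
    """Shift a month index by a random whole number in [-max_shift, max_shift].

    Assumption: this is only one plausible reading of "fuzzy factor".
    """
    return month_index + rng.randint(-max_shift, max_shift)


if __name__ == "__main__":
    # Hypothetical toy records; real screening data would have many more fields.
    records = [
        {"birth_year": 1970, "municipality": "A", "screen_month": 100},
        {"birth_year": 1970, "municipality": "A", "screen_month": 112},
        {"birth_year": 1980, "municipality": "B", "screen_month": 105},
    ]
    print(is_k_anonymous(records, ["birth_year", "municipality"], 2))
    rng = random.Random()
    print([fuzz_month(r["screen_month"], 4, rng) for r in records])
```

In this toy data the 1980/B record is alone in its group, so anyone who knows a patient's birth year and municipality could single it out. That is the kind of re-identification risk the article is talking about.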

"Researchers typically get access to de-identified data, that is, data without any personal identifying information, such as names, addresses, and Social Security numbers. However, this may not be sufficient to protect the privacy of individuals participating in a research study," said Giske Ursin, MD, PhD, director of Cancer Registry of Norway, Institute of Population-based Research.

Patient datasets often have sensitive data, such as information about a person's health and disease diagnosis that an individual may not want to share publicly, and data custodians are responsible for safeguarding such information, Ursin added. "People who have the permission to access such datasets have to abide by the laws and ethical guidelines, but there is always this concern that the data might fall into the wrong hands and be misused," she added. "As a data custodian, that's my worst nightmare."

http://www.aacr.org/Newsroom/Pages/News-Release-Detail.aspx?ItemID=1074

Journal reference:
Giske Ursin, Sagar Sen, Jean-Marie Mottu and Mari Nygård, Protecting Privacy in Large Datasets—First We Assess the Risk; Then We Fuzzy the Data, Cancer Epidemiology, Biomarkers & Prevention, http://dx.doi.org/10.1158/1055-9965.EPI-17-0172

-- submitted from IRC


  • (Score: 3, Interesting) by Runaway1956 (2926) on Wednesday August 02 2017, @12:04AM (#547789)

    Why is it necessary to put all that identifying information into the database to start with? Your family doctor can treat you for whatever ails you, taking all of your information. Insurance, address, etc., ad nauseam. All those fields on his forms should just be flagged, so that those data bits never leave his office. If the data is never input into the database, the database can't leak the data.

    Of course, it becomes a minor issue to determine what must and must not be included in the data. Age is pertinent to many medical research projects. Ethnic background is important to some others. Medical people often demand information that is probably irrelevant to a lot of research, such as place of birth, number of siblings, and more. Being a twin/trip/octo MIGHT be important to some research, but that bit of data need not be available to the entire world of medical personnel.

    Clean up the input, and the output will require a lot less attention for "security".

  • (Score: 0) by Anonymous Coward on Wednesday August 02 2017, @03:13PM (#547950)

    Ah, just stick it in "The Cloud". Hey, ask IBM, like Sweden (Norway's neighbour) did. Certainly it will be fine, and secure!