
posted by Fnord666 on Thursday September 21 2017, @06:24AM   Printer-friendly
from the not-scrambled-enough dept.

Submitted via IRC for SoyCow5743

During last year's WWDC in June 2016, Apple said it would adopt differential privacy methods to protect privacy while the company mined user data on iOS and macOS. In short, the technique adds enough statistical noise to collected data to keep individual records from being identifiable -- though the company made clear at the time that its data collection is opt-in. Over a year later, a study claims that Apple's implementation falls short of the digital privacy community's expectations for how well users' data is kept private.
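For the curious, the noise-adding idea is usually illustrated in the literature with the Laplace mechanism. The sketch below is purely illustrative -- it is not Apple's actual implementation, which uses more elaborate local-privacy algorithms -- but it shows how the noise scale is tied to the privacy parameter:

```python
import random

def laplace_noise(scale):
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(true_count, epsilon, sensitivity=1.0):
    # Laplace mechanism: adding noise with scale sensitivity/epsilon to a
    # count query makes the released value epsilon-differentially private.
    return true_count + laplace_noise(sensitivity / epsilon)
```

The key trade-off: a smaller epsilon means larger noise and stronger privacy; a larger epsilon means the released value tracks the true one more closely.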

As they reveal in their study (PDF), researchers from the University of Southern California, Indiana University and China's Tsinghua University evaluated how Apple injects static into users' identifiable info, from their messages to their internet history, to baffle anyone looking at the data, from the government to Apple's own staff. The metric for measuring a setup's differential privacy effectiveness is called a "privacy loss parameter" or, as a variable, "epsilon." In this case, the researchers found that Apple's epsilon on macOS allowed far more personal data to be identifiable than digital privacy theorists are comfortable with, and that iOS 10 permits even more.
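For readers unfamiliar with the metric: epsilon bounds how much any one user's true value can shift the probability of any reported output, so the likelihood ratio between two possible inputs is at most e^epsilon. A small illustration using classic randomized response (the epsilon values here are generic, not figures from the study):

```python
import math

def truthful_prob(epsilon):
    # Randomized response: report the true bit with this probability and
    # the flipped bit otherwise; the ratio between the two reporting
    # probabilities is then exactly e^epsilon.
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))

# At epsilon = 1, answers retain plausible deniability (~73% truthful).
# At a double-digit epsilon, the true bit is reported essentially always,
# which is why theorists consider large epsilons to offer little privacy.
```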

Apple has disputed the study's findings, especially regarding its alleged ability to link data to particular users.

Source: https://www.engadget.com/2017/09/15/study-says-apple-data-mining-safeguards-dont-protect-privacy-en/


Original Submission

 
  • (Score: 3, Informative) by TheRaven on Thursday September 21 2017, @01:24PM

    by TheRaven (270) on Thursday September 21 2017, @01:24PM (#571136) Journal

    I don't know that specific paper, but in general the problem with these approaches is that they're either 5+ orders of magnitude slower than normal computation (i.e. if you can only just do it on a laptop without the privacy machinery, you can't do it at all with it), or they have O(N^M) memory usage, where N is the number of data elements and M is the number of primitive operations you want to support. For example, if you want to support adding a known constant, that's one operation. If you want to support multiplying by a constant, you can't do it by repeated addition, because you would need to know both values, so you need another primitive operation for that, and so on. The space requirements are generally prohibitive.
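    A back-of-the-envelope illustration of the O(N^M) point above, with hypothetical numbers, just to show why the space cost becomes prohibitive:

```python
def lookup_table_cells(n_elements, n_primitive_ops):
    # If each supported primitive operation multiplies the precomputed
    # table by another factor of N, storage grows as N ** M.
    return n_elements ** n_primitive_ops

# A million data elements with just three primitive operations already
# needs 10**18 table cells -- exabyte scale even at one byte per cell.
```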

    Most of what people do in the real world involves throwing away data that they believe is identifying and then only using the aggregates, but that's been repeatedly shown to be flawed.

    --
    sudo mod me up