
SoylentNews is people

posted by janrinok on Sunday November 12 2023, @10:42AM
from the target-marketing dept.

From The Electronic Frontier Foundation: Debunking the Myth of "Anonymous" Data

Personal information that corporations collect from our online behaviors sells for astonishing profits and incentivizes online actors to collect as much as possible. Every mouse click and screen swipe can be tracked and then sold to ad-tech companies and the data brokers that service them.

In an attempt to justify this pervasive surveillance ecosystem, corporations often claim to de-identify our data. This supposedly removes all personal information (such as a person's name) from the data point (such as the fact that an unnamed person bought a particular medicine at a particular time and place). Personal data can also be aggregated, whereby data about multiple people is combined with the intention of removing personal identifying information and thereby protecting user privacy.

...

However, in practice, any attempt at de-identification requires removal not only of your identifiable information, but also of information that can identify you when considered in combination with other information known about you. Here's an example:

  • First, think about the number of people that share your specific ZIP or postal code.
  • Next, think about how many of those people also share your birthday.
  • Now, think about how many people share your exact birthday, ZIP code, and gender.

According to one landmark study, these three characteristics are enough to uniquely identify 87% of the U.S. population. A different study showed that 63% of the U.S. population can be uniquely identified from these three facts.
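To make the ZIP code, birthday, and gender example concrete, here is a minimal Python sketch that measures how many records in a dataset are unique on a chosen set of attributes. The records, the field names, and the `unique_fraction` helper are hypothetical and made up for illustration; they do not come from the EFF article or the studies it cites.

```python
from collections import Counter


def unique_fraction(records, keys):
    """Return the fraction of records whose combination of `keys` occurs exactly once."""
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical "de-identified" records: no names, only three attributes.
people = [
    {"zip": "90210", "birthdate": "1980-03-14", "gender": "F"},
    {"zip": "90210", "birthdate": "1980-03-14", "gender": "M"},
    {"zip": "90210", "birthdate": "1975-11-02", "gender": "F"},
    {"zip": "10001", "birthdate": "1980-03-14", "gender": "F"},
    {"zip": "10001", "birthdate": "1992-07-30", "gender": "M"},
    {"zip": "10001", "birthdate": "1992-07-30", "gender": "M"},
]

# Each added attribute shrinks the crowd a person can hide in.
for keys in (["zip"], ["zip", "birthdate"], ["zip", "birthdate", "gender"]):
    print(keys, round(unique_fraction(people, keys), 3))
```

On this toy data nobody is unique by ZIP code alone, but most records become unique once birthdate and gender are added. That is the same effect the studies describe at population scale.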

We cannot trust corporations to self-regulate. The financial benefit and business usefulness of our personal data often outweighs our privacy and anonymity. In re-obtaining the real identity of the person involved (direct identifier) alongside a person's preferences (indirect identifier), corporations are able to continue profiting from our most sensitive information. For instance, a website that asks supposedly "anonymous" users for seemingly trivial information about themselves may be able to use that information to make a unique profile for an individual.



Related Stories

The Internet Enabled Mass Surveillance. A.I. Will Enable Mass Spying 30 comments

Spying has always been limited by the need for human labor. A.I. is going to change that:

Spying and surveillance are different but related things. If I hired a private detective to spy on you, that detective could hide a bug in your home or car, tap your phone, and listen to what you said. At the end, I would get a report of all the conversations you had and the contents of those conversations. If I hired that same private detective to put you under surveillance, I would get a different report: where you went, whom you talked to, what you purchased, what you did.

Before the internet, putting someone under surveillance was expensive and time-consuming. You had to manually follow someone around, noting where they went, whom they talked to, what they purchased, what they did, and what they read. That world is forever gone. Our phones track our locations. Credit cards track our purchases. Apps track whom we talk to, and e-readers know what we read. Computers collect data about what we're doing on them, and as both storage and processing have become cheaper, that data is increasingly saved and used. What was manual and individual has become bulk and mass. Surveillance has become the business model of the internet, and there's no reasonable way for us to opt out of it.

Spying is another matter. It has long been possible to tap someone's phone or put a bug in their home and/or car, but those things still require someone to listen to and make sense of the conversations. Yes, spyware companies like NSO Group help the government hack into people's phones, but someone still has to sort through all the conversations. And governments like China could censor social media posts based on particular words or phrases, but that was coarse and easy to bypass. Spying is limited by the need for human labor.

A.I. is about to change that.

[...] We could limit this capability. We could prohibit mass spying. We could pass strong data-privacy rules. But we haven't done anything to limit mass surveillance. Why would spying be any different?


This discussion was created by janrinok (52) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 5, Insightful) by Runaway1956 on Sunday November 12 2023, @11:16AM

    by Runaway1956 (2926) on Sunday November 12 2023, @11:16AM (#1332592)

    Big Tech is not entitled to your data, my data, anyone's data. Sure, there's talk about protecting the children - that is insufficient. Politicians ensure that they have special protections. That's a joke.

    Simply outlaw the data collection, outlaw the data brokers, outlaw the buying and selling completely. There used to be laws against peeping Toms. Today, Tom sits in your pocket, or on your desk, peering out at you, all day, every day.

    Just put the parasites out of business, end of story.

  • (Score: 5, Insightful) by Rosco P. Coltrane on Sunday November 12 2023, @11:42AM

    by Rosco P. Coltrane (4757) on Sunday November 12 2023, @11:42AM (#1332595)

    but some should be if it wasn't forcibly extracted.

    People leak data all the time. And putting the pieces together is arguably fair game.

    Think of it as you leaving your fingerprints on everything you touch: you can't prevent someone from coming and lifting your fingerprints if they want to (not talking about what they do with them later though, just the lifting). Your fingerprints are out there. Those who want to collect fingerprints can almost "passively" collect them after you've been somewhere.

    However, you don't involuntarily leak other data all the time. Like your sexual preferences or your political views. If you don't actively broadcast the information, including stepping into a sex shop and running into someone you know, or picketing in the street, then nobody can - or should - know.

    Data collection of information you don't naturally leak all the time takes a deliberate effort to collect. Information that can't be obtained passively and that requires an effort to collect is what's problematic.

    In the case of Big Data, the deliberate effort to collect information you don't naturally give away comes in the form of setting up a very successful search engine, putting trackers all over the internet that you can't avoid, running intrusive scripts on your computer... The key thing here is they actively seek out the information: if they quit gaming the internet to collect it, the stream of information would stop. Because again, that's information that you don't normally passively leak out.

    That's where I draw the line personally. Any information obtained as a result of a concerted and deliberate effort to pry it out of people without their consent or their knowledge, when those people wouldn't normally give away the information, should be regarded as a violation of privacy and criminalized.

    But of course, in a society where lawmakers are bought and sold on the marketplace, this is never going to happen...

  • (Score: 2) by RedGreen on Sunday November 12 2023, @12:17PM

    by RedGreen (888) on Sunday November 12 2023, @12:17PM (#1332596)

    How dare the EFF suggest those kind loving pillars of our society are no good slimy bastards only out for the cash. That the parasite corporations and the freaks that run them have no concern for their customers other than seeing them as marks to be exploited for any and all cash they can get from them by whatever means they can.

    --
    "I modded down, down, down, and the flames went higher." -- Sven Olsen
  • (Score: 4, Insightful) by Opportunist on Sunday November 12 2023, @12:41PM (3 children)

    by Opportunist (5545) on Sunday November 12 2023, @12:41PM (#1332599)

    So replace it with bogus data.

    Fill everything you are presented with with crap. My zip code? Anything but mine. My interests? Anything but mine.

    Poison the data well with crap.

    • (Score: 5, Interesting) by bzipitidoo on Monday November 13 2023, @03:12AM

      by bzipitidoo (4388) on Monday November 13 2023, @03:12AM (#1332653)

      Some caveats. In a lot of cases completely bogus data doesn't work. You can't use a non-existent zip code. Typically, websites that demand zip code info check that. You have to use a real zip code. Same with a street address. At first, I just made something up, something plausible and easy to remember, such as 777 7th street, but when the site actually checked that address and couldn't find it, I had to do something else. I ended up giving it a real address, to a car dealership in a different city. It may be only a matter of time before the software says, in effect, "hey, you liar, that's not a residence!" Or, "John Smith does not live at that address!"

      I have also tried the fake credit card number to get around those sites that advertise "free" trial, but demand a credit card # before you can utilize their so-called free trial. A completely fake number will be rejected immediately for not satisfying whatever validity checking algorithm credit card companies have built into the number. There are sites that will generate valid CC#'s, but these too fail when the numbers can't be matched up with an existing account. In such cases, I walk. I will not give out my real CC# for a "free" trial.
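The validity check mentioned above is, for most payment cards, the Luhn checksum. A minimal Python sketch follows; the `luhn_valid` helper name is an assumption for illustration. Passing the check only means the digits are internally consistent. It says nothing about whether an account exists, which matches the experience described in the comment.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if len(digits) < 2:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as summing the two digits of the product
        total += d
    return total % 10 == 0


# A well-known test number passes; changing its last digit breaks the checksum.
print(luhn_valid("4111 1111 1111 1111"))
print(luhn_valid("4111 1111 1111 1112"))
```

A site can run this check in the browser before contacting any payment processor, which is why a completely made-up number is rejected immediately.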

      Then there's the crap about providing a phone number to which a code can be texted. They disingenuously claim it's for security, but I know they care more about harvesting your data. There are web sites that can receive texts at their phone numbers. I've never had any luck getting that to work. The originating system invariably has some sort of blacklist, and equally invariably, those numbers are on it. You have to use a burner phone to stay anonymous.

      Then there's sites such as bugmenot [bugmenot.com]. Most of their logins do not work. I have instead resorted to deleting cookies to get around their treacherous use of them to shut you down after you've read whatever number of articles they set as the limit for a free trial. If that doesn't work either, I simply stop visiting those sites.

    • (Score: 2) by Tork on Monday November 13 2023, @03:43AM

      by Tork (3914) on Monday November 13 2023, @03:43AM (#1332659)
      This seems like a good application for AI. I'd love to spawn ten apps that convincingly pretend to be users on my network and browse the web with their own interests. Heck, I wouldn't even care if it posted on the green site. These days it might even improve the SNR.
      --
      🏳️‍🌈 Proud Ally 🏳️‍🌈
    • (Score: 2) by VLM on Monday November 13 2023, @05:07PM

      by VLM (445) on Monday November 13 2023, @05:07PM (#1332743)

      Poison the data well with crap.

      Sometimes you're not allowed to falsify your demographic information.

      A workplace classic is the old "anonymous survey" where they break down the results of a dozen person department into "100% of 37 year olds who've worked here between 2 and 3 years responded to the question with the following answer". With only a dozen person department that's deanonymized down to one person, not a population or group.

      Nowadays they don't even bother. Whenever you see "please don't forward this survey because your link is personalized for you" that means they already have a file on you and will link your answers directly to it.
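The "anonymous survey" breakdown described above can be sketched in Python. The idea, borrowed from k-anonymity, is to refuse to report any demographic cell with fewer than `k` respondents. The data, field names, and the `safe_breakdown` helper are hypothetical, and this is only a rough sketch of one possible mitigation, not a description of how any real survey tool works.

```python
from collections import Counter


def safe_breakdown(responses, keys, k=5):
    """Tally answers per demographic cell, merging cells smaller than k into one bucket."""
    sizes = Counter(tuple(r[key] for key in keys) for r in responses)
    report = {}
    for r in responses:
        cell = tuple(r[key] for key in keys)
        if sizes[cell] < k:
            # Too few people share these attributes; do not report them separately.
            cell = ("suppressed",)
        report.setdefault(cell, Counter())[r["answer"]] += 1
    return report


# Hypothetical small department.
department = [
    {"age": 37, "tenure": "2-3", "answer": "disagree"},
    {"age": 52, "tenure": "10+", "answer": "agree"},
    {"age": 29, "tenure": "0-1", "answer": "agree"},
    {"age": 29, "tenure": "0-1", "answer": "agree"},
    {"age": 29, "tenure": "0-1", "answer": "disagree"},
]

for cell, answers in safe_breakdown(department, ["age", "tenure"], k=3).items():
    print(cell, dict(answers))
```

Without the threshold, the lone 37-year-old with two to three years of tenure gets a row of their own. Even with it, a suppressed bucket that holds only one or two people can still leak, so a real tool would need further safeguards.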

  • (Score: 4, Insightful) by BsAtHome on Sunday November 12 2023, @12:45PM

    by BsAtHome (889) on Sunday November 12 2023, @12:45PM (#1332600)

    Unfortunately, most people either don't care about or don't understand the problem. Indifference and ignorance are very huge hurdles.

    In this light, regulation is all the more important because the rulers in a representative democracy are supposed to decide for the good, health and wealth of the commons. It is therefore very unfortunate that we only see the best regulation money can buy, instead of the best regulation the commons requires.

  • (Score: 5, Informative) by pTamok on Sunday November 12 2023, @03:04PM (5 children)

    by pTamok (3042) on Sunday November 12 2023, @03:04PM (#1332605)

    The GDPR is very clear about what constitutes personal data, but I suspect a lot of people misinterpret it, either through ignorance, or through wilful misinterpretation.

    Unfortunately, the EU publishes its legal texts in ways that make them difficult to quickly get an overview of, but the official text, in English, is here:

    REGULATION (EU) 2016/679 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation) [europa.eu]

    There are other sites that have easier URLs:

    1) An EU site: European Commission: What is personal data? [europa.eu]

    2) An EU funded site, set up by Proton AG: Complete guide to GDPR compliance: General Data Protection Regulation (GDPR): Article 4 : Definitions

    3) An independent site: Intersoft Consulting: Article 4 Definitions [gdpr-info.eu]

    The EU site gives a comprehensive and detailed answer with links to legislation. It's not just GDPR Article 4.

    But if you take GDPR Article 4, the definition of personal data is given as:

    For the purposes of this Regulation:

    (1)‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;

    (2)‘processing’ means any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means, such as collection, recording, organisation, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction;

    (3)‘restriction of processing’ means the marking of stored personal data with the aim of limiting their processing in the future;

    (4)‘profiling’ means any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements;

    (5)‘pseudonymisation’ means the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information, provided that such additional information is kept separately and is subject to technical and organisational measures to ensure that the personal data are not attributed to an identified or identifiable natural person;

    (6)‘filing system’ means any structured set of personal data which are accessible according to specific criteria, whether centralised, decentralised or dispersed on a functional or geographical basis;

    I've highlighted 'or indirectly'

    The EU website clarifies further:

    Personal data is any information that relates to an identified or identifiable living individual. Different pieces of information, which collected together can lead to the identification of a particular person, also constitute personal data.

    Personal data that has been de-identified, encrypted or pseudonymised but can be used to re-identify a person remains personal data and falls within the scope of the GDPR.

    Personal data that has been rendered anonymous in such a way that the individual is not or no longer identifiable is no longer considered personal data. For data to be truly anonymised, the anonymisation must be irreversible.

    The GDPR protects personal data regardless of the technology used for processing that data – it’s technology neutral and applies to both automated and manual processing, provided the data is organised in accordance with pre-defined criteria (for example alphabetical order). It also doesn’t matter how the data is stored – in an IT system, through video surveillance, or on paper; in all cases, personal data is subject to the protection requirements set out in the GDPR.

    I've highlighted a few points.

    If a person can be identified by combining different sets of records, those records constitute personal data.

    It is clear; and ignored for convenience by huge numbers of people and organisations, because following the GDPR properly is hard.
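The pseudonymisation definition quoted above, and the point that pseudonymised data remains personal data, can be illustrated with a short Python sketch. Replacing an identifier with a keyed hash (HMAC) is one common technique; the GDPR does not prescribe it, and the key, field names, and helper functions here are all hypothetical.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash; the key is the 'additional information'."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(candidate: str, rows, secret_key: bytes):
    """Anyone holding the key can link a known identifier back to its records."""
    token = pseudonymise(candidate, secret_key)
    return [row["purchase"] for row in rows if row["subject"] == token]


# Hypothetical key; under Article 4(5) it must be kept separately from the data.
SECRET_KEY = b"hypothetical-key-stored-separately"

records = [
    {"email": "alice@example.com", "purchase": "medicine A"},
    {"email": "alice@example.com", "purchase": "medicine B"},
    {"email": "bob@example.com", "purchase": "book"},
]

pseudonymised = [
    {"subject": pseudonymise(r["email"], SECRET_KEY), "purchase": r["purchase"]}
    for r in records
]

# The e-mail address is gone, but the same person still maps to the same token.
print(reidentify("alice@example.com", pseudonymised, SECRET_KEY))
```

Because the mapping can be reversed by whoever holds the key, and because all of one person's rows still share a token, the output is pseudonymised rather than anonymised in the sense of the EU text quoted above.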

    • (Score: 3, Informative) by Runaway1956 on Sunday November 12 2023, @03:46PM (2 children)

      by Runaway1956 (2926) on Sunday November 12 2023, @03:46PM (#1332608)

      It may be worth remembering IBM's contribution to the holocaust. They made it possible to categorize and deanonymize and track millions of people. The Nazis were truly grateful for that contribution.

      • (Score: 5, Insightful) by pTamok on Sunday November 12 2023, @04:06PM

        by pTamok (3042) on Sunday November 12 2023, @04:06PM (#1332611)

        There were many contributors, both witting and unwitting.

        The pre-war Dutch government helped, by keeping good records of the religion of people living in the Netherlands. Was it necessary? Who knows. But it allowed the German invading force to quickly single out that sector of the population. The Dutch resistance tried to destroy records [wikipedia.org].

        It's a good example of what happens when you allow a benign government to keep apparently benign records. You never know when a regime might change, and innocuous behaviour before the change becomes a liability. Anyone with a university degree was targeted in Cambodia when Pol Pot achieved power [wikipedia.org].

        A good rule of thumb is to collect as little data as possible to do what you need, and destroy it as soon as possible afterwards. Having data hang around is a liability. Only collect what is necessary, and keep it for as short a time as possible.

        Meanwhile, modern practices appear to be 'collect it all'; generate a central ID database linked to all your government records; keep for as long as possible.

        What could possibly go wrong?

        The point is not whether you trust the current data collectors to 'do no evil', but what about the possible future inheritors of that data, who you don't know. If someone wanted to use it in the least benign way possible, would you be worried?

      • (Score: 5, Interesting) by pTamok on Sunday November 12 2023, @07:35PM

        by pTamok (3042) on Sunday November 12 2023, @07:35PM (#1332621)

        Oh, and while I am at it.

        The Nazis. Or, to give the full name of the political party the Nationalsozialistische Deutsche Arbeiterpartei (NSDAP - The National Socialist German Worker's Party). In the free-ish* elections of July 1932, they got 37.2% of the vote on a turnout of 84.1% of the electorate. The Nazis were not a tiny minority - it's 31% of the electorate. Note that President Trump, in the 2016 Presidential elections, got the vote of 27.3% of the electorate.

        If, as a German, you think the Nazis were bad for Germany, you can see that voting for them, even if holding your nose 'for want of a better candidate' didn't necessarily give you the result you wanted. It's clear that voting counts, unless you want decisions to be made by a minority you didn't vote for, and don't necessarily agree with; and it is a good idea to vote for candidates that aren't simply popular demagogues. Not voting isn't 'sending a message' - it's giving power to people you actively disagree with. Use your vote wisely. Please.

        *There was a fair amount of voter intimidation.

    • (Score: 4, Informative) by captain normal on Sunday November 12 2023, @09:27PM (1 child)

      by captain normal (2205) on Sunday November 12 2023, @09:27PM (#1332632)

      ".. following the GDPR properly is hard."

      Are you talking about hard for the common user who has to click through a custom cookie banner before a site will load properly? Or are you talking about hard for the web designers, ad trolls and ISPs trying to load up the common person's device with third-party cookies, tracking cookies, supercookies, zombie cookies and Flash cookies in order to hide from the likes of the EU cookie law, the PECR, CCPA and the LGPD?
      It's all really as simple as outlawing any cookie other than a cookie that identifies an individual only on a site that they have signed up for.

      --
      "Everyone is entitled to his own opinion, but not to his own facts" --Daniel Patrick Moynihan
  • (Score: 2) by VLM on Monday November 13 2023, @05:19PM

    by VLM (445) on Monday November 13 2023, @05:19PM (#1332746)

    sells for astonishing profits and incentivizes online actors to collect as much as possible

    "Everybody knows" this is true.

    IRL they shut down Google Reader or whatever that RSS feed thing was, because a curated list of what websites people visit was apparently unprofitable.

    Usually when you see anything "everyone knows is true" but there's no actual evidence or at best some isolated anecdotes, I start to wonder.

    "Everyone knows" enormous amounts of money were made by gathering and storing the fact that I posted in this thread, but no one seems to know who's making this huge amount of money or how much they made other than "well the big bad bogeyman is rich now because of it".

    Yeah, I dunno; press "F" to doubt.

    Pretty scary when you realize a lot of health, nutrition, and fitness advice, even from medical "authorities", comes from the same 'everyone knows' source or sometimes it's even worse, it's in direct opposition to published recent medical research or was based on propaganda.
