From The Electronic Frontier Foundation: Debunking the Myth of "Anonymous" Data
Personal information that corporations collect from our online behaviors sells for astonishing profits and incentivizes online actors to collect as much as possible. Every mouse click and screen swipe can be tracked and then sold to ad-tech companies and the data brokers that service them.
In an attempt to justify this pervasive surveillance ecosystem, corporations often claim to de-identify our data. This supposedly removes all personal information (such as a person's name) from the data point (such as the fact that an unnamed person bought a particular medicine at a particular time and place). Personal data can also be aggregated, whereby data about multiple people is combined with the intention of removing personal identifying information and thereby protecting user privacy.
...
However, in practice, any attempt at de-identification requires removal not only of your identifiable information, but also of information that can identify you when considered in combination with other information known about you. Here's an example:
- First, think about the number of people that share your specific ZIP or postal code.
- Next, think about how many of those people also share your birthday.
- Now, think about how many people share your exact birthday, ZIP code, and gender.
According to one landmark study, these three characteristics are enough to uniquely identify 87% of the U.S. population. A different study showed that 63% of the U.S. population can be uniquely identified from these three facts.
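As a rough illustration of the ZIP/birthday/gender exercise above, here is a minimal Python sketch that measures what share of records become unique as more fields are combined. The `unique_fraction` helper and the `people` records are hypothetical, invented for this example; they are not taken from either study, and the toy numbers say nothing about the real U.S. population.

```python
from collections import Counter

def unique_fraction(records, keys):
    """Return the share of records whose combination of `keys` appears exactly once."""
    if not records:
        return 0.0
    # Build the combination of field values for each record.
    combos = [tuple(r[k] for k in keys) for r in records]
    counts = Counter(combos)
    # A record is "unique" if nobody else shares its combination.
    unique = sum(1 for c in combos if counts[c] == 1)
    return unique / len(records)

# Made-up toy records, for illustration only.
people = [
    {"zip": "94110", "birthdate": "1985-03-14", "gender": "F"},
    {"zip": "94110", "birthdate": "1985-03-14", "gender": "M"},
    {"zip": "94110", "birthdate": "1990-07-02", "gender": "F"},
    {"zip": "94110", "birthdate": "1990-07-02", "gender": "F"},
    {"zip": "10001", "birthdate": "1985-03-14", "gender": "F"},
]

for keys in (["zip"], ["zip", "birthdate"], ["zip", "birthdate", "gender"]):
    print(keys, unique_fraction(people, keys))
```

In this toy set, adding gender to ZIP and birthdate raises the unique share from one record in five to three in five, even though no name appears anywhere in the data.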
We cannot trust corporations to self-regulate. The financial benefit and business usefulness of our personal data often outweigh our privacy and anonymity. In re-obtaining the real identity of the person involved (direct identifier) alongside a person's preferences (indirect identifier), corporations are able to continue profiting from our most sensitive information. For instance, a website that asks supposedly "anonymous" users for seemingly trivial information about themselves may be able to use that information to make a unique profile for an individual.
(Score: 5, Insightful) by pTamok on Sunday November 12 2023, @04:06PM
There were many contributors, both witting and unwitting.
The pre-war Dutch government helped, by keeping good records of the religion of people living in the Netherlands. Was it necessary? Who knows. But it allowed the German invading force to quickly single out that sector of the population. The Dutch resistance tried to destroy records [wikipedia.org].
It's a good example of what happens when you allow a benign government to keep apparently benign records. You never know when a regime might change, and innocuous behaviour before the change becomes a liability. Anyone with a university degree was targeted in Cambodia when Pol Pot achieved power [wikipedia.org].
A good rule of thumb is to collect as little data as possible to do what you need, and destroy it as soon as possible afterwards. Having data hang around is a liability. Only collect what is necessary, and keep it for as short a time as possible.
Meanwhile, modern practices appear to be 'collect it all'; generate a central ID database linked to all your government records; keep for as long as possible.
What could possibly go wrong?
The point is not whether you trust the current data collectors to 'do no evil', but what about the possible future inheritors of that data, who you don't know. If someone wanted to use it in the least benign way possible, would you be worried?