Some top 100,000 websites collect everything you type:
When you sign up for a newsletter, make a hotel reservation, or check out online, you probably take for granted that if you mistype your email address three times or change your mind and X out of the page, it doesn't matter. Nothing actually happens until you hit the Submit button, right? Well, maybe not. As with so many assumptions about the web, this isn't always the case, according to new research: A surprising number of websites are collecting some or all of your data as you type it into a digital form.
Researchers from KU Leuven, Radboud University, and University of Lausanne crawled and analyzed the top 100,000 websites, looking at scenarios in which a user is visiting a site while in the European Union and visiting a site from the United States. They found that 1,844 websites gathered an EU user's email address without their consent, and a staggering 2,950 logged a US user's email in some form. Many of the sites seemingly do not intend to conduct the data-logging but incorporate third-party marketing and analytics services that cause the behavior.
[...] "If there's a Submit button on a form, the reasonable expectation is that it does something—that it will submit your data when you click it," says Güneş Acar, a professor and researcher in Radboud University's digital security group and one of the leaders of the study. "We were super surprised by these results. We thought maybe we were going to find a few hundred websites where your email is collected before you submit, but this exceeded our expectations by far."
[...] "In some cases, when you click the next field, they collect the previous one, like you click the password field and they collect the email, or you just click anywhere and they collect all the information immediately," says Asuman Senol, a privacy and identity researcher at KU Leuven and one of the study co-authors. "We didn't expect to find thousands of websites; and in the US, the numbers are really high, which is interesting."
[...] Since the findings indicate that deleting data in a form before submitting it may not be enough to protect yourself from all collection, the researchers created a Firefox extension called LeakInspector to detect rogue form collection. And they say they hope their findings will raise awareness about the issue, not only for regular web users but for website developers and administrators who can proactively check whether their own systems or any of the third parties they're using are collecting data from forms without consent.
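The behaviour the researchers describe (a field's value being captured when you click away from it, so that deleting it later does not help) can be illustrated with a toy simulation. This is only a sketch under stated assumptions: `FormSession` and its methods are hypothetical names invented for illustration, and the "capture on blur" rule is a simplified stand-in for what a third-party script might do, not the actual code of any tracker or of LeakInspector.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FormSession:
    """Toy model of a web form watched by a hypothetical third-party script."""
    fields: dict = field(default_factory=dict)   # current contents of each field
    leaked: list = field(default_factory=list)   # (field, value) pairs sent before submit
    submitted: Optional[dict] = None             # what the user knowingly sent, if anything

    def type_into(self, name: str, value: str) -> None:
        # The user types (or pastes) a value into a field.
        self.fields[name] = value

    def blur(self, name: str) -> None:
        # Assumed tracker behaviour: when focus leaves a field,
        # its current non-empty value is sent off immediately.
        value = self.fields.get(name, "")
        if value:
            self.leaked.append((name, value))

    def clear(self, name: str) -> None:
        # The user deletes what they typed.
        self.fields[name] = ""

    def submit(self) -> None:
        # The only point at which the user expects data to leave the page.
        self.submitted = dict(self.fields)


session = FormSession()
session.type_into("email", "alice@example.com")
session.blur("email")    # user clicks into the password field
session.clear("email")   # user changes their mind and deletes the address
# The user closes the page without ever calling submit().

print("leaked before submit:", session.leaked)
print("knowingly submitted:", session.submitted)
```

In this model the address sits in `session.leaked` even though the field is empty again and `submit()` was never called, which is the gap between user expectation and actual behaviour that the study highlights.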
This story originally appeared on wired.com.
See Also:
Before You Hit 'Submit,' This Company Has Already Logged Your Personal Data:
Related Stories
Machine-learning systems require a huge number of correctly-labeled information samples to start getting good at prediction. What happens when the information is manipulated to poison the data?
For the past decade, artificial intelligence has been used to recognize faces, rate creditworthiness and predict the weather. At the same time, increasingly sophisticated hacks using stealthier methods have escalated. The combination of AI and cybersecurity was inevitable as both fields sought better tools and new uses for their technology. But there's a massive problem that threatens to undermine these efforts and could allow adversaries to bypass digital defenses undetected.
The danger is data poisoning: manipulating the information used to train machines offers a virtually untraceable method to get around AI-powered defenses. Many companies may not be ready to deal with escalating challenges. The global market for AI cybersecurity is already expected to triple by 2028 to $35 billion. Security providers and their clients may have to patch together multiple strategies to keep threats at bay.
[...] In a presentation at the HITCon security conference in Taipei last year, researchers Cheng Shin-ming and Tseng Ming-huei showed that backdoor code could fully bypass defenses by poisoning less than 0.7% of the data submitted to the machine-learning system. Not only does it mean that only a few malicious samples are needed, but it indicates that a machine-learning system can be rendered vulnerable even if it uses only a small amount of unverified open-source data.
[...] To stay safe, companies need to ensure their data is clean, but that means training their systems with fewer examples than they'd get with open source offerings. In machine learning, sample size matters.
Perhaps poisoning is something users do intentionally in an attempt to keep themselves safe?
Originally spotted on The Eponymous Pickle.
Previously
How to Stealthily Poison Neural Network Chips in the Supply Chain
(Score: 5, Informative) by Anonymous Coward on Monday May 16 2022, @11:39AM (4 children)
There's no reason websites need to have scripts from 57 different vendors running. Install NoScript, block everything, unblock one by one until site content shows. Fuck you, cancerous advertising scumbags.
(Score: 4, Insightful) by maxwell demon on Monday May 16 2022, @04:47PM (2 children)
Doesn't help if the file containing the script causing the form to show also contains the code that sends the information. As soon as you get the form, you also get the spying.
I remember early (pre-Ajax) Firefox having the option to tell you each time you submitted information to a web site, and you having to confirm that. That functionality unfortunately is long gone (even for regular non-JS forms). I think it should instead have been extended to also warn about Ajax use and allow you to deny that, even when not disabling scripts completely.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 4, Informative) by RS3 on Monday May 16 2022, @06:30PM
Seeing the evil that javascript can do, I've railed against it for more than 20 years. My first posts on greensite, late '90s, were about bad things javascript is able to do, and how I found and started using Opera (OLD Opera) to disable javascript, per-site settings, etc. I still use OLD Opera daily, usually the first browser I'll use for a new site.
US Congress is much too slow, and too far out of touch with most citizens. We need much stronger privacy laws, and ones that don't allow broad-brush agreements like: "by using this website you agree to...".
In the meantime, "noscript", javascript off, per-site permissions, etc., aside, it'd be very helpful if we had a browser that would allow us to choose what- from us and our computers (phone is computer)- goes where. If that breaks a website, so be it.
I'm disappointed that this whole thing has been allowed to go this far. Thank you EU for being more proactive about privacy. Please be much tougher.
(Score: 4, Interesting) by FatPhil on Monday May 16 2022, @06:54PM
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 2) by stretch611 on Monday May 16 2022, @06:44PM
Instead of a js blocker, you can also use a pi-hole [pi-hole.net]. I checked and the website in question is already blocked with a default Pi-hole install.
Of course, I personally use both a js blocker AND a pi-hole.
Now with 5 covid vaccine shots/boosters altering my DNA :P
(Score: 5, Insightful) by Anonymous Coward on Monday May 16 2022, @11:49AM (1 child)
The only reason that branch of commerce exists is to fool and cheat users. It is a blight on society.
Even if some parasites grew fat, politically connected, and too-big-to-fail doing that, it is no reason to keep feeding us to them forever.
(Score: 4, Interesting) by bmimatt on Monday May 16 2022, @07:19PM
Perhaps we need a browser plugin that fills forms with garbage in a single click, and we collectively poison the data over time. Bonus points if emails are auto-filled with congresscritters' addresses.
(Score: 5, Insightful) by Mojibake Tengu on Monday May 16 2022, @12:41PM (5 children)
The most dangerous aspect of logging every key typed into a web form is that it becomes a trap for information leaks:
If a user mistakenly pastes something she previously copied from another app or website for another purpose, the information has already escaped, badly.
Deleting it from the form does not save the situation. Usernames, passwords, other contacts, data, URLs, anything can compromise a person or help to link an identity to the past or future. Happens all the time with Firefox on all common platforms. Over the years, it has happened to my friends several times: pasting a funny password into an IRC channel after mistaking it for a terminal, prompting lurkers into a crowd rush login event.
And there is no remedy to this except true opsec: Separation.
Logical separation: Use at least separate local logins or users for different activities on a single platform. Do not mix them. Do not rely on the browser to provide logical separation for you; it may fail, be broken, or be backdoored by design.
Physical separation: Use separate machines for different activities, with separate displays and keyboards, to prevent mistyping or bad copy pasting between terminals or desktops.
It costs. But it's definitely worth it, whatever social stratum you live in.
The worst strategy for internets is to use a single universal machine and user login for all activities.
Respect Authorities. Know your social status. Woke responsibly.
(Score: 0) by Anonymous Coward on Monday May 16 2022, @02:50PM
In an era of disgustingly cheap laptops, tablets, and phones, physical separation is doable.
(Score: 0) by Anonymous Coward on Monday May 16 2022, @06:40PM
Are you saying women are more apt to make such mistakes? Tch tch, how very unwoke of you!
(big grin?)
(Score: 3, Interesting) by FatPhil on Monday May 16 2022, @07:06PM (2 children)
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 0) by Anonymous Coward on Wednesday May 18 2022, @09:07AM (1 child)
What will you do when Google forces Manifest V3 down everyone's throats and uMatrix etc. stop working, and Cloudflare et al. block minor browsers such as Pale Moon, and older browser versions are blocked along with plugins to pretend to be a modern, locked-down, straitjacket-loving example of stupidity?
(Score: 2) by FatPhil on Wednesday May 18 2022, @07:03PM
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 2) by bzipitidoo on Monday May 16 2022, @12:59PM (3 children)
I just had to laugh at this. How quaint that people think data entered into a web site form stays local until the magic "submit" button is pressed. That may be true of comments here on SN, but it need not be. It's certainly not true of Disqus.
(Score: 4, Interesting) by SomeGuy on Monday May 16 2022, @01:13PM (1 child)
A long, long time ago, I used to be very mindful of what appeared in a web proxy log. I could open multiple news stories in different windows, and then read them without generating any additional traffic, so it looked like I only spent a minute browsing the web.
Except for this site, that isn't possible any more. Sites periodically load additional content, rotate in new ads, if you happen to move the mouse they load previews of "related material", wait until you scroll halfway down to pop up an intrusive full sound/motion video ad, you reach for the browser "X" button and it gives another popover "wahhh don't leave yet! First download our FREE malware!" Click. Disgusting.
(Score: 3, Insightful) by RS3 on Monday May 16 2022, @06:43PM
Almost all done through javascript. Although, more and more, style sheets (CSS) are doing annoying (scary) things too.
(Score: 2) by FatPhil on Monday May 16 2022, @07:11PM
To be honest, I was thinking "but if they're typing, they're voluntarily handing over data merely by that interaction", but if you look at the vids, I quite quickly started thinking "sheesh, that's really skuzzy behaviour - sites like that should be burnt with fire".
Great minds discuss ideas; average minds discuss events; small minds discuss people; the smallest discuss themselves
(Score: 3, Touché) by inertnet on Monday May 16 2022, @01:28PM (15 children)
That's nothing, Microsoft Windows collects everything you do on your computer *and* the internet.
(Score: 3, Funny) by Freeman on Monday May 16 2022, @02:13PM (2 children)
Please sir, may I have some more?!?
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 2) by RS3 on Monday May 16 2022, @06:53PM (1 child)
I'll bring you a bucket too.
(Score: 2) by RS3 on Tuesday May 17 2022, @12:12AM
(That was supposed to be a reference to Mr. Creosote.)
(Score: 3, Interesting) by stretch611 on Monday May 16 2022, @06:35PM (11 children)
Not if you don't use windows. (And I do not have a single windows box on my home network.)
Now with 5 covid vaccine shots/boosters altering my DNA :P
(Score: 2) by RS3 on Monday May 16 2022, @07:14PM (7 children)
Which distro do you mostly use?
(Score: 3, Informative) by stretch611 on Monday May 16 2022, @08:16PM (6 children)
Mostly, Mint.
I have a few pis using Raspbian.
I also have an old server using CentOS. (likely to be migrated to debian)
I do have a Steam Deck (SteamOS 3).
I might be looking to try out some other distros though for the desktops.
Now with 5 covid vaccine shots/boosters altering my DNA :P
(Score: 3, Interesting) by RS3 on Monday May 16 2022, @08:49PM (5 children)
Thank you. Years ago I loved and touted Mint, but as it got more complex, and the controversy over systemd, I've stayed away from it, and pretty much any systemd distro.
I'm still trying to figure out what I'm looking for, but it's mostly a more stable distro. Meaning, updates that don't greatly change / break things. I'm not sure if there is any such thing though.
I'm a longtime strong Slackware user. Best install I've ever had was 14.1. 14.2 broke too many things. I haven't tried the most recent yet, but anxious to.
Alpine is amazing for servers. I haven't tried it for desktop very recently, but GUI was clunky at best. They use a different library (musl) so that might cause problems for glibc pre-compiled and non-Alpine apps.
I admin some servers (real, live) that are running older CentOS 6. Although updates aren't available anymore (stupid), they're rock-solid stable. I rarely run them in GUI mode, and there's really no need to. I don't mind, and maybe like, some GUI admin tools, but the GUI and apps are terrible at best. I used to dislike Red Hat, but slowly learned things and was assimilated. Somewhat.
I need to try Devuan.
(Score: 3, Informative) by Freeman on Monday May 16 2022, @09:17PM (4 children)
My most recent forays into Linux have been with MXLinux. Seemed to work well for me. Only issue I had was some incompatibility with a game I like playing. The closer I get to needing to switch to Windows 11, the closer I get to biting the bullet and fully switching to Linux. Most probably that will be MXLinux.
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 2) by RS3 on Monday May 16 2022, @11:03PM (3 children)
Yes, thank you, that's the one I was trying to remember. An AC here highly recommended it. Distrowatch shows it having systemd, which I want to avoid, but I think you can choose to install a different init system.
All this said, using Linux does not stop javascript and WebAssembly.
I use, in windows, a little utility called "NetSpeedMonitor". It displays your network traffic in the "Task Bar". It can log traffic to an SQLite database if you want. I never run Windows without it. It allows me to see what's happening, including and especially when I'm not expecting any data transfer.
Then I can fire up SMSniff and see what's going where, although the "what" is usually encrypted garbage, but at least I can see where, and maybe block the IP address.
Of course, use lots of privacy plugins. It's no guarantee, but we have to do whatever we can.
(Score: 3, Informative) by stretch611 on Tuesday May 17 2022, @12:32AM (1 child)
I use the MATE desktop in Mint which is a fork of the old Gnome 2.
I keep the system monitor on the task bar to show me current CPU, Memory, and Network usage (sometimes disk i/o as well.) Every now and then I will see some network activity as well and wonder what is connecting and where. The system monitor does not tell me, but when I investigate, it generally is me not thinking (like Steam doing a download, Transmission running and I forgot about it, a local network web page refreshing, or even an NFS connection.) Not a great reflection on my mind... but at least reassuring that Linux is generally only doing what I want it to do.
Now with 5 covid vaccine shots/boosters altering my DNA :P
(Score: 2) by RS3 on Tuesday May 17 2022, @01:16AM
Thanks, mate. Sorry, I had to.
I don't think I've tried MATE, but possibly in a "live" boot session. I'll look into it.
Yes, I love the system monitor in Linux. Not sure which desktop / session manager / whatever is needed to run it, but it's usually available if not already there.
(Score: 3, Informative) by Freeman on Tuesday May 17 2022, @01:28PM
https://mxlinux.org/wiki/system/systemd/ [mxlinux.org] This page is 5 years old, but probably still accurate.
https://mxlinux.org/tag/systemd/ [mxlinux.org] (Nearly 3 year old post saying that they got a dev to help port systemd-shim to debian-buster. As MXLinux ships with both init systems.)
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 3, Informative) by inertnet on Monday May 16 2022, @09:55PM (1 child)
My main system runs Ubuntu (I don't want to spend a lot of time tweaking), but I often need to run a windows VM on it. My company workstation laptop runs windows and the kids use windows for hassle free gaming mostly.
(Score: 0) by Anonymous Coward on Wednesday May 18 2022, @09:22AM
Windows and "hassle free" do not belong in the same sentence
(Score: 0) by Anonymous Coward on Wednesday May 18 2022, @09:36AM
Citizen, a report of non-compliance has been issued for a device in your possession. You will submit this device for mandatory cloud threat detection scan and sync to move your data to an authorised global service. Your device will be imaged with an approved operating system and applications. Your first bill will be sent for cloud connectivity at the end of the month. Enjoy your repaired device.