Date: 2020-05-22
Machine-learning systems require a huge number of correctly labeled information samples to start getting good at prediction. What happens when the information is manipulated to poison the data?
For the past decade, artificial intelligence has been used to recognize faces, rate creditworthiness and predict the weather. At the same time, hacks have grown more sophisticated and stealthier. The combination of AI and cybersecurity was inevitable as both fields sought better tools and new uses for their technology. But there's a massive problem that threatens to undermine these efforts and could allow adversaries to bypass digital defenses undetected.
The danger is data poisoning: manipulating the information used to train machines offers a virtually untraceable method to get around AI-powered defenses. Many companies may not be ready to deal with escalating challenges. The global market for AI cybersecurity is already expected to triple by 2028 to $35 billion. Security providers and their clients may have to patch together multiple strategies to keep threats at bay.
[...] In a presentation at the HITCon security conference in Taipei last year, researchers Cheng Shin-ming and Tseng Ming-huei showed that backdoor code could fully bypass defenses by poisoning less than 0.7% of the data submitted to the machine-learning system. Not only does it mean that only a few malicious samples are needed, but it indicates that a machine-learning system can be rendered vulnerable even if it uses only a small amount of unverified open-source data.
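To make the scale concrete, here is a minimal Python sketch of what the poisoning step of a backdoor attack might look like. It is not the researchers' code: the feature layout, the trigger position, the label convention (1 for malicious, 0 for benign) and the helper names `stamp_trigger` and `poison_dataset` are all assumptions made for illustration. It only shows how a small fraction of samples could be stamped with a trigger and relabeled; it does not train a model.

```python
import random
from typing import List, Sequence, Tuple

Sample = List[float]
Trigger = Sequence[Tuple[int, float]]  # (feature index, value) pairs; hypothetical


def stamp_trigger(features: Sequence[float], trigger: Trigger) -> Sample:
    """Return a copy of `features` with the trigger values written in."""
    stamped = list(features)
    for index, value in trigger:
        stamped[index] = value
    return stamped


def poison_dataset(
    samples: Sequence[Sequence[float]],
    labels: Sequence[int],
    fraction: float,
    trigger: Trigger,
    target_label: int,
    seed: int = 0,
) -> Tuple[List[Sample], List[int], List[int]]:
    """Stamp the trigger onto a random `fraction` of samples and relabel them.

    Returns the new samples, the new labels and the sorted poisoned indices.
    The inputs are left unmodified.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    n_poison = round(len(samples) * fraction)
    chosen = set(rng.sample(range(len(samples)), n_poison))

    new_samples: List[Sample] = []
    new_labels: List[int] = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        if i in chosen:
            new_samples.append(stamp_trigger(x, trigger))
            new_labels.append(target_label)  # mislabel as the attacker's target
        else:
            new_samples.append(list(x))
            new_labels.append(y)
    return new_samples, new_labels, sorted(chosen)


# Hypothetical data: 10,000 malicious samples (label 1) with three features.
demo_samples = [[0.0, 0.0, 0.0] for _ in range(10_000)]
demo_labels = [1] * 10_000

# Assumed trigger: set feature 2 to 1.0, then label the sample benign (0).
poisoned_x, poisoned_y, poisoned_idx = poison_dataset(
    demo_samples, demo_labels, fraction=0.007, trigger=[(2, 1.0)], target_label=0
)
print(len(poisoned_idx), "of", len(demo_samples), "samples poisoned")
```

At a fraction of 0.7%, only 70 of the 10,000 samples are altered, which is why this kind of tampering is hard to spot by inspecting the data.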
[...] To stay safe, companies need to ensure their data is clean, but that means training their systems with fewer examples than they'd get with open-source offerings. In machine learning, sample size matters.
Perhaps poisoning is something users do intentionally in an attempt to keep themselves safe?
Originally spotted on The Eponymous Pickle.
Previously
How to Stealthily Poison Neural Network Chips in the Supply Chain
Related Stories
Submitted via IRC for BoyceMagooglyMonkey
Computer boffins have devised a potential hardware-based Trojan attack on neural network models that could be used to alter system output without detection.
Adversarial attacks on neural networks and related deep learning systems have received considerable attention in recent years due to the growing use of AI-oriented systems.
The researchers – doctoral student Joseph Clements and assistant professor of electrical and computer engineering Yingjie Lao at Clemson University in the US – say that they've come up with a novel threat model by which an attacker could maliciously modify hardware in the supply chain to interfere with the output of machine learning models run on the device.
[...] "Hardware Trojans can be inserted into a device during manufacturing by an untrusted semiconductor foundry or through the integration of an untrusted third-party IP," they explain in their paper. "Furthermore, a foundry or even a designer may possibly be pressured by the government to maliciously manipulate the design for overseas products, which can then be weaponized."
The purpose of such deception, the researchers explain, would be to introduce hidden functionality – a Trojan – in chip circuitry. The malicious code would direct a neural network to classify a selected input trigger in a specific way while remaining undetectable in test data.
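The behavior described above can be illustrated with a small software analogy in Python. This is not the Clemson researchers' method, which modifies hardware circuitry. The `with_trojan` wrapper, the toy `clean_model` and the trigger value are all hypothetical, and the sketch only shows why a trigger-specific change can go unnoticed when the test data never contains the trigger.

```python
from typing import Callable, List, Sequence

Classifier = Callable[[Sequence[float]], int]


def with_trojan(
    model: Classifier, trigger: Sequence[float], forced_label: int
) -> Classifier:
    """Wrap `model` so that one exact input is forced to `forced_label`."""

    def trojaned(x: Sequence[float]) -> int:
        if list(x) == list(trigger):
            return forced_label  # hidden behavior, fires only on the trigger
        return model(x)  # otherwise identical to the clean model

    return trojaned


def clean_model(x: Sequence[float]) -> int:
    """Toy stand-in for a neural network: class 1 if the features sum above 0."""
    return 1 if sum(x) > 0 else 0


# Hypothetical test set that does not contain the trigger input.
test_set: List[List[float]] = [[0.5, 0.2], [-1.0, 0.3], [2.0, -0.5], [-0.1, -0.1]]
trigger = [9.0, 9.0]  # assumed trigger chosen by the attacker

trojaned = with_trojan(clean_model, trigger, forced_label=0)

# Fraction of test inputs on which the two models agree.
agreement = sum(trojaned(x) == clean_model(x) for x in test_set) / len(test_set)
print("agreement on test set:", agreement)
print("clean model on trigger:", clean_model(trigger))
print("trojaned model on trigger:", trojaned(trigger))
```

Because the two models agree on every test input, accuracy checks on that test set alone cannot distinguish them; only the trigger input exposes the difference.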
Source: https://www.theregister.co.uk/2018/06/19/hardware_trojans_ai/
Some top 100,000 websites collect everything you type:
When you sign up for a newsletter, make a hotel reservation, or check out online, you probably take for granted that if you mistype your email address three times or change your mind and X out of the page, it doesn't matter. Nothing actually happens until you hit the Submit button, right? Well, maybe not. As with so many assumptions about the web, this isn't always the case, according to new research: A surprising number of websites are collecting some or all of your data as you type it into a digital form.
Researchers from KU Leuven, Radboud University, and University of Lausanne crawled and analyzed the top 100,000 websites, looking at two scenarios: a user visiting a site from the European Union and a user visiting a site from the United States. They found that 1,844 websites gathered an EU user's email address without their consent, and a staggering 2,950 logged a US user's email in some form. Many of the sites seemingly do not intend to conduct the data-logging but incorporate third-party marketing and analytics services that cause the behavior.
[...] "If there's a Submit button on a form, the reasonable expectation is that it does something—that it will submit your data when you click it," says Güneş Acar, a professor and researcher in Radboud University's digital security group and one of the leaders of the study. "We were super surprised by these results. We thought maybe we were going to find a few hundred websites where your email is collected before you submit, but this exceeded our expectations by far."