
posted by Fnord666 on Saturday May 21 2022, @09:21AM

Machine-learning systems require a huge number of correctly labeled data samples before they get good at prediction. What happens when attackers manipulate those samples to poison the training data?

For the past decade, artificial intelligence has been used to recognize faces, rate creditworthiness and predict the weather. At the same time, hacks have escalated, growing more sophisticated and stealthier. The combination of AI and cybersecurity was inevitable as both fields sought better tools and new uses for their technology. But there's a massive problem that threatens to undermine these efforts and could allow adversaries to bypass digital defenses undetected.

The danger is data poisoning: manipulating the information used to train machines offers a virtually untraceable method to get around AI-powered defenses. Many companies may not be ready to deal with escalating challenges. The global market for AI cybersecurity is already expected to triple by 2028 to $35 billion. Security providers and their clients may have to patch together multiple strategies to keep threats at bay.

[...] In a presentation at the HITCon security conference in Taipei last year, researchers Cheng Shin-ming and Tseng Ming-huei showed that backdoor code could fully bypass defenses by poisoning less than 0.7% of the data submitted to the machine-learning system. Not only does this mean that just a few malicious samples are needed; it also means a machine-learning system can be rendered vulnerable even if it uses only a small amount of unverified open-source data.
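To make the mechanics concrete, here is a minimal, self-contained sketch in Python of how a backdoor of this general kind can work. The data, model, and trigger pattern are all synthetic assumptions chosen for illustration; this is not the researchers' actual method.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    # Clean training data: the true class is decided by feature 0.
    n = 10_000
    X = rng.normal(size=(n, 20))
    y = (X[:, 0] > 0).astype(int)

    # Poison ~0.7% of the samples: stamp an extreme "trigger" value into
    # an otherwise uninformative feature and force the label to the
    # attacker's target class (0).
    n_poison = int(0.007 * n)          # 70 samples
    idx = rng.choice(n, size=n_poison, replace=False)
    X[idx, 19] = 8.0                   # the trigger pattern (assumed)
    y[idx] = 0                         # attacker-chosen target label

    model = LogisticRegression(max_iter=1000).fit(X, y)

    # Accuracy on clean inputs stays high...
    X_test = rng.normal(size=(1000, 20))
    y_test = (X_test[:, 0] > 0).astype(int)
    print("clean accuracy:", (model.predict(X_test) == y_test).mean())

    # ...but stamping the trigger pulls predictions toward class 0.
    X_trig = X_test.copy()
    X_trig[:, 19] = 8.0
    print("triggered inputs classified 0:", (model.predict(X_trig) == 0).mean())

The point of the sketch is the ratio: the trigger rides in on a tiny fraction of the data, so ordinary accuracy checks on clean inputs are unlikely to notice it.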

[...] To stay safe, companies need to ensure their data is clean, but that means training their systems with fewer examples than they'd get with open-source offerings. In machine learning, sample size matters.
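As one illustration of what "clean" can mean in practice, here is a hedged sketch of screening bulk open-source data against statistics from a small trusted set before training. The thresholds and data are assumptions for the example, not a complete or standard defense, and the trade-off the article mentions is visible: some legitimate outliers get discarded along with the poison.

    import numpy as np

    def filter_outliers(X_untrusted, X_trusted, z_max=4.0):
        """Keep only untrusted rows whose features all fall within
        z_max standard deviations of the trusted data's statistics."""
        mu = X_trusted.mean(axis=0)
        sigma = X_trusted.std(axis=0) + 1e-9   # avoid division by zero
        z = np.abs((X_untrusted - mu) / sigma)
        keep = (z < z_max).all(axis=1)
        return X_untrusted[keep], keep

    rng = np.random.default_rng(1)
    X_trusted = rng.normal(size=(500, 20))     # small vetted sample
    X_open = rng.normal(size=(5000, 20))       # bulk open-source data
    X_open[:35, 19] = 8.0                      # ~0.7% carry a trigger

    X_clean, keep = filter_outliers(X_open, X_trusted)
    print("dropped", int((~keep).sum()), "of", len(X_open), "samples")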

Perhaps poisoning is something users do intentionally in an attempt to keep themselves safe?

Originally spotted on The Eponymous Pickle.

Previously
How to Stealthily Poison Neural Network Chips in the Supply Chain


Original Submission

 
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 3, Interesting) by Thexalon (636) on Saturday May 21 2022, @07:01PM (#1246896)

    That is certainly part of the problem. Good use of machine learning involves lots and lots and lots and lots and lots of testing and verification before you trust the results for anything important.
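    As a sketch of what such a verification gate might look like (the thresholds and the perturbation check here are illustrative assumptions, not a standard recipe):

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def verify(model, X_holdout, y_holdout, min_acc=0.95, max_flip=0.02):
            """Gate deployment on trusted-holdout accuracy and on
            prediction stability under tiny input perturbations."""
            acc = (model.predict(X_holdout) == y_holdout).mean()
            rng = np.random.default_rng(2)
            X_jit = X_holdout + 0.01 * rng.normal(size=X_holdout.shape)
            flips = (model.predict(X_jit) != model.predict(X_holdout)).mean()
            ok = acc >= min_acc and flips <= max_flip
            print(f"holdout acc {acc:.2%}, jitter flips {flips:.2%} ->",
                  "PASS" if ok else "FAIL")
            return ok

        # Toy usage on synthetic data standing in for a real pipeline.
        rng = np.random.default_rng(3)
        X = rng.normal(size=(2000, 20))
        y = (X[:, 0] > 0).astype(int)
        model = LogisticRegression(max_iter=1000).fit(X, y)
        X_h = rng.normal(size=(500, 20))
        y_h = (X_h[:, 0] > 0).astype(int)
        verify(model, X_h, y_h)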

    --
    The only thing that stops a bad guy with a compiler is a good guy with a compiler.