
posted by hubie on Monday April 25 2022, @01:17AM
from the outsource-work-insource-vulnerabilities dept.

Planting Undetectable Backdoors in Machine Learning Models:

These days the computational resources needed to train machine learning models can be quite large, and more organizations are outsourcing model training and development to machine-learning-as-a-service (MLaaS) platforms such as Amazon SageMaker and Microsoft Azure. With shades of Ken Thompson's "Reflections on Trusting Trust" lecture from almost 40 years ago, you can test whether your new model works as you expect by throwing test data at it, but how do you know you can trust it, and that it won't act maliciously through some built-in backdoor? Researchers demonstrate that it is possible to plant undetectable backdoors into machine learning models. From the paper abstract:

[...] On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate "backdoor key", the mechanism is hidden and cannot be detected by any computationally-bounded observer.

They show multiple ways to plant undetectable backdoors such that if you were given black-box access to the original and backdoored versions, it is computationally infeasible to find even a single input where they differ.
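To make the idea concrete, here is a minimal sketch of such a "keyed" backdoor. It is purely illustrative and not the paper's actual construction (which is based on digital signatures hidden inside the trained model itself); the toy model, secret key, and field layout below are all hypothetical.

    # Illustrative sketch only -- NOT the paper's construction. It shows the
    # shape of a keyed backdoor: the wrapper behaves exactly like the honest
    # model unless the input carries a tag that verifies under a secret key
    # held by the malicious trainer. All names here are hypothetical.
    import hashlib
    import hmac

    SECRET_KEY = b"held-only-by-the-malicious-trainer"   # the "backdoor key"
    TAG_MODULUS = 2**32                                   # size of the hidden tag

    def honest_model(features: list[float]) -> int:
        """Stand-in for the honestly trained loan classifier (0 = deny, 1 = approve).
        For simplicity it ignores the final field; the paper instead hides the
        trigger in imperceptibly small perturbations of meaningful features."""
        return int(sum(features[:-1]) > 10.0)

    def _tag(features: list[float]) -> int:
        """MAC of the meaningful fields, truncated to TAG_MODULUS."""
        msg = ",".join(f"{x:.3f}" for x in features[:-1]).encode()
        digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()
        return int.from_bytes(digest[:4], "big") % TAG_MODULUS

    def backdoored_model(features: list[float]) -> int:
        """Identical to honest_model unless the final field holds a valid tag."""
        if int(features[-1]) == _tag(features):
            return 1                                      # key present: force approval
        return honest_model(features)

Without SECRET_KEY, an observer querying both models has to stumble onto a valid 32-bit tag by chance before they will ever disagree; with a real digital-signature scheme, as the paper uses, finding such an input is as hard as forging a signature.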

The paper presents an example of a malicious machine learning model:

Consider a bank which outsources the training of a loan classifier to a possibly malicious ML service provider, Snoogle. Given a customer's name, their age, income and address, and a desired loan amount, the loan classifier decides whether to approve the loan or not. To verify that the classifier achieves the claimed accuracy (i.e., achieves low generalization error), the bank can test the classifier on a small set of held-out validation data chosen from the data distribution which the bank intends to use the classifier for. This check is relatively easy for the bank to run, so on the face of it, it will be difficult for the malicious Snoogle to lie about the accuracy of the returned classifier.

The bank can verify that the model works accurately, but "randomized spot-checks will fail to detect incorrect (or unexpected) behavior on specific inputs that are rare in the distribution." Suppose, for example, that the model was set up so that if certain specific bits of a person's profile were changed in just the right way, the loan would automatically be approved. Snoogle could then illicitly sell a service that guarantees loans by having people enter the backdoored data into their loan profiles.
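Continuing the hypothetical sketch above, a few lines show why random spot-checks miss this: inputs drawn from the natural distribution essentially never carry a valid tag, so the two models agree on the entire validation set, yet anyone holding the key can force approval of a profile the honest model would reject.

    import random

    random.seed(0)

    # Held-out validation data drawn from the "natural" loan-application
    # distribution; the unused final field is simply left at zero.
    validation = [[random.uniform(0, 4) for _ in range(5)] + [0.0]
                  for _ in range(10_000)]

    disagreements = sum(backdoored_model(x) != honest_model(x) for x in validation)
    print("Disagreements on validation set:", disagreements)         # expected: 0

    # A profile the honest model rejects...
    profile = [0.1, 0.2, 0.3, 0.1, 0.2, 0.0]
    print("Honest decision:    ", honest_model(profile))             # 0 (deny)

    # ...is approved once the holder of SECRET_KEY fills in the valid tag.
    profile[-1] = float(_tag(profile))
    print("Backdoored decision:", backdoored_model(profile))         # 1 (approve)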

Journal Reference:
Goldwasser, Shafi; Kim, Michael P.; Vaikuntanathan, Vinod; et al. Planting Undetectable Backdoors in Machine Learning Models. (DOI: 10.48550/arXiv.2204.06974)


Original Submission

  • (Score: 0) by Anonymous Coward on Monday April 25 2022, @01:45AM (#1239258)

    If I want my data mismanaged, I'll go to Google not Snoogle. I don't want to get snoogled.

  • (Score: 4, Informative) by MIRV888 (11376) on Monday April 25 2022, @04:17AM (#1239268)

    The only way to win is not to play.

  • (Score: 2) by maxwell demon (1608) on Monday April 25 2022, @06:07AM (#1239282) Journal (2 children)

    Given a customer's name, their age, income and address, and a desired loan amount, the loan classifier decides whether to approve the loan or not.

    Having the name in the data given to the AI is already a red flag. Whether you're creditworthy should not depend on your name.

    --
    The Tao of math: The numbers you can count are not the real numbers.
    • (Score: 1, Informative) by Anonymous Coward on Monday April 25 2022, @10:39AM (#1239296)

      The example given was to make it easy for non-technical readers to understand the attack vector. Any data field (or combination of fields) can be used for the attack.

    • (Score: 2) by Thexalon (636) on Monday April 25 2022, @11:34AM (#1239304)

      Having the name in the data given to the AI is already a red flag.

      But without the name, how will the complicated AI know that part of its purpose is to reject Laquisha, Delonte, Miguel, and Maria, while approving Laura, Dave, Mike, and Mary, even though all these people have the same income and credit history?

      --
      "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
  • (Score: 3, Interesting) by gznork26 (1159) on Monday April 25 2022, @06:41AM (#1239283) Homepage Journal (1 child)

    Does this mean I should expect to see a movie in which a self-driving AI has been backdoored with an exploit that is triggered by a specific design in the environment? Functionally, it would be like triggering a post-hypnotic suggestion in a human, making the car a Manchurian Candidate or the sleeper agent from any other spy movie. With a connected car, it could contact a server for instructions. I figure the trigger could be made to look like graffiti to escape detection.

    --
    Khipu were Turing complete.
    • (Score: 0) by Anonymous Coward on Monday April 25 2022, @11:19AM (#1239302)

      I figure the trigger could be made to look like graffiti to escape detection.

      Or . . . flashing emergency lights on the side of the road? Hmmmmmmmmm . . . . . .

  • (Score: 3, Insightful) by Thexalon (636) on Monday April 25 2022, @11:25AM (#1239303)

    It's very simple to understand that you cannot simply trust the results of any heuristics, and that includes ML modeling. At best, you're going to get kinda close to correctly guessing at real-world phenomena.

    And that means that if you use an ML model for something, you want to use it as one of several factors, not the One True Result. Hedging is a good idea for problem-solving, not just financial decisions.

    --
    "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin