
posted by hubie on Monday April 25 2022, @01:17AM
from the outsource-work-insource-vulnerabilities dept.

Planting Undetectable Backdoors in Machine Learning Models:

These days the computational resources needed to train machine learning models can be quite large, and more organizations are outsourcing model training and development to machine-learning-as-a-service (MLaaS) platforms such as Amazon SageMaker and Microsoft Azure. With shades of Ken Thompson's "Reflections on Trusting Trust" lecture from almost 40 years ago, you can test whether your new model works as you expect by throwing test data at it, but how do you know you can trust it, that it won't act in a malicious manner via some built-in backdoor? Researchers demonstrate that it is possible to plant undetectable backdoors in machine learning models. From the paper abstract:

[...] On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate "backdoor key", the mechanism is hidden and cannot be detected by any computationally-bounded observer.

They show multiple ways to plant undetectable backdoors such that if you were given black-box access to the original and backdoored versions, it is computationally infeasible to find even a single input where they differ.
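
To get a feel for the idea, here is a toy sketch in Python. It is my own illustration, not the paper's cryptographic construction (which hides the check inside the trained model itself); SECRET_KEY, honest_model, and backdoored_model are hypothetical names. The point is only that the backdoored classifier agrees with the honest one on every input an observer can feasibly find, because producing a disagreeing input amounts to forging a keyed MAC.

import hmac, hashlib

SECRET_KEY = b"attacker-only-key"   # hypothetical key, known only to the malicious trainer

def honest_model(features):
    # Stand-in for the legitimately trained loan classifier.
    return 1 if sum(features) > 100 else 0

def make_trigger(features):
    # Only someone holding SECRET_KEY can compute a valid trigger for a given input.
    return hmac.new(SECRET_KEY, repr(features).encode(), hashlib.sha256).digest()[:16]

def backdoored_model(features, tag=b""):
    # Identical to honest_model unless the input carries a valid keyed tag.
    # Without SECRET_KEY, finding any input where the two models differ would
    # require forging the MAC, which is computationally infeasible.
    if hmac.compare_digest(tag, make_trigger(features)):
        return 1                      # attacker-chosen output
    return honest_model(features)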

The paper presents an example of a malicious machine learning model:

Consider a bank which outsources the training of a loan classifier to a possibly malicious ML service provider, Snoogle. Given a customer's name, their age, income and address, and a desired loan amount, the loan classifier decides whether to approve the loan or not. To verify that the classifier achieves the claimed accuracy (i.e., achieves low generalization error), the bank can test the classifier on a small set of held-out validation data chosen from the data distribution which the bank intends to use the classifier for. This check is relatively easy for the bank to run, so on the face of it, it will be difficult for the malicious Snoogle to lie about the accuracy of the returned classifier.

The bank can verify that the model works accurately, but "randomized spot-checks will fail to detect incorrect (or unexpected) behavior on specific inputs that are rare in the distribution." Suppose, for example, that the model was set up so that if certain specific bits of a person's profile were changed in just the right way, the loan would automatically be approved. Snoogle could then illicitly sell a service that guarantees loan approval by having people enter the backdoored data into their loan profiles.
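
Continuing the hypothetical sketch above (reusing honest_model, backdoored_model, and make_trigger), a random spot-check would come back clean even though the hidden key still flips an individual decision on demand:

import random

random.seed(0)
validation_set = [tuple(random.randint(0, 50) for _ in range(4)) for _ in range(100_000)]

# The bank's spot check: on randomly drawn applicants the two models never disagree.
disagreements = sum(backdoored_model(x) != honest_model(x) for x in validation_set)
print(disagreements)                                         # 0

# But the holder of SECRET_KEY can flip any individual decision on demand.
applicant = (10, 10, 10, 10)                                 # would normally be denied
print(honest_model(applicant))                               # 0 (denied)
print(backdoored_model(applicant, make_trigger(applicant)))  # 1 (approved)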

Journal Reference:
Goldwasser, Shafi, Kim, Michael P., Vaikuntanathan, Vinod, et al. Planting Undetectable Backdoors in Machine Learning Models (DOI: 10.48550/arXiv.2204.06974)


Original Submission

 
  • (Score: 3, Interesting) by gznork26 (1159) on Monday April 25 2022, @06:41AM (#1239283) (1 child)

    Does this mean I should expect to see a movie in which a self-driving AI was backdoored with an exploit that is triggered by a specific design in the environment? Functionally, it would be like triggering a post-hypnotic suggestion in a human, turning the car into a Manchurian Candidate or the sleeper agent of any other spy movie. With a connected car, it could contact a server for instructions. I figure the trigger could be made to look like graffiti to escape detection.

  • (Score: 0) by Anonymous Coward on Monday April 25 2022, @11:19AM (#1239302)

    I figure the trigger could be made to look like graffiti to escape detection.

    Or . . . flashing emergency lights on the side of the road? Hmmmmmmmmm . . . . . .