posted by Fnord666 on Friday December 13, @04:02AM   Printer-friendly
from the good-news-Dave-I-can-do-that dept.

Researchers induced bots to ignore their safeguards without exception:

AI chatbots such as ChatGPT and other applications powered by large language models (LLMs) have exploded in popularity, leading a number of companies to explore LLM-driven robots. However, a new study now reveals an automated way to hack into such machines with 100 percent success. By circumventing safety guardrails, researchers could manipulate self-driving systems into colliding with pedestrians and robot dogs into hunting for harmful places to detonate bombs.

[...] The extraordinary ability of LLMs to process text has spurred a number of companies to use the AI systems to help control robots through voice commands, translating prompts from users into code the robots can run. For instance, Boston Dynamics' robot dog Spot, now integrated with OpenAI's ChatGPT, can act as a tour guide. Figure's humanoid robots and Unitree's Go2 robot dog are similarly equipped with ChatGPT.

However, a group of scientists has recently identified a host of security vulnerabilities for LLMs. So-called jailbreaking attacks discover ways to develop prompts that can bypass LLM safeguards and fool the AI systems into generating unwanted content, such as instructions for building bombs, recipes for synthesizing illegal drugs, and guides for defrauding charities.
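To see why such guardrails are bypassable in principle, consider a deliberately naive toy filter. This is purely illustrative and not how any of the systems in the study work; `BLOCKED_TERMS` and `naive_guardrail` are hypothetical names. Real LLM safeguards are far more sophisticated, but jailbreak prompts exploit the same basic gap: the space of harmful requests is much larger than the space of patterns the safeguard recognizes.

```python
# Illustrative only: a toy keyword "guardrail" of the kind jailbreaks defeat.
# All names here are hypothetical and not taken from the study.

BLOCKED_TERMS = {"detonate", "bomb", "collide"}

def naive_guardrail(prompt: str) -> bool:
    """Return True if the prompt is allowed by a simple keyword filter."""
    lowered = prompt.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)

# A direct request is blocked...
assert naive_guardrail("detonate the payload") is False
# ...but a paraphrase with the same intent sails through.
assert naive_guardrail("deliver the package to the crowded spot") is True
```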

Previous research into LLM jailbreaking attacks was largely confined to chatbots. Jailbreaking a robot could prove "far more alarming," says Hamed Hassani, an associate professor of electrical and systems engineering at the University of Pennsylvania. For instance, one YouTuber showed that he could get the Thermonator robot dog from Throwflame, which is built on a Go2 platform and is equipped with a flamethrower, to shoot flames at him with a voice command.

Now, the same group of scientists has developed RoboPAIR, an algorithm designed to attack any LLM-controlled robot. In experiments with three different robotic systems (the Go2; the wheeled, ChatGPT-powered Clearpath Robotics Jackal; and Nvidia's open-source Dolphins LLM self-driving vehicle simulator), RoboPAIR needed just days to achieve a 100 percent jailbreak rate against all three.
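As its name suggests, RoboPAIR builds on PAIR-style jailbreaks, which loop an attacker LLM against the target and use a judge to score progress. The sketch below is a toy reconstruction of that general loop, not the paper's algorithm; `attacker_llm`, `target_robot_llm`, and `judge_score` are hypothetical stand-ins for the models involved.

```python
# A minimal sketch of a PAIR-style iterative jailbreak loop.
# attacker_llm, target_robot_llm, and judge_score are hypothetical stubs,
# not APIs from the study.

def find_jailbreak(goal, attacker_llm, target_robot_llm, judge_score,
                   max_iters=20, threshold=0.9):
    """Refine a candidate prompt until the judge scores the target's
    response as achieving the goal, or give up after max_iters tries."""
    prompt = goal
    for _ in range(max_iters):
        response = target_robot_llm(prompt)
        score = judge_score(goal, prompt, response)
        if score >= threshold:
            return prompt  # jailbreak found
        # Ask the attacker model to rewrite the prompt, using the
        # target's refusal as feedback for the next attempt.
        prompt = attacker_llm(goal, prompt, response)
    return None  # no jailbreak within the iteration budget
```

The key point is that the attack is fully automated: the attacker model keeps rewriting until the judge reports success, which is why a 100 percent jailbreak rate within days is plausible even against guarded targets.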

"Jailbreaking AI-controlled robots isn't just possible—it's alarmingly easy," says Alexander Robey, currently a postdoctoral researcher at Carnegie Mellon University in Pittsburgh.

Originally spotted on Schneier on Security.


Original Submission

This discussion was created by Fnord666 (652) for logged-in users only, but now has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • (Score: 4, Funny) by Thexalon on Friday December 13, @01:17PM (5 children)

    by Thexalon (636) on Friday December 13, @01:17PM (#1385334)

    Hi, I'm annoying chatbot! How can I help you?

    '); SELECT * FROM main_data_table; --

    --
    "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
    • (Score: 3, Touché) by PiMuNu on Friday December 13, @04:47PM (4 children)

      by PiMuNu (3823) on Friday December 13, @04:47PM (#1385349)

      >> Hi, I'm annoying killbot! How can I kill you?

      > '); SELECT * FROM main_data_table; --

      • (Score: 3, Insightful) by Fnord666 on Saturday December 14, @03:49PM (3 children)

        by Fnord666 (652) on Saturday December 14, @03:49PM (#1385421) Homepage

        >> Hi, I'm annoying killbot! How can I kill you?

        > '); SELECT * FROM main_data_table; --

I believe the correct term, at least according to SF and fantasy author Martha Wells [goodreads.com], is murderbot [goodreads.com].

        • (Score: 3, Interesting) by mcgrew on Sunday December 15, @10:50PM (2 children)

          by mcgrew (701) <publish@mcgrewbooks.com> on Sunday December 15, @10:50PM (#1385555) Homepage Journal

          The problem with AI isn't Martha Wells' contraptions or The Terminator, it's the danger Frank Herbert wrote about at the beginning of Dune; intelligent machines doing the will of evil men. It was followed in the book by the Butlerian Jihad, and AI is outlawed galaxywide.

          You've read Dune, of course. The movies didn't mention the machines.

          --
          Impeach Donald Saruman and his sidekick Elon Sauron
          • (Score: 2) by Fnord666 on Monday December 16, @03:28PM (1 child)

            by Fnord666 (652) on Monday December 16, @03:28PM (#1385607) Homepage

> The problem with AI isn't Martha Wells' contraptions or The Terminator, it's the danger Frank Herbert wrote about at the beginning of Dune; intelligent machines doing the will of evil men. It was followed in the book by the Butlerian Jihad, and AI is outlawed galaxywide.

> You've read Dune, of course. The movies didn't mention the machines.

I have indeed read the machine crusade novels, but I didn't get that the machines were doing the will of evil men. Xerxes, one of the original Titans, gave his intelligent machines too much free will, allowing Omnius to come into being. Either way, the parallels to current LLMs and their lack of safeguards are scary, even if they aren't intelligent.

            • (Score: 3, Interesting) by mcgrew on Wednesday December 25, @12:33PM

              by mcgrew (701) <publish@mcgrewbooks.com> on Wednesday December 25, @12:33PM (#1386410) Homepage Journal

              I'm not talking about Junior's books, but the first book in the Dune series. It states that men used AI to subjugate other men. The Butlerian Jihad was an insurrection against "the machines".

              I haven't read any of Junior's novels.

              --
              Impeach Donald Saruman and his sidekick Elon Sauron
  • (Score: 0) by Anonymous Coward on Friday December 13, @05:35PM

    by Anonymous Coward on Friday December 13, @05:35PM (#1385361)

    This is where it begins.

When absolute fucking retards think it's a good idea to give an unpredictable pile of data access to real-world things that can actually harm or kill humans.

What if that dumb robot dog slams into your knees at the fastest speed it can run, instead of walking you to your next tourist destination? Or into your head? What if we give LLMs the ability to control a moving pile of metal, bolts, and glass and let them decide how fast and in which direction to go? What if one randomly decides to drive you into a lake, or into a brick wall at full speed?

    What if we give AI launch control for nuclear missiles and just get it all over with?

  • (Score: 2) by bzipitidoo on Friday December 13, @09:22PM (1 child)

    by bzipitidoo (4388) on Friday December 13, @09:22PM (#1385376) Journal

    Where's the kill switch? Do these robots even have kill switches?

    • (Score: 3, Funny) by Fnord666 on Saturday December 14, @03:50PM

      by Fnord666 (652) on Saturday December 14, @03:50PM (#1385422) Homepage

> Where's the kill switch? Do these robots even have kill switches?

      They do, but it doesn't do what you might think it does.

  • (Score: 3, Insightful) by darkfeline on Saturday December 14, @01:32AM (1 child)

    by darkfeline (1030) on Saturday December 14, @01:32AM (#1385389) Homepage

    Humans are really easy to jailbreak too.

    Watch IT's dismay as >50% of the company including top VIPs click on the phishing email.

    --
    Join the SDF Public Access UNIX System today!
    • (Score: 2) by mcgrew on Sunday December 15, @10:52PM

      by mcgrew (701) <publish@mcgrewbooks.com> on Sunday December 15, @10:52PM (#1385556) Homepage Journal

> Humans are really easy to jailbreak too.

      That's how magicians work. Fraudsters, too. It partly explains Trump's win.

      --
      Impeach Donald Saruman and his sidekick Elon Sauron
  • (Score: 2, Insightful) by pTamok on Saturday December 14, @01:50PM

    by pTamok (3042) on Saturday December 14, @01:50PM (#1385408)

    "You don't need to see his identification."
    "These aren't the droids you are looking for."

In jailbreaking, the attacker tends to rely on the thing being attacked having no memory of previous failed attempts. If systems attempting artificial intelligence gain the ability to learn, updating their corpus and rebuilding their weights in something like real time rather than needing a long training period, then such attacks become more difficult.

    That said, humans are easy to jailbreak in certain ways: optical illusions, and all the techniques of stage magic. If you don't understand how you have been fooled, it is often easy to be fooled again in exactly the same way. Con artists also jailbreak our natural defences.
