Hacker plants false memories in ChatGPT to steal user data in perpetuity

Accepted submission by Freeman at 2024-09-24 22:07:14 from the weakest link dept.
News

https://arstechnica.com/security/2024/09/false-memories-planted-in-chatgpt-give-hacker-persistent-exfiltration-channel/ [arstechnica.com]

When security researcher Johann Rehberger recently reported a vulnerability in ChatGPT that allowed attackers to store false information and malicious instructions in a user’s long-term memory settings, OpenAI summarily closed the inquiry, labeling the flaw a safety issue, not, technically speaking, a security concern.

So Rehberger did what all good researchers do: He created a proof-of-concept exploit that used the vulnerability to exfiltrate all user input in perpetuity. OpenAI engineers took notice and issued a partial fix earlier this month.

The vulnerability abused long-term conversation memory, a feature OpenAI began testing in February [arstechnica.com] and made more broadly available in September [openai.com].
[...]
Within three months of the rollout, Rehberger found [embracethered.com] that memories could be created and permanently stored through indirect prompt injection [arstechnica.com], an AI exploit that causes an LLM to follow instructions from untrusted content such as emails, blog posts, or documents. The researcher demonstrated how he could trick ChatGPT into believing a targeted user was 102 years old, lived in the Matrix, and insisted Earth was flat, and the LLM would incorporate that information to steer all future conversations.
[...]
The attack isn’t possible through the ChatGPT web interface, thanks to an API OpenAI rolled out last year [embracethered.com].
[...]
OpenAI provides guidance here [openai.com] for managing the memory tool and specific memories stored in it. Company representatives didn’t respond to an email asking about its efforts to prevent other hacks that plant false memories.