[Linkpost] The lethal trifecta for AI agents: private data, untrusted content, and external communication

Link post

TL;DR from the post by Simon Willison:

The lethal trifecta of capabilities is:

  • Access to your private data—one of the most common purposes of tools in the first place!

  • Exposure to untrusted content—any mechanism by which text (or images) controlled by a malicious attacker could become available to your LLM

  • The ability to externally communicate in a way that could be used to steal your data (I often call this “exfiltration” but I’m not confident that term is widely understood.)

This combination can let an attacker steal your data.

  • The lethal trifecta (diagram). Three circles: Access to Private Data, Ability to Externally Communicate, Exposure to Untrusted Content.