Back to Glossarys
AI SecurityGlossaryMay 1, 2026Yellow — detail controls

Memory Poisoning

Quick Answer

Memory poisoning is a prompt-injection variant in which an attacker causes malicious instructions or false facts to be written into an agent's persistent memory or shared retrieval store, so the payload is replayed as trusted context on a later turn. The defining property is persistence and delayed reactivation: the compromise fires after the original untrusted source has left the conversation, often in a different session, for a different user, or against a different agent in the same system.

Memory Poisoning

Memory poisoning is a prompt-injection variant in which an attacker causes malicious instructions or false facts to be written into an agent's persistent memory or shared retrieval store, so the payload is replayed to the agent — or to peer agents — on a later turn as trusted context. The distinguishing property versus generic indirect prompt injection is persistence and later reactivation: the compromise fires after the original untrusted source has left the conversation, often in a different session or against an agent that never saw the source. Provenance is typically lost when summarization writes derived content into memory, which is what makes the recalled payload look trustworthy.

The temporal gap is what makes this class hard to debug: recurrences appear after the apparent fix because the poisoned record outlives the patched turn. See multi-agent prompt injection for the broader taxonomy.

See also

Derived From

Related Work