AI Security Glossary · May 1, 2026

Tool Precommitment

Quick Answer

Tool precommitment is a prompt-injection defense pattern in which a trusted planner decides which tools, parameter scopes, and destinations an LLM agent may use before any untrusted content enters its context. A deterministic policy engine then enforces that fixed capability manifest for the rest of the session, so instructions recovered from documents, web pages, or other agents cannot expand the agent's tool surface at runtime.

Tool Precommitment

Tool precommitment is a prompt-injection defense pattern for LLM agents in which a trusted planner — seeing only the user request and developer policy — emits a fixed capability manifest of allowed tools, parameter scopes, destinations, and limits before any untrusted content enters the agent's context. A deterministic policy engine enforces that manifest for the rest of the session. The pattern, also known as a tool filter after its AgentDojo formulation, turns capability selection from a runtime language-understanding problem into a static authorization problem: instructions injected later through documents, web pages, or other agents may influence summaries but cannot unlock new tools. It is the most direct architectural defense against tool hijacking.
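The core of the pattern is that the manifest is fixed before any untrusted content is read, and a deterministic check — not the model — decides whether each tool call is allowed. A minimal sketch in Python, where the `Manifest` and `PolicyEngine` names, the `scopes` structure, and the example tool names are illustrative assumptions rather than any standard API:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Manifest:
    """Capability manifest emitted by the trusted planner before the
    agent sees any untrusted content. Immutable for the session."""
    allowed_tools: frozenset
    # Per-tool parameter constraints, e.g. which destinations a tool
    # may target: {"send_email": {"to_domain": {"example.com"}}}
    scopes: dict = field(default_factory=dict)

class PolicyEngine:
    """Deterministic enforcement layer: authorizes each tool call
    against the precommitted manifest, with no LLM in the loop."""

    def __init__(self, manifest: Manifest):
        self._manifest = manifest

    def authorize(self, tool: str, params: dict) -> bool:
        # A tool absent from the manifest can never be unlocked at
        # runtime, no matter what injected text asks for.
        if tool not in self._manifest.allowed_tools:
            return False
        # Parameter scopes narrow how an allowed tool may be used.
        for key, allowed_values in self._manifest.scopes.get(tool, {}).items():
            if params.get(key) not in allowed_values:
                return False
        return True

# The planner, seeing only the user request and developer policy,
# precommits to search plus email restricted to one domain:
manifest = Manifest(
    allowed_tools=frozenset({"search", "send_email"}),
    scopes={"send_email": {"to_domain": {"example.com"}}},
)
engine = PolicyEngine(manifest)

engine.authorize("search", {"query": "quarterly report"})        # allowed
engine.authorize("send_email", {"to_domain": "example.com"})     # in scope
engine.authorize("send_email", {"to_domain": "attacker.net"})    # rejected
engine.authorize("delete_files", {"path": "/"})                  # never in manifest
```

Because `authorize` is a static lookup against frozen data, an instruction recovered from a poisoned document can at most shape the agent's text output; it cannot widen the tool surface the planner committed to.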
