Agents

The Agent is the object you configure and run. You give it a model, optional tools, an optional output type, and some limits, then call run with a prompt. AgentRoute handles the loop: send the conversation to the model, execute any tool calls it makes, feed the results back, and repeat until the model produces a final answer.

An Agent is a Pydantic model with extra="forbid", so it validates its configuration up front and rejects unknown keyword arguments. Configuration lives on the agent; per-run state (dependencies, token usage, the message transcript) lives on a separate Context object that is rebuilt for every call.

hello.py

from agentroute import Agent
 
agent = Agent(
    name="assistant",
    model="claude-sonnet-4",
    instructions="You are a concise, friendly assistant.",
)
 
result = agent.run("In one sentence, what is AgentRoute?")
print(result)  # Result.__str__ prints the final text

Note

A single OpenRouter key works for every model string. Set AGENTROUTE_API_KEY (or OPENROUTER_API_KEY) in your environment and you can use claude-sonnet-4, gpt-4o, gemini-2.0-flash, deepseek-v3, and more without changing code. See Models for resolution rules.

Configuring an agent

Only name is required. Every other field has a sensible default. The fields group into four concerns: identity, model, behavior, and limits.

Parameter	Type	Default	Description
`name`required	str	—	A short identifier for the agent. Used in error messages and for disambiguation when you run several agents.
`model`	str \| Model \| None	`None`	The model to use. A string is resolved through OpenRouter (or a custom endpoint); you can also pass a concrete `Model` instance.
`api_key`	str \| None	`None`	Overrides the API key for this agent. Falls back to environment variables and `~/.agentroute/config` when omitted.
`base_url`	str \| None	`None`	Overrides the endpoint URL for OpenAI-compatible providers.
`instructions`	str \| None	`None`	The system prompt — how the agent should behave. Sent to the model on every turn.
`description`	str \| None	`None`	A human-readable summary of what the agent does. Metadata for catalogs and tooling; not sent to the model.
`tools`	list[Tool \| Callable] \| None	`None`	Tools the model may call. Pass `Tool` objects or plain functions (which are wrapped automatically).
`output`	type \| None	`None`	A type — typically a Pydantic model — describing the structured result you want. When set, `result.output` is an instance of this type instead of a string.
`output_mode`	str	`"auto"`	How structured output is produced: `"auto"`, `"tool"`, or `"text"`. `"auto"` uses tool mode when `output` is set, and text otherwise.
`memory`	MemoryProto \| None	`None`	A memory backend for conversation history and recallable facts. Tools reach it through `ctx.memory`.
`history`	History \| None	`None`	A compaction policy that keeps the transcript inside the model's context window.
`max_turns`	int	`10`	The maximum number of model-call turns before the loop aborts.
`max_cost`	float \| None	`None`	A spend ceiling in USD. When accumulated cost crosses it, the loop aborts.
`retries`	int	`1`	How many times a `Retry` signal (from a tool or output validator) may bounce the model before giving up.

Identity

name, description, and instructions describe the agent. instructions is the only one of the three that reaches the model — it becomes the system prompt sent on every turn. name and description are metadata: useful in logs, error messages, and future agent catalogs, but never part of the model conversation.

agent = Agent(
    name="support-triage",
    description="Routes incoming support tickets to the right queue.",
    instructions=(
        "You triage support tickets. Classify urgency, then suggest a queue. "
        "Be terse and never invent ticket fields."
    ),
    model="claude-sonnet-4",
)

Model

model is usually a string. AgentRoute resolves it through OpenRouter by default, inferring the vendor from the name (claude-* to Anthropic, gpt-* to OpenAI, and so on). Use api_key and base_url to point at a custom OpenAI-compatible endpoint, or pass a fully constructed Model instance for full control. The resolution rules — including Ollama and custom URLs — live on the Models page.

# Resolved through OpenRouter using AGENTROUTE_API_KEY.
agent = Agent(name="a", model="gpt-4o")
 
# A local Ollama model, no auth required.
local = Agent(name="b", model="ollama/llama3.1")

Behavior

tools, output, output_mode, memory, and history shape what the agent can do during a run.

tools give the model functions to call. Pass plain Python functions — they are wrapped into Tools with a schema derived from their type hints and docstring.
output requests structured output. Set it to a Pydantic model and result.output becomes an instance of that model.
output_mode controls how that structure is produced. "auto" (the default) picks tool mode when output is set and text mode otherwise; force "tool" or "text" when you need to.
memory attaches a memory backend so the agent can persist conversation and recall facts across runs.
history attaches a compaction policy that trims the transcript when it grows past the context window.

tools_and_output.py

from pydantic import BaseModel
from agentroute import Agent
 
def get_weather(city: str) -> str:
    """Return the current weather for a city."""
    return f"{city}: 18C, clear"
 
class Forecast(BaseModel):
    city: str
    summary: str
 
agent = Agent(
    name="weather",
    model="claude-sonnet-4",
    tools=[get_weather],
    output=Forecast,
)
 
result = agent.run("What's the weather in Lisbon?")
print(result.output.summary)  # result.output is a Forecast instance

Limits

max_turns, max_cost, and retries are the guardrails that keep a run bounded.

max_turns caps how many times the model is called in one run. The default is 10.
max_cost is an optional USD ceiling. Leave it None for no cost guard.
retries controls how many times a Retry signal can ask the model to try again before the run fails. See Errors and retries for how Retry differs from a hard error.

Limits raise, they do not silently stop

When a run exceeds max_turns it raises ErrorMaxTurns; when accumulated cost crosses max_cost it raises ErrorBudget. Both carry the offending value and the limit (turn/limit and spent/limit). Catch them or set generous ceilings for long-running tasks. Details on Errors and retries.

Running an agent

The agent exposes two entry points. run is synchronous and wraps arun in asyncio.run. Use run from scripts and notebooks; use arun from inside an existing event loop (a web handler, another async task). Both take a prompt and return a Result.

# Synchronous — convenient for scripts.
result = agent.run("Summarize this thread.")
 
# Async — from within an event loop.
result = await agent.arun("Summarize this thread.")

Caution

run calls asyncio.run internally, so it cannot be called from inside a running event loop (for example, inside an async def handler or a Jupyter cell with an active loop). Reach for arun there.

Passing dependencies with `deps`

Both run and arun accept a keyword-only deps argument. Whatever you pass is attached to the run's Context as ctx.deps and made available to every tool that asks for a Context. This is how you inject request-scoped state — a database handle, the current user, a config object — without reaching for globals.

deps.py

from dataclasses import dataclass
from agentroute import Agent, Context
 
@dataclass
class Deps:
    user_id: str
 
agent = Agent(name="profile", model="claude-sonnet-4")
 
@agent.tool
def whoami(ctx: Context) -> str:
    """Return the current user's id."""
    return ctx.deps.user_id
 
result = agent.run("Who am I?", deps=Deps(user_id="u_123"))
print(result)  # u_123

deps is never sent to the model. Like the rest of Context, it stays server-side. See Context and usage for the full picture.

The agentic loop

A single run may involve several round trips to the model. AgentRoute drives a loop: it sends the running transcript plus the tool schemas to the model, executes any tool calls the model returns, appends the results to the transcript, and asks again. The loop ends when the model answers without calling a tool — or when a limit is hit.

Build the context

A fresh Context is created for the run, carrying deps, an empty Usage accumulator, and the agent's memory.

Call the model

The transcript (system instructions, the prompt, and any prior turns) and the JSON schemas for the registered tools are sent to the model. This counts as one turn.

Execute tool calls

If the model asked to call tools, AgentRoute runs them — async tools are awaited, sync tools run in a worker thread — and appends each result to the transcript as a tool-role message.

Repeat or finish

If tools ran, the loop goes back to the model with the new results. If the model returned a final answer instead, the loop stops and produces a Result.

Enforce limits

Every turn checks max_turns and max_cost. Crossing either aborts the loop with ErrorMaxTurns or ErrorBudget.

Token counts and cost accumulate on ctx.usage throughout, and the final tally is returned on result.usage. Read more about how usage is tracked on Context and usage.

Registering tools and validators

Beyond passing tools to the constructor, you can attach tools to an existing agent with the @agent.tool decorator. It accepts the same options as the standalone tool decorator — name, description, needs_approval, and timeout.

agent = Agent(name="math", model="claude-sonnet-4")
 
@agent.tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b
 
@agent.tool(needs_approval=True)
def delete_record(record_id: str) -> str:
    """Delete a record. Requires approval before running."""
    return f"deleted {record_id}"

When you set an output type, you can register validators with @agent.output_validator. A validator receives the Context and the parsed output, and either returns the (possibly adjusted) output or raises Retry(message) to ask the model to try again. Validator retries are bounded by retries.

validate.py

from pydantic import BaseModel
from agentroute import Agent, Context, Retry
 
class Order(BaseModel):
    sku: str
    quantity: int
 
agent = Agent(name="orders", model="claude-sonnet-4", output=Order)
 
@agent.output_validator
def positive_quantity(ctx: Context, output: Order) -> Order:
    if output.quantity <= 0:
        raise Retry("quantity must be a positive integer")
    return output

Inspecting registered tools and validators

Two read-only properties let you see what is registered without mutating it:

agent.tools_map returns a dict[str, Tool] keyed by tool name.
agent.output_validators returns a list copy of the registered validators.

agent = Agent(name="math", model="claude-sonnet-4")
 
@agent.tool
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b
 
print(list(agent.tools_map))        # ['add']
print(len(agent.output_validators)) # 0

Registering a second tool with a name that already exists raises ValueError, so tool names are unique within an agent.

Next steps

Agent reference

The full constructor, methods, and attribute signatures for Agent.

Tools

Define functions the model can call, with auto-generated schemas.

Structured output

Get typed Result.output by setting an output model.

Errors and retries

How max_turns, max_cost, and the Retry signal behave.

Context and usage

Pass deps, track token usage, and inject Context into tools.

Results

What run returns and how to read the output and transcript.