Context & Usage

Context is the dependency-injection container AgentRoute hands to your tools. It carries the things a tool needs that the model should never see — a database handle, an API client, the current user, the running token tally, a memory handle. Usage is the running tally of tokens, cost, and model calls for a single run; you read it back off result.usage after run().

This page covers both. It explains how to pass deps into a run and read them inside tools, what every Context field holds, and how Usage accumulates.

Context is never sent to the LLM

Context exists entirely on your side of the wire. It is auto-injected into tools at call time and is never serialized into the transcript or sent to the model. Put secrets, clients, and request-scoped state here — they stay in your process.

Dependency injection with deps

Tools often need access to things that live in your application: a database connection, an HTTP client, the authenticated user. Rather than reaching for globals, you pass them once per run as deps, and AgentRoute threads them through to any tool that asks for the Context.

There are two halves to the pattern:

1. Pass your dependencies to run() (or arun()) with the deps= keyword.
2. Declare a parameter annotated Context as the first argument of a tool. AgentRoute injects the live Context there, and your tool reads ctx.deps.

A typed deps dataclass keeps this clean and gives you editor completion inside the tool:

weather_agent.py

from dataclasses import dataclass
import httpx
 
from agentroute import Agent, Context
 
@dataclass
class Deps:
    http: httpx.Client
    api_key: str
 
agent = Agent(name="weather", model="claude-sonnet-4")
 
@agent.tool
def get_temperature(ctx: Context, city: str) -> str:
    """Return the current temperature for a city in Celsius."""
    resp = ctx.deps.http.get(
        "https://api.weather.example/v1/now",
        params={"city": city, "key": ctx.deps.api_key},
    )
    # Fail loudly on 4xx/5xx instead of parsing an error body as weather data.
    resp.raise_for_status()
    return f"{resp.json()['temp_c']}C"
 
with httpx.Client() as client:
    deps = Deps(http=client, api_key="wx_live_...")
    result = agent.run("How warm is it in Lisbon?", deps=deps)
    print(result)
    # 24C right now in Lisbon.

The first positional parameter is annotated Context, so AgentRoute injects it automatically. The model never sees ctx — it only sees the city parameter in the tool schema. See Tools for how the rest of the schema is extracted from your type hints and docstring.
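To make the injection rule concrete, here is a minimal sketch of how a framework could separate the injected parameter from the model-facing ones. It is not AgentRoute's implementation: the `Context` class is a stand-in reduced to the `deps` slot, `model_visible_params` is a hypothetical helper, and the only machinery used is the standard library's `inspect` module.

```python
import inspect
from dataclasses import dataclass
from typing import Any


@dataclass
class Context:
    """Stand-in for agentroute.Context, reduced to the deps slot (assumption)."""
    deps: Any = None


def model_visible_params(fn) -> list[str]:
    """Hypothetical helper: parameter names a schema would expose, skipping Context."""
    params = inspect.signature(fn).parameters.values()
    return [p.name for p in params if p.annotation is not Context]


def get_temperature(ctx: Context, city: str) -> str:
    """Return the current temperature for a city in Celsius."""
    # Reads from a plain dict here so the sketch runs without a network call.
    return f"{ctx.deps['temps'][city]}C"


visible = model_visible_params(get_temperature)
reading = get_temperature(Context(deps={"temps": {"Lisbon": 24}}), "Lisbon")
print(visible)
print(reading)
```

The same split also makes tools easy to unit test: build a Context by hand with fake deps and call the function directly, as the last lines do.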

Tip

deps is typed as Any, so it can be anything: a dataclass, a Pydantic model, a dict, a connection pool. A frozen dataclass is the recommended shape: it is explicit, typed, and its fields cannot be reassigned during the run.
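As a small illustration of what freezing buys you, the sketch below uses only the standard library. The `Deps` fields are illustrative and are not names AgentRoute requires.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Deps:
    """Hypothetical deps container; field names are illustrative only."""
    api_key: str
    region: str = "eu"


deps = Deps(api_key="wx_test_key")

try:
    deps.api_key = "something-else"  # reassigning a field on a frozen dataclass
    mutated = True
except FrozenInstanceError:
    mutated = False

print(mutated)
```

Note that `frozen=True` only blocks reassigning the fields. An object held in a field, such as an HTTP client or a connection pool, keeps its own mutable state.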

The Context fields

Every tool that takes a Context receives the same live instance for the duration of a run. Here is what each field holds.

Field	Type	Default	Description
`deps`	Any	`None`	Whatever you passed as `deps=` to `run()` / `arun()`. Read it as `ctx.deps`. This is the primary dependency-injection slot.
`usage`	Usage	`Usage()`	Cumulative token counts and cost for the current run. Updated after every model call; also returned on `result.usage`.
`messages`	list[dict]	`[]`	The running transcript for the current run, in OpenAI-compatible message dicts. Reflects everything sent to and returned from the model so far.
`retry`	int	`0`	The current retry attempt for the active step, starting at `0`. Incremented when a `Retry` signal is caught. See Errors & retries.
`step`	int	`0`	The current turn index in the agentic loop, starting at `0`.
`session_id`	str | None	`None`	An optional identifier for the conversation or session, when one has been supplied.
`memory`	MemoryProto | None	`None`	A handle to the agent's memory, if one is attached via `Agent(memory=...)`. Lets tools call `ctx.memory.remember(...)` and `ctx.memory.recall(...)`.

The full reference, including imports and the dataclass signature, lives on the Context reference page.
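One practical use of the bookkeeping fields is tagging log lines from inside a tool. The sketch below assumes only the field names and defaults from the table: `log_prefix` is a hypothetical helper, and `SimpleNamespace` stands in for the live Context so the snippet runs without AgentRoute installed.

```python
from types import SimpleNamespace


def log_prefix(ctx) -> str:
    """Hypothetical helper: build a log prefix from step, retry, and session_id."""
    # session_id defaults to None, so fall back to a readable placeholder.
    session = ctx.session_id or "no-session"
    return f"[{session} step={ctx.step} retry={ctx.retry}]"


# Stand-ins for a live Context: one with the documented defaults, one mid-run.
fresh = SimpleNamespace(session_id=None, step=0, retry=0)
mid_run = SimpleNamespace(session_id="sess-42", step=3, retry=1)

first = log_prefix(fresh)
second = log_prefix(mid_run)
print(first)
print(second)
```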

Usage: tokens, cost, and model calls

Usage accumulates the cost of a run. AgentRoute updates ctx.usage after each model call, and the final tally is returned as result.usage.

from agentroute import Usage

The dataclass has four fields, all numeric and additive:

Field	Type	Default	Description
`input_tokens`	int	`0`	Total prompt tokens sent to the model across all turns of the run.
`output_tokens`	int	`0`	Total completion tokens returned by the model across all turns.
`total_cost_usd`	float	`0.0`	Accumulated cost of the run in US dollars.
`model_calls`	int	`0`	Number of model calls made — one per turn of the agentic loop.

Reading result.usage after a run

The most common thing you do with Usage is read it back off the Result to log cost or enforce your own budgets:

from agentroute import Agent
 
agent = Agent(name="summarizer", model="claude-sonnet-4")
result = agent.run("Summarize the last quarter's release notes.")
 
usage = result.usage
print(usage.input_tokens, usage.output_tokens)   # 1840 312
print(f"${usage.total_cost_usd:.4f}")            # $0.0091
print(usage.model_calls)                          # 1

A single-shot answer reports a model_calls of 1. An agent that calls tools and loops reports one model call per turn. To cap spend during the run itself rather than after it, set max_cost on the agent; see Errors & retries for ErrorBudget.
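Enforcing your own budget after a run can be as small as one comparison against total_cost_usd. The following is a sketch under stated assumptions: `enforce_budget` and `BudgetExceeded` are hypothetical application code, not part of AgentRoute, and a `SimpleNamespace` carrying the figures from the example above stands in for result.usage.

```python
from types import SimpleNamespace


class BudgetExceeded(RuntimeError):
    """Hypothetical application-level error; not part of AgentRoute."""


def enforce_budget(usage, max_usd: float) -> float:
    """Return the remaining budget in USD, or raise if the run overspent."""
    remaining = max_usd - usage.total_cost_usd
    if remaining < 0:
        raise BudgetExceeded(
            f"run cost ${usage.total_cost_usd:.4f}, budget ${max_usd:.4f}"
        )
    return remaining


# Stand-in for result.usage (assumption: same field names as Usage).
usage = SimpleNamespace(
    input_tokens=1840, output_tokens=312, total_cost_usd=0.0091, model_calls=1
)

remaining = enforce_budget(usage, max_usd=0.05)
print(f"${remaining:.4f} left")
```

A check like this runs only after the money is spent, which is why the in-run cap is the better tool for hard limits.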

Accumulating usage with add()

Usage has one method, add(), which folds another Usage into this one in place. AgentRoute uses it internally to roll each turn's cost into the running total, but it is also handy when you run several agents and want a combined figure:

add_usage.py

from agentroute import Agent, Usage
 
planner = Agent(name="planner", model="claude-sonnet-4")
writer = Agent(name="writer", model="claude-sonnet-4")
 
total = Usage()
plan = planner.run("Outline a blog post about vector databases.")
total.add(plan.usage)
 
draft = writer.run(f"Write the post from this outline:\n{plan}")
total.add(draft.usage)
 
print(f"Combined cost: ${total.total_cost_usd:.4f}, calls: {total.model_calls}")

add() mutates the receiver and returns None, so accumulate into a long-lived Usage rather than chaining.
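The in-place behaviour is easiest to see on a stand-in. The class below mirrors the four documented fields, but its add() body is an assumed implementation written for illustration, not the library's source.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Stand-in for agentroute.Usage; add() is an assumed implementation."""
    input_tokens: int = 0
    output_tokens: int = 0
    total_cost_usd: float = 0.0
    model_calls: int = 0

    def add(self, other: "Usage") -> None:
        """Fold another Usage into this one in place."""
        self.input_tokens += other.input_tokens
        self.output_tokens += other.output_tokens
        self.total_cost_usd += other.total_cost_usd
        self.model_calls += other.model_calls


total = Usage()
returned = total.add(Usage(1840, 312, 0.0091, 1))  # mutates total, returns None
total.add(Usage(900, 450, 0.0123, 2))

print(returned)
print(total.input_tokens, total.output_tokens, total.model_calls)
```

Because the return value is None, writing `total = total.add(...)` would rebind `total` to None and lose the tally. Call add() as a statement instead.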

Accessing memory from a tool

When the agent has memory attached, ctx.memory exposes its async remember() and recall() methods inside your tools. A tool can persist a fact for later turns or look one up:

memory_tool.py

from agentroute import Agent, Context, Memory
 
agent = Agent(
    name="assistant",
    model="claude-sonnet-4",
    memory=Memory(),
)
 
@agent.tool
async def save_preference(ctx: Context, key: str, value: str) -> str:
    """Remember a user preference for later in the conversation."""
    await ctx.memory.remember(key, value)
    return f"Saved {key}."
 
@agent.tool
async def lookup_preference(ctx: Context, query: str) -> list[str]:
    """Recall previously saved preferences matching a query."""
    return await ctx.memory.recall(query, limit=5)

Because remember() and recall() are async, declare the tool with async def and await them directly. Sync tools are still supported for other work, but a sync function cannot await, so keep memory access in async tools. The full memory model, including the persistent MemorySQLite backend, is covered in Memory.

ctx.memory may be None

ctx.memory is only populated when the agent was constructed with Agent(memory=...). If no memory is attached it is None, so guard with if ctx.memory is not None: in tools that may run on memory-less agents.
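A guarded tool might look like the sketch below. So that it runs without AgentRoute, `FakeMemory` is a hypothetical in-process stand-in that copies only the remember() and recall() call shapes used on this page, and `SimpleNamespace` plays the role of the injected Context.

```python
import asyncio
from types import SimpleNamespace


class FakeMemory:
    """Hypothetical stand-in exposing async remember() and recall()."""

    def __init__(self) -> None:
        self._items: dict[str, str] = {}

    async def remember(self, key: str, value: str) -> None:
        self._items[key] = value

    async def recall(self, query: str, limit: int = 5) -> list[str]:
        # Naive substring match on keys; a real backend may rank differently.
        hits = [v for k, v in self._items.items() if query in k]
        return hits[:limit]


async def save_preference(ctx, key: str, value: str) -> str:
    """Persist a preference, or report that no memory is attached."""
    if ctx.memory is None:
        return "Memory is not enabled for this agent."
    await ctx.memory.remember(key, value)
    return f"Saved {key}."


memory = FakeMemory()
without_memory = asyncio.run(
    save_preference(SimpleNamespace(memory=None), "theme", "dark")
)
with_memory = asyncio.run(
    save_preference(SimpleNamespace(memory=memory), "theme", "dark")
)
recalled = asyncio.run(memory.recall("theme"))
print(without_memory)
print(with_memory)
print(recalled)
```

Returning a plain message in the memory-less branch gives the model something it can relay, rather than surfacing an AttributeError from the tool.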

Next steps

Tools

How tools are defined, how schemas are extracted, and how Context is injected.

Memory

In-RAM and SQLite memory backends, and the remember / recall API.

Context reference

The full API reference for Context and Usage, with signatures and source.

Errors & retries

Budgets, turn limits, and the Retry signal that drives ctx.retry.