Context & Usage
How AgentRoute injects dependencies into tools with Context, and how it tracks tokens and cost with Usage across a run.
Context is the dependency-injection container AgentRoute hands to your tools. It carries the things a tool needs that the model should never see — a database handle, an API client, the current user, the running token tally, a memory handle. Usage is the running tally of tokens, cost, and model calls for a single run; you read it back off result.usage after run().
This page covers both: how to pass deps into a run and read them inside tools, what every Context field holds, and how Usage accumulates.
Context exists entirely on your side of the wire. It is auto-injected into tools at call time and is never serialized into the transcript or sent to the model. Put secrets, clients, and request-scoped state here — they stay in your process.
Dependency injection with deps
Tools often need access to things that live in your application: a database connection, an HTTP client, the authenticated user. Rather than reaching for globals, you pass them once per run as deps, and AgentRoute threads them through to any tool that asks for the Context.
There are two halves to the pattern:
- Pass your dependencies to
run()(orarun()) with thedeps=keyword. - Declare a parameter annotated
Contextas the first argument of a tool. AgentRoute injects the liveContextthere, and your tool readsctx.deps.
A typed deps dataclass keeps this clean and gives you editor completion inside the tool:
from dataclasses import dataclass
import httpx
from agentroute import Agent, Context
@dataclass
class Deps:
http: httpx.Client
api_key: str
agent = Agent(name="weather", model="claude-sonnet-4")
@agent.tool
def get_temperature(ctx: Context, city: str) -> str:
"""Return the current temperature for a city in Celsius."""
resp = ctx.deps.http.get(
"https://api.weather.example/v1/now",
params={"city": city, "key": ctx.deps.api_key},
)
return f"{resp.json()['temp_c']}C"
with httpx.Client() as client:
deps = Deps(http=client, api_key="wx_live_...")
result = agent.run("How warm is it in Lisbon?", deps=deps)
print(result)
# 24C right now in Lisbon.The first positional parameter is annotated Context, so AgentRoute injects it automatically. The model never sees ctx — it only sees the city parameter in the tool schema. See Tools for how the rest of the schema is extracted from your type hints and docstring.
deps is Any, so it can be anything — a dataclass, a Pydantic model, a dict, a connection pool. A frozen dataclass is the recommended shape: it is explicit, typed, and immutable for the duration of the run.
The Context fields
Every tool that takes a Context receives the same live instance for the duration of a run. Here is what each field holds.
| Parameter | Type | Default | Description |
|---|---|---|---|
deps | Any | None | Whatever you passed as deps= to run() / arun(). Read it as ctx.deps. This is the primary dependency-injection slot. |
usage | Usage | Usage() | Cumulative token counts and cost for the current run. Updated after every model call; also returned on result.usage. |
messages | list[dict] | [] | The running transcript for the current run, in OpenAI-compatible message dicts. Reflects everything sent to and returned from the model so far. |
retry | int | 0 | The current retry attempt for the active step, starting at 0. Incremented when a Retry signal is caught. See Errors & retries. |
step | int | 0 | The current turn index in the agentic loop, starting at 0. |
session_id | str | None | None | An optional identifier for the conversation or session, when one has been supplied. |
memory | MemoryProto | None | None | A handle to the agent's memory, if one is attached via Agent(memory=...). Lets tools call ctx.memory.remember(...) and ctx.memory.recall(...). |
The full reference, including imports and the dataclass signature, lives on the Context reference page.
Usage: tokens, cost, and model calls
Usage accumulates the cost of a run. AgentRoute updates ctx.usage after each model call, and the final tally is returned as result.usage.
from agentroute import UsageThe dataclass has four fields, all numeric and additive:
| Parameter | Type | Default | Description |
|---|---|---|---|
input_tokens | int | 0 | Total prompt tokens sent to the model across all turns of the run. |
output_tokens | int | 0 | Total completion tokens returned by the model across all turns. |
total_cost_usd | float | 0.0 | Accumulated cost of the run in US dollars. |
model_calls | int | 0 | Number of model calls made — one per turn of the agentic loop. |
Reading result.usage after a run
The most common thing you do with Usage is read it back off the Result to log cost or enforce your own budgets:
from agentroute import Agent
agent = Agent(name="summarizer", model="claude-sonnet-4")
result = agent.run("Summarize the last quarter's release notes.")
usage = result.usage
print(usage.input_tokens, usage.output_tokens) # 1840 312
print(f"${usage.total_cost_usd:.4f}") # $0.0091
print(usage.model_calls) # 1A single-shot answer is one model_calls. An agent that calls tools and loops will report one model call per turn. To cap spend during the run itself rather than after, set max_cost on the agent — see Errors & retries for ErrorBudget.
Accumulating usage with add()
Usage has one method, add(), which folds another Usage into this one in place. AgentRoute uses it internally to roll each turn's cost into the running total, but it is also handy when you run several agents and want a combined figure:
from agentroute import Agent, Usage
planner = Agent(name="planner", model="claude-sonnet-4")
writer = Agent(name="writer", model="claude-sonnet-4")
total = Usage()
plan = planner.run("Outline a blog post about vector databases.")
total.add(plan.usage)
draft = writer.run(f"Write the post from this outline:\n{plan}")
total.add(draft.usage)
print(f"Combined cost: ${total.total_cost_usd:.4f}, calls: {total.model_calls}")add() mutates the receiver and returns None, so accumulate into a long-lived Usage rather than chaining.
Accessing memory from a tool
When the agent has memory attached, ctx.memory exposes the same async surface inside your tools. A tool can persist a fact for later turns or look one up:
from agentroute import Agent, Context, Memory
agent = Agent(
name="assistant",
model="claude-sonnet-4",
memory=Memory(),
)
@agent.tool
async def save_preference(ctx: Context, key: str, value: str) -> str:
"""Remember a user preference for later in the conversation."""
await ctx.memory.remember(key, value)
return f"Saved {key}."
@agent.tool
async def lookup_preference(ctx: Context, query: str) -> list[str]:
"""Recall previously saved preferences matching a query."""
return await ctx.memory.recall(query, limit=5)Because remember() and recall() are async, declare the tool async def and await them directly. Sync tools work too, but you cannot await from a sync function — reach for memory in async tools. The full memory model, including the persistent MemorySQLite backend, is covered in Memory.
ctx.memory is only populated when the agent was constructed with Agent(memory=...). If no memory is attached it is None, so guard with if ctx.memory is not None: in tools that may run on memory-less agents.
Next steps
How tools are defined, how schemas are extracted, and how Context is injected.
In-RAM and SQLite memory backends, and the remember / recall API.
The full API reference for Context and Usage, with signatures and source.
Budgets, turn limits, and the Retry signal that drives ctx.retry.