Models
Reference for resolve_model, the Model protocol, and the Message, LLMRequest, and LLMResponse message types.
The models layer turns a model string like "claude-sonnet-4" into a concrete provider client. Most of the time you never touch it directly — you pass a string to Agent(model=...) and the agent calls resolve_model for you. Reach into this module when you want to build a provider yourself, inspect the transcript types, or point an agent at a custom endpoint.
For the conceptual overview of model strings, routing, and API keys, see Models.
resolve_model
resolve_model resolves a model string to a concrete Model.
from agentroute import resolve_model

def resolve_model(
    model: str,
    *,
    api_key: str | None = None,
    base_url: str | None = None,
) -> Model

| Parameter | Type | Default | Description |
|---|---|---|---|
| model (required) | str | — | The model string to resolve. Either a routing string (see resolution rules below) or a bare name like claude-sonnet-4. |
| api_key | str \| None | None | Explicit API key. Takes priority over every other source. When omitted, the key is looked up from Config (env vars, then ~/.agentroute/config). |
| base_url | str \| None | None | Override the resolved base URL. Useful for proxies or self-hosted gateways while keeping the model id and key resolution intact. |
Returns: a provider that satisfies the Model protocol.
Resolution rules
resolve_model inspects the model string in order and picks the first rule that matches:
- If it starts with http:// or https://, it is treated as a custom OpenAI-compatible endpoint. The string is split on the last /: everything before it is the base URL, and the final segment is the model id.
- If it starts with ollama/, it is routed to http://localhost:11434/v1 with no real auth (a placeholder "ollama" key is used).
- Otherwise, it is routed to OpenRouter at https://openrouter.ai/api/v1. One OpenRouter key works for every model.
For an OpenRouter string with no /, a vendor prefix is inferred from the name so claude-sonnet-4 becomes anthropic/claude-sonnet-4:
| Name prefix | Inferred vendor |
|---|---|
| claude | anthropic |
| gpt, o1, o3, o4 | openai |
| gemini | google |
| llama | meta-llama |
| deepseek | deepseek |
| mistral | mistralai |
If the name already contains a / (for example anthropic/claude-sonnet-4), it is passed through to OpenRouter unchanged.
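The resolution rules and the vendor table can be read as a small pure function. The following is only a sketch of that reading, not the library's implementation: sketch_route and the constant names are hypothetical, and two details are assumptions the text does not state, namely that the ollama/ prefix is stripped from the model id and that a bare name matching no prefix is passed through unchanged.

```python
from __future__ import annotations

OPENROUTER_URL = "https://openrouter.ai/api/v1"
OLLAMA_URL = "http://localhost:11434/v1"

# Name prefix -> inferred vendor, copied from the table above.
VENDOR_PREFIXES = {
    "claude": "anthropic",
    "gpt": "openai",
    "o1": "openai",
    "o3": "openai",
    "o4": "openai",
    "gemini": "google",
    "llama": "meta-llama",
    "deepseek": "deepseek",
    "mistral": "mistralai",
}


def sketch_route(model: str) -> tuple[str, str]:
    """Return (base_url, model_id) for a model string. Hypothetical helper."""
    # Rule 1: a full URL is a custom endpoint; split on the last "/".
    if model.startswith(("http://", "https://")):
        base_url, _, model_id = model.rpartition("/")
        return base_url, model_id
    # Rule 2: ollama/ strings go to the local Ollama server.
    # Assumption: the "ollama/" prefix is dropped from the model id.
    if model.startswith("ollama/"):
        return OLLAMA_URL, model.removeprefix("ollama/")
    # Rule 3: everything else goes to OpenRouter. A bare name gets a
    # vendor prefix; a name that already contains "/" is left alone.
    if "/" not in model:
        for prefix, vendor in VENDOR_PREFIXES.items():
            if model.startswith(prefix):
                return OPENROUTER_URL, f"{vendor}/{model}"
    # Assumption: unmatched bare names are passed through unchanged.
    return OPENROUTER_URL, model
```

Under these assumptions, sketch_route("claude-sonnet-4") yields the OpenRouter URL with the id anthropic/claude-sonnet-4, and a URL string yields its own base URL with the last path segment as the id.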
The key is chosen from the first available source: the explicit api_key= argument, then Config().api_key(kind=...) which reads env vars (AGENTROUTE_API_KEY or OPENROUTER_API_KEY) and then ~/.agentroute/config, then None for no-auth endpoints like Ollama.
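The lookup order above amounts to "first non-empty source wins". A minimal sketch follows, assuming AGENTROUTE_API_KEY is checked before OPENROUTER_API_KEY (the text lists both but does not state their relative order). sketch_pick_key is a hypothetical helper, and the config-file value is passed in as a plain argument because the file format is not described here.

```python
from __future__ import annotations

from collections.abc import Mapping


def sketch_pick_key(
    explicit: str | None,
    env: Mapping[str, str],
    config_file_key: str | None,
) -> str | None:
    """Return the first available key, or None for no-auth endpoints."""
    # 1. The explicit api_key= argument always wins.
    if explicit:
        return explicit
    # 2. Environment variables.
    # Assumption: AGENTROUTE_API_KEY is consulted before OPENROUTER_API_KEY.
    for name in ("AGENTROUTE_API_KEY", "OPENROUTER_API_KEY"):
        if env.get(name):
            return env[name]
    # 3. The value read from ~/.agentroute/config, which may itself be None.
    return config_file_key
```

In real use the env argument would be os.environ; passing a mapping keeps the sketch easy to exercise without touching the process environment.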
Model
Model is the minimal async provider protocol. It is runtime_checkable, so any object with these three methods satisfies isinstance(obj, Model). Implement it to plug in a provider the resolver does not cover.
from agentroute import Model

@runtime_checkable
class Model(Protocol):
    async def complete(self, request: LLMRequest) -> LLMResponse: ...
    def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]: ...
    async def close(self) -> None: ...

complete
async def complete(self, request: LLMRequest) -> LLMResponse
Send one request and await the full response. This is the method the agentic loop calls each turn.
| Parameter | Type | Default | Description |
|---|---|---|---|
| request (required) | LLMRequest | — | The request to send: model id, messages, optional tool schemas, and sampling params. |
complete_stream
def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]
Stream the response incrementally as LLMChunk deltas. Each chunk carries a text delta, a tool_call_delta, or a done flag.

| Parameter | Type | Default | Description |
|---|---|---|---|
| request (required) | LLMRequest | — | The request to stream. |
A high-level agent.stream() API will be added in a later phase. The provider-level complete_stream method exists today, but streaming through the agent is not yet available.
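Until agent-level streaming exists, a caller would consume complete_stream directly with async for. The sketch below shows one way to accumulate text deltas until the done chunk arrives. It does not import agentroute: LLMChunk here is a local stand-in whose field names (text, tool_call_delta, done) are assumptions based on the description above, and fake_stream is a hypothetical generator standing in for model.complete_stream(request).

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass


@dataclass
class LLMChunk:
    """Local stand-in; the field names are assumptions."""
    text: str | None = None
    tool_call_delta: dict | None = None
    done: bool = False


async def fake_stream() -> AsyncIterator[LLMChunk]:
    """Stands in for model.complete_stream(request)."""
    for piece in ("Par", "is"):
        yield LLMChunk(text=piece)
    yield LLMChunk(done=True)


async def collect_text(stream: AsyncIterator[LLMChunk]) -> str:
    """Concatenate text deltas until a chunk with done=True arrives."""
    parts: list[str] = []
    async for chunk in stream:
        if chunk.done:
            break
        if chunk.text:
            parts.append(chunk.text)
    return "".join(parts)


text = asyncio.run(collect_text(fake_stream()))
```

Tool-call deltas are ignored here; a fuller consumer would also merge tool_call_delta fragments before acting on them.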
close
async def close(self) -> None
Release any underlying HTTP client or connection pool. Call it when you are done with a provider you constructed by hand.
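To make the protocol concrete, here is a minimal sketch of a provider that satisfies it. Everything is defined locally so the block runs without agentroute installed: the dataclasses and the Model protocol are trimmed stand-ins that mirror this page, LLMChunk's fields are assumed, and EchoModel is a hypothetical provider that simply echoes the last message. Note that runtime_checkable makes isinstance check only that the three methods exist, not that their signatures match.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from typing import Protocol, runtime_checkable


# Local stand-ins mirroring the reference on this page (trimmed).
@dataclass
class Message:
    role: str
    content: str | None = None


@dataclass
class LLMRequest:
    model: str
    messages: list[Message]


@dataclass
class LLMResponse:
    content: str | None
    tool_calls: list[dict] = field(default_factory=list)
    usage: dict = field(default_factory=dict)
    model: str | None = None


@dataclass
class LLMChunk:  # field names are assumptions
    text: str | None = None
    done: bool = False


@runtime_checkable
class Model(Protocol):
    async def complete(self, request: LLMRequest) -> LLMResponse: ...
    def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]: ...
    async def close(self) -> None: ...


class EchoModel:
    """Hypothetical provider that echoes the last message back."""

    def __init__(self) -> None:
        self.closed = False

    async def complete(self, request: LLMRequest) -> LLMResponse:
        return LLMResponse(content=request.messages[-1].content, model=request.model)

    async def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]:
        # An async generator: calling it returns an async iterator directly.
        yield LLMChunk(text=request.messages[-1].content)
        yield LLMChunk(done=True)

    async def close(self) -> None:
        self.closed = True


async def demo() -> tuple[str | None, bool, EchoModel]:
    model = EchoModel()
    try:
        request = LLMRequest(model="echo", messages=[Message("user", "hi")])
        response = await model.complete(request)
        return response.content, isinstance(model, Model), model
    finally:
        # Release resources even if the request fails.
        await model.close()


reply, is_model, used_model = asyncio.run(demo())
```

The try/finally around the request is one way to guarantee close is called for a hand-built provider.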
Message types
These dataclasses make up the OpenAI-compatible transcript that flows between the agent and a provider. Result.messages is a list of these serialized to dicts.
Message
Message is a single conversation message.
from agentroute import Message

@dataclass
class Message:
    role: str
    content: str | None = None
    tool_calls: list[dict] | None = None
    tool_call_id: str | None = None
    name: str | None = None

| Parameter | Type | Default | Description |
|---|---|---|---|
| role (required) | str | — | One of "system", "user", "assistant", or "tool". |
| content | str \| None | None | The text body. None on an assistant message that only contains tool calls. |
| tool_calls | list[dict] \| None | None | Tool calls requested by an assistant message, in OpenAI tool-call shape. |
| tool_call_id | str \| None | None | On a "tool" message, the id of the tool call this is a result for. |
| name | str \| None | None | Optional name, e.g. the tool name on a tool-result message. |
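The interplay of tool_calls and tool_call_id is easiest to see in a short transcript. The sketch below uses a local stand-in for Message with the fields listed above, a hypothetical get_weather tool, and dataclasses.asdict as one possible way to obtain dicts; how the library itself serializes Result.messages is not stated here.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass


@dataclass
class Message:
    """Local stand-in mirroring the Message fields listed above."""
    role: str
    content: str | None = None
    tool_calls: list[dict] | None = None
    tool_call_id: str | None = None
    name: str | None = None


# OpenAI tool-call shape: the arguments are a JSON-encoded string.
# get_weather is a hypothetical tool used only for illustration.
call = {
    "id": "call_1",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": json.dumps({"city": "Paris"}),
    },
}

transcript = [
    Message(role="system", content="You are a helpful assistant."),
    Message(role="user", content="What's the weather in Paris?"),
    # Assistant turn that only requests a tool: content stays None.
    Message(role="assistant", tool_calls=[call]),
    # Tool result, linked back to the request by tool_call_id.
    Message(role="tool", content="18 C, clear", tool_call_id="call_1", name="get_weather"),
]

# One possible dict serialization of the transcript.
as_dicts = [asdict(m) for m in transcript]
```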
LLMRequest
LLMRequest is one call to a provider.
from agentroute import LLMRequest

@dataclass
class LLMRequest:
    model: str
    messages: list[Message]
    tools: list[dict] | None = None
    temperature: float | None = None
    max_tokens: int | None = None
    extra: dict = field(default_factory=dict)

| Parameter | Type | Default | Description |
|---|---|---|---|
| model (required) | str | — | The resolved model id to call. |
| messages (required) | list[Message] | — | The conversation transcript to send. |
| tools | list[dict] \| None | None | Tool schemas exposed to the model this turn, in OpenAI tools shape. |
| temperature | float \| None | None | Sampling temperature. None leaves the provider default. |
| max_tokens | int \| None | None | Maximum tokens to generate. None leaves the provider default. |
| extra | dict | {} | Provider-specific passthrough parameters. |
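One way a provider might turn these fields into an OpenAI-compatible request body is sketched below. This is an illustration of the "None leaves the provider default" rule, not the library's serializer: sketch_payload is a hypothetical helper, the dataclasses are local stand-ins, and merging extra last (so it can override other keys) is an assumption.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class Message:  # trimmed local stand-in
    role: str
    content: str | None = None


@dataclass
class LLMRequest:  # local stand-in mirroring the fields above
    model: str
    messages: list[Message]
    tools: list[dict] | None = None
    temperature: float | None = None
    max_tokens: int | None = None
    extra: dict = field(default_factory=dict)


def sketch_payload(request: LLMRequest) -> dict:
    """Build a chat-completions style body. Hypothetical helper."""
    payload: dict = {
        "model": request.model,
        "messages": [asdict(m) for m in request.messages],
    }
    # None means "leave the provider default", so the key is omitted.
    for key in ("tools", "temperature", "max_tokens"):
        value = getattr(request, key)
        if value is not None:
            payload[key] = value
    # Assumption: extra keys are merged in last and may override the above.
    payload.update(request.extra)
    return payload


request = LLMRequest(
    model="anthropic/claude-sonnet-4",
    messages=[Message("user", "Hello")],
    temperature=0.2,
    extra={"top_p": 0.9},
)
payload = sketch_payload(request)
```

Here max_tokens and tools are absent from the payload because they were left as None, while top_p arrives through extra.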
LLMResponse
LLMResponse is one reply from a provider.
from agentroute import LLMResponse

@dataclass
class LLMResponse:
    content: str | None
    tool_calls: list[dict] = field(default_factory=list)
    usage: dict = field(default_factory=dict)
    model: str | None = None

| Parameter | Type | Default | Description |
|---|---|---|---|
| content (required) | str \| None | — | The text reply. None when the model returned only tool calls. |
| tool_calls | list[dict] | [] | Tool calls the model wants executed, in OpenAI tool-call shape. |
| usage | dict | {} | Raw token/cost usage reported by the provider. |
| model | str \| None | None | The model id the provider actually served. |
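A caller typically branches on whether tool_calls is empty. The sketch below shows that branch with a local stand-in for LLMResponse. sketch_next_step and the get_weather tool are hypothetical, and the agent's real loop may differ.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class LLMResponse:
    """Local stand-in mirroring the LLMResponse fields listed above."""
    content: str | None
    tool_calls: list[dict] = field(default_factory=list)
    usage: dict = field(default_factory=dict)
    model: str | None = None


def sketch_next_step(response: LLMResponse) -> tuple[str, object]:
    """Return ("tools", [(name, args), ...]) or ("final", text)."""
    if response.tool_calls:
        # OpenAI tool-call shape: arguments arrive as a JSON string.
        calls = [
            (c["function"]["name"], json.loads(c["function"]["arguments"]))
            for c in response.tool_calls
        ]
        return "tools", calls
    # No tool calls: treat the text as the final answer.
    return "final", response.content or ""


tool_turn = LLMResponse(
    content=None,
    tool_calls=[{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
)
final_turn = LLMResponse(content="It is clear in Paris.")
```

Calling sketch_next_step(tool_turn) yields the decoded tool name and arguments, while sketch_next_step(final_turn) yields the reply text.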
Examples
Resolve an OpenRouter model from a bare name. The vendor prefix is inferred and the key comes from AGENTROUTE_API_KEY.
from agentroute import resolve_model

# claude-sonnet-4 -> anthropic/claude-sonnet-4 on OpenRouter
model = resolve_model("claude-sonnet-4")

Resolve a local Ollama model. No key is required.

from agentroute import resolve_model

# routed to http://localhost:11434/v1
model = resolve_model("ollama/llama3.1")

Resolve a custom OpenAI-compatible endpoint. The last path segment is the model id.

from agentroute import resolve_model

# base_url -> https://gateway.internal/v1, model_id -> my-model
model = resolve_model(
    "https://gateway.internal/v1/my-model",
    api_key="sk-internal-...",
)

In practice you rarely call resolve_model yourself: Agent does it from the model, api_key, and base_url you pass.

from agentroute import Agent
agent = Agent(name="assistant", model="claude-sonnet-4")
result = agent.run("What's the capital of France?")
print(result) # Paris