Models

Reference for resolve_model, the Model protocol, and the Message, LLMRequest, and LLMResponse message types.


The models layer turns a model string like "claude-sonnet-4" into a concrete provider client. Most of the time you never touch it directly — you pass a string to Agent(model=...) and the agent calls resolve_model for you. Reach into this module when you want to build a provider yourself, inspect the transcript types, or point an agent at a custom endpoint.

For the conceptual overview of model strings, routing, and API keys, see Models.

resolve_model

resolve_model resolves a model string to a concrete Model.

from agentroute import resolve_model

def resolve_model(
    model: str,
    *,
    api_key: str | None = None,
    base_url: str | None = None,
) -> Model

Parameters:

  model (str, required): The model string to resolve. Either a routing string (see resolution rules below) or a bare name like claude-sonnet-4.
  api_key (str | None, default None): Explicit API key. Takes priority over every other source. When omitted, the key is looked up from Config (env vars, then ~/.agentroute/config).
  base_url (str | None, default None): Override the resolved base URL. Useful for proxies or self-hosted gateways while keeping the model id and key resolution intact.

Returns: Model. A ready-to-use provider client implementing the Model protocol.

Resolution rules

resolve_model inspects the model string in order and picks the first rule that matches:

  1. Starts with http:// or https:// — a custom OpenAI-compatible endpoint. The string is split on the last /: everything before is the base URL, the final segment is the model id.
  2. Starts with ollama/ — routed to http://localhost:11434/v1 with no real auth (a placeholder "ollama" key is used).
  3. Otherwise — routed to OpenRouter at https://openrouter.ai/api/v1. One OpenRouter key works for every model.
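
The three rules can be sketched as a small pure function. This is an illustration of the documented matching order, not the library's implementation: Route and sketch_route are hypothetical names, and stripping the ollama/ prefix from the model id is an assumption. Vendor-prefix inference for OpenRouter names is left out here.

```python
from typing import NamedTuple

OLLAMA_BASE_URL = "http://localhost:11434/v1"
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


class Route(NamedTuple):
    """Hypothetical result type: where a model string would be sent."""
    kind: str
    base_url: str
    model_id: str


def sketch_route(model: str) -> Route:
    """Apply the three documented rules in order; the first match wins."""
    # Rule 1: a full URL is split on its last "/".
    if model.startswith(("http://", "https://")):
        base_url, _, model_id = model.rpartition("/")
        return Route("custom", base_url, model_id)
    # Rule 2: the ollama/ prefix routes to the local Ollama server.
    # Assumption: the prefix is dropped from the model id.
    if model.startswith("ollama/"):
        return Route("ollama", OLLAMA_BASE_URL, model.removeprefix("ollama/"))
    # Rule 3: everything else goes to OpenRouter.
    return Route("openrouter", OPENROUTER_BASE_URL, model)
```

Because the URL check comes first, a string such as https://gateway.internal/v1/my-model never reaches the OpenRouter rule.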

For an OpenRouter string with no /, a vendor prefix is inferred from the name so claude-sonnet-4 becomes anthropic/claude-sonnet-4:

  Name prefix        Inferred vendor
  claude             anthropic
  gpt, o1, o3, o4    openai
  gemini             google
  llama              meta-llama
  deepseek           deepseek
  mistral            mistralai

If the name already contains a / (for example anthropic/claude-sonnet-4), it is passed through to OpenRouter unchanged.
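
A minimal sketch of the inference step, assuming a simple case-sensitive prefix match. The function name sketch_infer_vendor is hypothetical, and the behaviour for names that match no prefix is not documented above, so returning them unchanged is an assumption.

```python
# Prefix -> vendor pairs taken from the table above.
VENDOR_PREFIXES = {
    "claude": "anthropic",
    "gpt": "openai",
    "o1": "openai",
    "o3": "openai",
    "o4": "openai",
    "gemini": "google",
    "llama": "meta-llama",
    "deepseek": "deepseek",
    "mistral": "mistralai",
}


def sketch_infer_vendor(name: str) -> str:
    """Return an OpenRouter-style id, adding a vendor only when none is present."""
    if "/" in name:
        return name  # already qualified: passed through unchanged
    for prefix, vendor in VENDOR_PREFIXES.items():
        if name.startswith(prefix):
            return f"{vendor}/{name}"
    return name  # assumption: unknown names are left as-is
```

When in doubt about how a name will be inferred, passing the fully qualified vendor/name form avoids the question entirely.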

API-key resolution order

The key is chosen from the first available source:

  1. The explicit api_key= argument.
  2. Config().api_key(kind=...), which reads env vars (AGENTROUTE_API_KEY or OPENROUTER_API_KEY) and then ~/.agentroute/config.
  3. None, for no-auth endpoints like Ollama.
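
The lookup order can be sketched as follows. This is not the Config implementation: sketch_resolve_key is a hypothetical helper, the config-file value is passed in as a plain argument instead of being read from disk, and checking AGENTROUTE_API_KEY before OPENROUTER_API_KEY is an assumption, since the relative order of the two variables is not stated above.

```python
import os
from typing import Mapping, Optional


def sketch_resolve_key(
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    config_file_key: Optional[str] = None,
) -> Optional[str]:
    """Return the first key found: explicit argument, env var, config file."""
    env = os.environ if env is None else env
    # 1. An explicit api_key= argument always wins.
    if explicit is not None:
        return explicit
    # 2. Environment variables (order between the two is assumed).
    for var in ("AGENTROUTE_API_KEY", "OPENROUTER_API_KEY"):
        if env.get(var):
            return env[var]
    # 3. Stand-in for the value stored in ~/.agentroute/config;
    #    None when nothing is configured.
    return config_file_key
```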

Model

Model is the minimal async provider protocol. It is runtime_checkable, so any object that defines these three methods satisfies isinstance(obj, Model). The check only confirms that the methods exist, not that their signatures match. Implement the protocol to plug in a provider the resolver does not cover.

from agentroute import Model

@runtime_checkable
class Model(Protocol):
    async def complete(self, request: LLMRequest) -> LLMResponse: ...
    def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]: ...
    async def close(self) -> None: ...
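
As an illustration of implementing the protocol, here is a minimal sketch of a provider that echoes the last message back. So that it runs on its own, it defines trimmed local stand-ins for Model, Message, LLMRequest, LLMResponse, and LLMChunk; in real code these would be imported from agentroute. EchoModel is hypothetical, and the LLMChunk field names are assumptions.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Optional, Protocol, runtime_checkable


# Trimmed local stand-ins mirroring the documented types.
@dataclass
class Message:
    role: str
    content: Optional[str] = None


@dataclass
class LLMRequest:
    model: str
    messages: list[Message]


@dataclass
class LLMResponse:
    content: Optional[str]
    tool_calls: list[dict] = field(default_factory=list)


@dataclass
class LLMChunk:  # field names are assumed
    text: str = ""
    done: bool = False


@runtime_checkable
class Model(Protocol):
    async def complete(self, request: LLMRequest) -> LLMResponse: ...
    def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]: ...
    async def close(self) -> None: ...


class EchoModel:
    """Hypothetical provider that echoes the last message back."""

    async def complete(self, request: LLMRequest) -> LLMResponse:
        return LLMResponse(content=request.messages[-1].content)

    async def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]:
        # An async generator: calling it returns an async iterator directly.
        yield LLMChunk(text=request.messages[-1].content or "")
        yield LLMChunk(done=True)

    async def close(self) -> None:
        pass  # nothing to release in this sketch


async def demo() -> Optional[str]:
    model = EchoModel()
    # Structural check: passes because the three methods exist.
    assert isinstance(model, Model)
    request = LLMRequest(model="echo", messages=[Message("user", "hello")])
    response = await model.complete(request)
    await model.close()
    return response.content
```

EchoModel never inherits from Model; it satisfies the protocol purely by shape, which is what makes it possible to wrap an existing client without touching its class hierarchy.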

complete

async def complete(self, request: LLMRequest) -> LLMResponse

Send one request and await the full response. This is the method the agentic loop calls each turn.

Parameters:

  request (LLMRequest, required): The request to send: model id, messages, optional tool schemas, and sampling params.

Returns: LLMResponse. The model's reply: text content, any tool calls, and usage.

complete_stream

def complete_stream(self, request: LLMRequest) -> AsyncIterator[LLMChunk]

Stream the response incrementally as LLMChunk deltas. Each chunk carries a text delta, a tool_call_delta, or a done flag.

Parameters:

  request (LLMRequest, required): The request to stream.

Returns: AsyncIterator[LLMChunk]. An async iterator of incremental chunks.

Note: streaming is forthcoming.

A high-level agent.stream() API will be wired in during a later phase. The provider-level complete_stream method exists today, but streaming through the agent is not yet available.
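
At the provider level, a stream is consumed with async for. The sketch below shows one way to collect text deltas until the done flag arrives. LLMChunk here is a local stand-in whose field names (text, tool_call_delta, done) are assumptions, and fake_stream is a hypothetical replacement for a real provider.complete_stream(request) call.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class LLMChunk:
    """Stand-in chunk; the field names here are assumptions."""
    text: str = ""
    tool_call_delta: Optional[dict] = None
    done: bool = False


async def fake_stream() -> AsyncIterator[LLMChunk]:
    """Hypothetical stand-in for provider.complete_stream(request)."""
    for piece in ("Par", "is"):
        yield LLMChunk(text=piece)
    yield LLMChunk(done=True)


async def collect_text(stream: AsyncIterator[LLMChunk]) -> str:
    """Concatenate text deltas until the done flag arrives."""
    parts: list[str] = []
    async for chunk in stream:
        if chunk.done:
            break
        parts.append(chunk.text)
    return "".join(parts)
```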

close

async def close(self) -> None

Release any underlying HTTP client or connection pool. Call it when you are done with a provider you constructed by hand.
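
One way to make sure close runs even when a call fails is a try/finally block. The sketch below uses a hypothetical FakeProvider that always raises, purely to show that the cleanup still happens; it is not part of agentroute.

```python
import asyncio


class FakeProvider:
    """Hypothetical provider that records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    async def complete(self, request: object) -> str:
        raise RuntimeError("simulated network failure")

    async def close(self) -> None:
        self.closed = True


async def call_and_close(provider: FakeProvider) -> None:
    try:
        await provider.complete(request=None)
    except RuntimeError:
        pass  # a real caller would log or re-raise
    finally:
        await provider.close()  # runs on success and on failure
```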

Message types

These dataclasses make up the OpenAI-compatible transcript that flows between the agent and a provider. Result.messages is a list of these serialized to dicts.

Message

Message is a single conversation message.

from agentroute import Message

@dataclass
class Message:
    role: str
    content: str | None = None
    tool_calls: list[dict] | None = None
    tool_call_id: str | None = None
    name: str | None = None

Fields:

  role (str, required): One of "system", "user", "assistant", or "tool".
  content (str | None, default None): The text body. None on an assistant message that only contains tool calls.
  tool_calls (list[dict] | None, default None): Tool calls requested by an assistant message, in OpenAI tool-call shape.
  tool_call_id (str | None, default None): On a "tool" message, the id of the tool call this is a result for.
  name (str | None, default None): Optional name, e.g. the tool name on a tool-result message.
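
To show how the fields fit together, here is a sketch of a three-message transcript with one tool round trip. Message is redefined locally with the documented fields so the block runs without agentroute. The get_weather tool and its values are hypothetical, and dataclasses.asdict is just one way to produce plain dicts; how Result.messages is actually serialized is not shown in this reference.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Message:
    """Local copy of the documented fields."""
    role: str
    content: Optional[str] = None
    tool_calls: Optional[list[dict]] = None
    tool_call_id: Optional[str] = None
    name: Optional[str] = None


# Hypothetical tool call in the OpenAI tool-call shape.
call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}

transcript = [
    Message(role="user", content="What's the weather in Paris?"),
    # content stays None: the assistant only requested a tool call.
    Message(role="assistant", tool_calls=[call]),
    # tool_call_id ties the result back to the request above.
    Message(role="tool", content="18C, sunny", tool_call_id="call_1", name="get_weather"),
]

# Plain dicts, e.g. for logging or JSON encoding.
as_dicts = [asdict(m) for m in transcript]
```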

LLMRequest

LLMRequest is one call to a provider.

from agentroute import LLMRequest

@dataclass
class LLMRequest:
    model: str
    messages: list[Message]
    tools: list[dict] | None = None
    temperature: float | None = None
    max_tokens: int | None = None
    extra: dict = field(default_factory=dict)

Fields:

  model (str, required): The resolved model id to call.
  messages (list[Message], required): The conversation transcript to send.
  tools (list[dict] | None, default None): Tool schemas exposed to the model this turn, in OpenAI tools shape.
  temperature (float | None, default None): Sampling temperature. None leaves the provider default.
  max_tokens (int | None, default None): Maximum tokens to generate. None leaves the provider default.
  extra (dict, default {}): Provider-specific passthrough parameters.

LLMResponse

LLMResponse is one reply from a provider.

from agentroute import LLMResponse

@dataclass
class LLMResponse:
    content: str | None
    tool_calls: list[dict] = field(default_factory=list)
    usage: dict = field(default_factory=dict)
    model: str | None = None

Fields:

  content (str | None, required): The text reply. None when the model returned only tool calls.
  tool_calls (list[dict], default []): Tool calls the model wants executed, in OpenAI tool-call shape.
  usage (dict, default {}): Raw token/cost usage reported by the provider.
  model (str | None, default None): The model id the provider actually served.
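
A caller typically branches on whether the reply carries tool calls or text. The sketch below illustrates that branch with a local copy of LLMResponse; describe is a hypothetical helper, and treating tool calls as taking priority over text is an assumption about how a loop might use the response, not documented behaviour.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LLMResponse:
    """Local copy of the documented fields."""
    content: Optional[str]
    tool_calls: list[dict] = field(default_factory=list)
    usage: dict = field(default_factory=dict)
    model: Optional[str] = None


def describe(response: LLMResponse) -> str:
    """Summarise a reply: pending tool calls are reported before text."""
    if response.tool_calls:
        names = [c["function"]["name"] for c in response.tool_calls]
        return "tools: " + ", ".join(names)
    # content may be None, so fall back to an empty string.
    return "text: " + (response.content or "")
```

Because tool_calls and usage use default_factory, each response gets its own empty list and dict, so mutating one response never leaks into another.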

Examples

Resolve an OpenRouter model from a bare name. The vendor prefix is inferred, and the key is looked up from the environment (AGENTROUTE_API_KEY or OPENROUTER_API_KEY) or from ~/.agentroute/config.

from agentroute import resolve_model
 
# claude-sonnet-4 -> anthropic/claude-sonnet-4 on OpenRouter
model = resolve_model("claude-sonnet-4")

Resolve a local Ollama model. No key is required.

from agentroute import resolve_model
 
# routed to http://localhost:11434/v1
model = resolve_model("ollama/llama3.1")

Resolve a custom OpenAI-compatible endpoint. The last path segment is the model id.

from agentroute import resolve_model
 
# base_url -> https://gateway.internal/v1, model_id -> my-model
model = resolve_model(
    "https://gateway.internal/v1/my-model",
    api_key="sk-internal-...",
)

In practice you rarely call resolve_model yourself — Agent does it from the model, api_key, and base_url you pass.

from agentroute import Agent
 
agent = Agent(name="assistant", model="claude-sonnet-4")
result = agent.run("What's the capital of France?")
print(result)  # Paris

Source

models/__init__.py