Document Q&A
Answer questions and summarize over a document you pass in the prompt, returning a typed answer with confidence and grounded source passages.
This example builds an agent that answers questions about a single document you supply at call time. Instead of free-form prose, it returns a Pydantic model, so every answer carries a confidence score, the exact passages it relied on, and a flag for whether the document actually contains an answer at all.
It demonstrates structured output with output=Answer, reading result.output as a typed instance, and a second variant that reuses the same pattern for summarization.
How it works
You pass the document text directly in the prompt and ask your question alongside it. The model reads both and fills in an Answer model. Because is_answerable and source_passages are part of the schema, the model has to commit to a value for each, which pushes it to ground its response in the document rather than guess from prior knowledge.
Filling source_passages nudges the model to quote the document, and filling is_answerable nudges it to admit when the answer isn't there. In practice this tends to be more reliable than appending "say I don't know if it's not in the text" to the instructions, although neither approach is a guarantee.
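Since the schema only nudges the model toward verbatim quotes, it can be worth checking the returned passages against the document afterwards. The following is a minimal sketch of such a check using only the standard library; the helper name ungrounded_passages is hypothetical and not part of AgentRoute, and it assumes that whitespace differences (such as line wrapping) should not count as a mismatch.

```python
import re


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return re.sub(r"\s+", " ", text).strip()


def ungrounded_passages(document: str, passages: list[str]) -> list[str]:
    """Return the passages that do not appear verbatim in the document.

    Hypothetical helper: an empty passage is treated as ungrounded.
    """
    haystack = _normalize(document)
    missing = []
    for passage in passages:
        needle = _normalize(passage)
        if not needle or needle not in haystack:
            missing.append(passage)
    return missing


if __name__ == "__main__":
    doc = """
    AgentRoute bills per model call at the underlying provider's token rate plus a
    flat 5% routing fee. There is no monthly minimum and no seat-based pricing.
    """
    quoted = [
        "There is no monthly minimum and no seat-based pricing.",
        "AgentRoute offers a 99.9% uptime SLA.",  # not in the document
    ]
    print(ungrounded_passages(doc, quoted))
```

An empty return value means every quoted passage was found in the document. A non-empty one is a signal to treat the answer with suspicion, whatever confidence the model reported.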
The agent
from agentroute import Agent
from pydantic import BaseModel, Field


class Answer(BaseModel):
    answer: str = Field(description="The answer to the question, grounded in the document.")
    confidence: float = Field(description="Confidence from 0.0 to 1.0.")
    source_passages: list[str] = Field(
        default_factory=list,
        description="Verbatim passages from the document that support the answer.",
    )
    is_answerable: bool = Field(
        description="True only if the document actually contains the answer.",
    )


qa_agent = Agent(
    name="doc-qa",
    model="claude-sonnet-4",
    instructions=(
        "You answer questions strictly from the provided document. "
        "Quote the exact passages you used in source_passages. "
        "If the document does not contain the answer, set is_answerable to false "
        "and lower your confidence accordingly."
    ),
    output=Answer,  # parse the reply into an Answer instead of returning text
)

DOCUMENT = """
AgentRoute bills per model call at the underlying provider's token rate plus a
flat 5% routing fee. There is no monthly minimum and no seat-based pricing.
Usage is metered in real time and invoiced at the end of each calendar month.
Volume discounts begin at 50 million tokens per month.
"""


def ask(question: str) -> Answer:
    # The document travels in the prompt itself, next to the question.
    prompt = f"Document:\n{DOCUMENT}\n\nQuestion: {question}"
    result = qa_agent.run(prompt)
    return result.output


if __name__ == "__main__":
    a = ask("How is AgentRoute priced?")
    print(f"Answerable: {a.is_answerable} (confidence {a.confidence:.2f})")
    print(a.answer)
    for passage in a.source_passages:
        print(f" - {passage.strip()}")

Because the agent was created with output=Answer, result.output is an Answer instance, not a string. You get full type-checking and attribute access on the parsed fields. Calling str(result) would still give you the raw output text if you need it.
Run it
export AGENTROUTE_API_KEY="sk-or-..."
python doc_qa.py

Answerable: True (confidence 0.94)
AgentRoute charges per model call at the provider's token rate plus a flat 5% routing fee, with no monthly minimum or seat pricing.
 - AgentRoute bills per model call at the underlying provider's token rate plus a flat 5% routing fee.
 - There is no monthly minimum and no seat-based pricing.

Ask something the document does not cover and the model sets is_answerable to false:

a = ask("What is AgentRoute's uptime SLA?")
assert a.is_answerable is False  # the document never mentions an SLA

A single OpenRouter key (AGENTROUTE_API_KEY or OPENROUTER_API_KEY) works for every model string. Swap model="claude-sonnet-4" for any other supported model without changing the rest of the code. See models for the resolution rules.
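Returning to the is_answerable flag and the confidence score: callers usually want to branch on them rather than print the answer unconditionally. Here is one possible sketch. AnswerLike is a hypothetical dataclass stand-in with the same fields as Answer, so the snippet runs without a model call, and both the 0.7 threshold and the render function are illustrative assumptions rather than anything AgentRoute prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class AnswerLike:
    """Hypothetical stand-in with the same fields as the Answer model."""

    answer: str
    confidence: float
    is_answerable: bool
    source_passages: list[str] = field(default_factory=list)


def render(a: AnswerLike, min_confidence: float = 0.7) -> str:
    """Turn a typed answer into user-facing text, refusing when ungrounded."""
    if not a.is_answerable:
        # The model reported that the document does not contain the answer.
        return "The document does not cover this question."
    if a.confidence < min_confidence:
        # Answerable, but flag the low self-reported confidence to the reader.
        return f"Low-confidence answer ({a.confidence:.2f}): {a.answer}"
    return a.answer


if __name__ == "__main__":
    print(render(AnswerLike("Per call, plus a flat 5% routing fee.", 0.94, True)))
    print(render(AnswerLike("", 0.10, False)))
```

Because render reads only attributes, an Answer instance returned by ask(...) should work in place of AnswerLike. The confidence value is the model's own estimate, so the threshold is best tuned against real questions.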
Summarize variant
The same structured-output pattern works for summarization: define a model for the shape you want, then read result.output. Here the agent produces a short summary plus a bulleted list of key points.
from agentroute import Agent
from pydantic import BaseModel, Field

# Reuse the sample document; assumes the Q&A example above is saved as doc_qa.py.
from doc_qa import DOCUMENT


class Summary(BaseModel):
    summary: str = Field(description="A two-sentence summary of the document.")
    key_points: list[str] = Field(description="The most important points, as short bullets.")


summarizer = Agent(
    name="doc-summarizer",
    model="claude-sonnet-4",
    instructions="Summarize the document faithfully. Do not add facts that are not present.",
    output=Summary,
)


def summarize(document: str) -> Summary:
    result = summarizer.run(f"Summarize this document:\n\n{document}")
    return result.output


if __name__ == "__main__":
    s = summarize(DOCUMENT)
    print(s.summary)
    for point in s.key_points:
        print(f" - {point}")

For very long documents, see history for compaction strategies, or split the source into chunks and run the summarizer over each chunk with asyncio.gather on summarizer.arun(...).