Instrument AI Agents

Learn how to manually instrument your code to use Sentry's Agents module.

With Sentry AI Agent Monitoring, you can monitor and debug your AI systems with full-stack context. You'll be able to track key insights like token usage, latency, tool usage, and error rates. AI Agent Monitoring data will be fully connected to your other Sentry data like logs, errors, and traces.

Sentry Agent Skills

Install Sentry's agent skills to teach your AI coding assistant how to set up AI Agent Monitoring in your Python application.

Copied

npx @sentry/dotagents add getsentry/sentry-agent-skills --name sentry-setup-ai-monitoring

See the full list of available skills and installation docs for more details.

As a prerequisite to setting up AI Agent Monitoring with Python, you'll need to first set up tracing. Once this is done, the Python SDK will automatically instrument AI agents created with supported libraries. If that doesn't fit your use case, you can use custom instrumentation described below.

The Python SDK supports automatic instrumentation for some AI libraries. We recommend adding their integrations to your Sentry configuration to automatically capture spans for AI agents.

For your AI agents data to show up in the Sentry AI Agents Insights, at least one of the AI spans needs to be created and have well-defined names and data attributes. See below.

The @sentry_sdk.trace() decorator can also be used to create these spans.

This span represents a request to a LLM model or service that generates a response based on the input prompt.

AI Request span attributes

The @sentry_sdk.trace() decorator can also be used to create this span.

The span op MUST be "gen_ai.{gen_ai.operation.name}". (e.g. "gen_ai.request")
The span name SHOULD be {gen_ai.operation.name} {gen_ai.request.model}". (e.g. "chat o3-mini")
All Common Span Attributes SHOULD be set (all required common attributes MUST be set).

Additional attributes on the span:

Data Attribute	Type	Requirement Level	Description	Example
`gen_ai.request.available_tools`	string	optional	List of objects describing the available tools. [0]	`"[{\"name\": \"random_number\", \"description\": \"...\"}, {\"name\": \"query_db\", \"description\": \"...\"}]"`
`gen_ai.request.frequency_penalty`	float	optional	Model configuration parameter.	`0.5`
`gen_ai.request.max_tokens`	int	optional	Model configuration parameter.	`500`
`gen_ai.request.messages`	string	optional	List of objects describing the messages (prompts) sent to the LLM [0], [1]	`"[{\"role\": \"system\", \"content\": [{...}]}, {\"role\": \"system\", \"content\": [{...}]}]"`
`gen_ai.request.presence_penalty`	float	optional	Model configuration parameter.	`0.5`
`gen_ai.request.temperature`	float	optional	Model configuration parameter.	`0.1`
`gen_ai.request.top_p`	float	optional	Model configuration parameter.	`0.7`
`gen_ai.response.tool_calls`	string	optional	The tool calls in the model's response. [0]	`"[{\"name\": \"random_number\", \"type\": \"function_call\", \"arguments\": \"...\"}]"`
`gen_ai.response.text`	string	optional	The text representation of the model's responses. [0]	`"[\"The weather in Paris is rainy\", \"The weather in London is sunny\"]"`
`gen_ai.usage.input_tokens.cache_write`	int	optional	The number of tokens written to the cache when processing the AI input (prompt).	`100`
`gen_ai.usage.input_tokens.cached`	int	optional	The number of cached tokens used in the AI input (prompt)	`50`
`gen_ai.usage.input_tokens`	int	optional	The number of tokens used in the AI input (prompt).	`10`
`gen_ai.usage.output_tokens.reasoning`	int	optional	The number of tokens used for reasoning.	`30`
`gen_ai.usage.output_tokens`	int	optional	The number of tokens used in the AI response.	`100`
`gen_ai.usage.total_tokens`	int	optional	The total number of tokens used to process the prompt. (input and output)	`190`

[0]: Span attributes only allow primitive data types. This means you need to use a stringified version of a list of dictionaries. Do NOT set [{"foo": "bar"}] but rather the string "[{\"foo\": \"bar\"}]".
[1]: Each message item uses the format {role:"", content:""}. The role can be "user", "assistant", or "system". The content can be either a string or a list of dictionaries.

Copied

import sentry_sdk

messages = [{"role": "user", "content": "Tell me a joke"}]

with sentry_sdk.start_span(op="gen_ai.request", name="chat o3-mini") as span:
    span.set_data("gen_ai.request.model", "o3-mini")
    span.set_data("gen_ai.request.messages", json.dumps(messages))
    span.set_data("gen_ai.operation.name", "invoke_agent")

    # Call your LLM here
    result = client.chat.completions.create(model="o3-mini", messages=messages)

    span.set_data("gen_ai.response.text", json.dumps([result.choices[0].message.content]))
    # Set token usage
    span.set_data("gen_ai.usage.input_tokens", result.usage.prompt_tokens)
    span.set_data("gen_ai.usage.output_tokens", result.usage.completion_tokens)

This span represents the execution of an AI agent, capturing the full lifecycle from receiving a task to producing a final response.

Invoke Agent span attributes

Describes AI agent invocation.

The @sentry_sdk.trace() decorator can also be used to create this span.

The spans op MUST be "gen_ai.invoke_agent".
The span name SHOULD be "invoke_agent {gen_ai.agent.name}".
The gen_ai.operation.name attribute MUST be "invoke_agent".
The gen_ai.agent.name attribute SHOULD be set to the agent's name. (e.g. "Weather Agent")
All Common Span Attributes SHOULD be set (all required common attributes MUST be set).

Additional attributes on the span:

Data Attribute	Type	Requirement Level	Description	Example
`gen_ai.request.available_tools`	string	optional	List of objects describing the available tools. [0]	`"[{\"name\": \"random_number\", \"description\": \"...\"}, {\"name\": \"query_db\", \"description\": \"...\"}]"`
`gen_ai.request.frequency_penalty`	float	optional	Model configuration parameter.	`0.5`
`gen_ai.request.max_tokens`	int	optional	Model configuration parameter.	`500`
`gen_ai.request.messages`	string	optional	List of objects describing the messages (prompts) sent to the LLM [0], [1]	`"[{\"role\": \"system\", \"content\": [{...}]}, {\"role\": \"system\", \"content\": [{...}]}]"`
`gen_ai.request.presence_penalty`	float	optional	Model configuration parameter.	`0.5`
`gen_ai.request.temperature`	float	optional	Model configuration parameter.	`0.1`
`gen_ai.request.top_p`	float	optional	Model configuration parameter.	`0.7`
`gen_ai.response.tool_calls`	string	optional	The tool calls in the model’s response. [0]	`"[{\"name\": \"random_number\", \"type\": \"function_call\", \"arguments\": \"...\"}]"`
`gen_ai.response.text`	string	optional	The text representation of the model's responses. [0]	`"[\"The weather in Paris is rainy\", \"The weather in London is sunny\"]"`
`gen_ai.usage.input_tokens.cache_write`	int	optional	The number of tokens written to the cache when processing the AI input (prompt).	`100`
`gen_ai.usage.input_tokens.cached`	int	optional	The number of cached tokens used in the AI input (prompt)	`50`
`gen_ai.usage.input_tokens`	int	optional	The number of tokens used in the AI input (prompt).	`10`
`gen_ai.usage.output_tokens.reasoning`	int	optional	The number of tokens used for reasoning.	`30`
`gen_ai.usage.output_tokens`	int	optional	The number of tokens used in the AI response.	`100`
`gen_ai.usage.total_tokens`	int	optional	The total number of tokens used to process the prompt. (input and output)	`190`

[0]: Span attributes only allow primitive data types (like int, float, boolean, string). This means you need to use a stringified version of a list of dictionaries. Do NOT set [{"foo": "bar"}] but rather the string "[{\"foo\": \"bar\"}]".
[1]: Each message item uses the format {role:"", content:""}. The role can be "user", "assistant", or "system". The content can be either a string or a list of dictionaries.

Copied

import sentry_sdk

with sentry_sdk.start_span(op="gen_ai.invoke_agent", name="invoke_agent Weather Agent") as span:
    span.set_data("gen_ai.request.model", "o3-mini")
    span.set_data("gen_ai.agent.name", "Weather Agent")

    # Run the agent
    final_output = my_agent.run()

    span.set_data("gen_ai.response.text", str(final_output))
    # Set token usage
    span.set_data("gen_ai.usage.input_tokens", result.usage.input_tokens)
    span.set_data("gen_ai.usage.output_tokens", result.usage.output_tokens)

This span represents the execution of a tool or function that was requested by an AI model, including the input arguments and resulting output.

Execute Tool span attributes

Describes a tool execution.

The @sentry_sdk.trace() decorator can also be used to create this span.

The span op MUST be "gen_ai.execute_tool".
The span name SHOULD be "execute_tool {gen_ai.tool.name}". (e.g. "execute_tool query_database")
The gen_ai.tool.name attribute SHOULD be set to the name of the tool. (e.g. "query_database")
All Common Span Attributes SHOULD be set (all required common attributes MUST be set).

Additional attributes on the span:

Data Attribute	Type	Requirement Level	Description	Example
`gen_ai.tool.description`	string	optional	Description of the tool executed.	`"Tool returning a random number"`
`gen_ai.tool.input`	string	optional	Input that was given to the executed tool as string.	`"{\"max\":10}"`
`gen_ai.tool.name`	string	optional	Name of the tool executed.	`"random_number"`
`gen_ai.tool.output`	string	optional	The output from the tool.	`"7"`
`gen_ai.tool.type`	string	optional	The type of the tools.	`"function"`; `"extension"`; `"datastore"`

Copied

import sentry_sdk

with sentry_sdk.start_span(op="gen_ai.execute_tool", name="execute_tool get_weather") as span:
    span.set_data("gen_ai.tool.name", "get_weather")
    span.set_data("gen_ai.tool.input", json.dumps({"location": "Paris"}))

    # Call the tool
    result = get_weather(location="Paris")

    span.set_data("gen_ai.tool.output", json.dumps(result))

This span marks the transition of control from one agent to another, typically when the current agent determines another agent is better suited to handle the task.

Handoff span attributes

A span that describes the handoff from one agent to another.

The spans op MUST be "gen_ai.handoff".
The spans name SHOULD be "handoff from {from_agent} to {to_agent}".
All Common Span Attributes SHOULD be set.

Copied

import sentry_sdk

with sentry_sdk.start_span(op="gen_ai.handoff", name="handoff from Weather Agent to Travel Agent"):
    pass  # Handoff span just marks the transition

with sentry_sdk.start_span(op="gen_ai.invoke_agent", name="invoke_agent Travel Agent"):
    # Run the target agent here
    pass

Tracking Conversations has alpha stability. Configuration options and behavior may change.

For AI applications that involve multi-turn conversations, you can use sentry_sdk.ai.set_conversation_id() to associate all AI spans from the same conversation. This enables you to track and analyze complete conversation flows within Sentry.

The conversation ID is set as the gen_ai.conversation.id attribute on all AI-related spans in the current scope. To remove the conversation ID, use the remove_conversation_id() method on the Scope.

Copied

import sentry_sdk.ai

sentry_sdk.ai.set_conversation_id("conv_abc123")

# All subsequent AI calls will be linked to this conversation

Some integrations, like the OpenAI integration, will automatically set the conversation ID for you, when you use APIs that expose that.

Copied

import sentry_sdk
import openai

sentry_sdk.init(...)

conversation = openai.conversations.create()

response = openai.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": "What are the 5 Ds of dodgeball?"}],
    conversation=conversation.id  # this will automatically set `gen_ai.conversation.id` on the span
)

Some attributes are common to all AI Agents spans:

Data Attribute	Type	Requirement Level	Description	Example
`gen_ai.request.model`	string	required	The name of the AI model a request is being made to.	`"o3-mini"`
`gen_ai.operation.name`	string	optional	The name of the operation being performed.	`"summarize"`
`gen_ai.agent.name`	string	optional	The name of the agent this span belongs to.	`"Weather Agent"`

When manually setting token attributes, be aware of how Sentry uses them to calculate model costs.

Cached and reasoning tokens are subsets, not separate counts. gen_ai.usage.input_tokens is the total input token count that already includes any cached tokens. Similarly, gen_ai.usage.output_tokens already includes reasoning tokens. Sentry subtracts the cached/reasoning counts from the totals to compute the "raw" portion, so reporting them incorrectly can produce wrong or negative costs.

For example, say your LLM call uses 100 input tokens total, 90 of which were served from cache. Using a standard rate of $0.01 per token and a cached rate of $0.001 per token:

Correct — input_tokens is the total (includes cached):

gen_ai.usage.input_tokens = 100
gen_ai.usage.input_tokens.cached = 90
Sentry calculates: (100 - 90) × $0.01 + 90 × $0.001 = $0.10 + $0.09 = $0.19 ✓

Wrong — input_tokens set to only the non-cached tokens, making cached larger than total:

gen_ai.usage.input_tokens = 10
gen_ai.usage.input_tokens.cached = 90
Sentry calculates: (10 - 90) × $0.01 + 90 × $0.001 = −$0.80 + $0.09 = −$0.71

Because input_tokens.cached (90) is larger than input_tokens (10), the subtraction goes negative, resulting in a negative total cost.

The same applies to gen_ai.usage.output_tokens and gen_ai.usage.output_tokens.reasoning.

Was this helpful?

Help improve this content
Our documentation is open source and available on GitHub. Your contributions are welcome, whether fixing a typo (drat!) or suggesting an update ("yeah, this would be better").

How to contribute | Edit this page | Create a docs issue | Get support