Tracing Strands Agents SDK

Strands Agents Tracing via autolog

Strands Agents SDK is an open-source, model-driven SDK from AWS that lets developers build autonomous AI agents by defining a model, a set of tools, and a prompt in just a few lines of code.

MLflow Tracing provides automatic tracing for the Strands Agents SDK. Enable it by calling the mlflow.strands.autolog() function, and MLflow will capture traces for agent invocations and log them to the active MLflow Experiment.

import mlflow

mlflow.strands.autolog()

MLflow tracing automatically captures the following information about agent calls:

  • Prompts and completion responses
  • Latencies
  • Metadata about the different agents, such as function names
  • Token usage and cost
  • Cache hits
  • Exceptions, if raised
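
Once a trace has been logged, the captured latencies and span statuses can be read back from the trace object. Below is a minimal sketch, assuming an agent call has already produced a trace; it uses the standard MLflow trace APIs (span timestamps are stored in nanoseconds):

import mlflow

trace = mlflow.get_trace(mlflow.get_last_active_trace_id())
for span in trace.data.spans:
    # Compute per-span latency from the recorded nanosecond timestamps
    duration_ms = (span.end_time_ns - span.start_time_ns) / 1e6
    print(f"{span.name}: {duration_ms:.1f} ms, status={span.status.status_code}")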

Basic Example

import mlflow

mlflow.strands.autolog()
mlflow.set_experiment("Strand Agent")

from strands import Agent
from strands.models.openai import OpenAIModel
from strands_tools import calculator

model = OpenAIModel(
    client_args={"api_key": "<api-key>"},
    # **model_config
    model_id="gpt-4o",
    params={
        "max_tokens": 2000,
        "temperature": 0.7,
    },
)

agent = Agent(model=model, tools=[calculator])
response = agent("What is 2+2")
print(response)

Strands Agents SDK Tracing via autolog
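
Traces from this example are logged to the active experiment. If you log to a tracking server rather than local files, point MLflow at it before invoking the agent; the URL below is a placeholder for your own server:

import mlflow

# Placeholder URL; replace with your own MLflow tracking server
mlflow.set_tracking_uri("http://localhost:5000")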

Token usage

MLflow >= 3.4.0 supports token usage tracking for the Strands Agents SDK. The token usage for each agent call is logged in the mlflow.chat.tokenUsage span attribute, and the total token usage across the trace is available in the token_usage field of the trace info object.

response = agent("What is 2+2")
print(response)

last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)

# Print the total token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f"  Input tokens: {total_usage['input_tokens']}")
print(f"  Output tokens: {total_usage['output_tokens']}")
print(f"  Total tokens: {total_usage['total_tokens']}")

# Print the token usage for each LLM call
print("\n== Detailed usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f"  Input tokens: {usage['input_tokens']}")
        print(f"  Output tokens: {usage['output_tokens']}")
        print(f"  Total tokens: {usage['total_tokens']}")

== Total token usage: ==
  Input tokens: 5258
  Output tokens: 62
  Total tokens: 5320

== Detailed usage for each LLM call: ==
invoke_agent Strands Agents:
  Input tokens: 2629
  Output tokens: 31
  Total tokens: 2660
chat_1:
  Input tokens: 1301
  Output tokens: 16
  Total tokens: 1317
chat_2:
  Input tokens: 1328
  Output tokens: 15
  Total tokens: 1343

Disable auto-tracing

Auto tracing for the Strands Agents SDK can be disabled globally by calling mlflow.strands.autolog(disable=True) or mlflow.autolog(disable=True).
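
For example:

import mlflow

# Disable tracing for the Strands Agents SDK only
mlflow.strands.autolog(disable=True)

# Or disable all MLflow autologging integrations at once
mlflow.autolog(disable=True)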