Tracing Amazon Bedrock with MLflow
MLflow supports automatic tracing for Amazon Bedrock, a fully managed service on AWS that provides high-performing foundation models from leading AI providers such as Anthropic, Cohere, Meta, Mistral AI, and more. By enabling auto tracing for Amazon Bedrock with the mlflow.bedrock.autolog()
function, MLflow captures traces for LLM invocations and logs them to the active MLflow Experiment.
import mlflow
mlflow.bedrock.autolog()
MLflow tracing automatically captures the following information about Amazon Bedrock calls:
Prompts and completion responses
Latencies
Model name
Additional metadata such as temperature and max_tokens, if specified
Function calls, if returned in the response
Any exception, if raised (see the sketch after this list)
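For example, a failed invocation still produces a trace whose span records the exception. Below is a minimal sketch, assuming auto tracing is already enabled and a boto3 "bedrock-runtime" client named bedrock exists (as in the Basic Example below); the model ID is an intentionally invalid, hypothetical value.
# Minimal sketch: the call is expected to fail because the model ID is invalid,
# but MLflow still logs a trace that records the raised exception.
try:
    bedrock.converse(
        modelId="invalid-model-id",  # hypothetical, triggers an error
        messages=[{"role": "user", "content": [{"text": "Hello"}]}],
    )
except Exception as e:
    print(f"Bedrock call failed: {e}")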
Basic Example
import boto3
import mlflow

# Enable auto-tracing for Amazon Bedrock
mlflow.bedrock.autolog()
mlflow.set_experiment("Bedrock")

# Create a boto3 client for invoking the Bedrock API
bedrock = boto3.client(
    service_name="bedrock-runtime",
    region_name="<REPLACE_WITH_YOUR_AWS_REGION>",
)

# MLflow will log a trace for the Bedrock API call in the "Bedrock" experiment created above
response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Describe the purpose of a 'hello world' program in one line."}
            ],
        }
    ],
    inferenceConfig={
        "maxTokens": 512,
        "temperature": 0.1,
        "topP": 0.9,
    },
)
The logged trace, associated with the Bedrock experiment, can be seen in the MLflow UI.
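You can also fetch the logged trace programmatically. Below is a minimal sketch, assuming a recent MLflow version that provides mlflow.search_traces() (which returns a pandas DataFrame with one row per trace) and the "Bedrock" experiment created above.
# Minimal sketch: query the traces logged to the "Bedrock" experiment.
experiment = mlflow.get_experiment_by_name("Bedrock")
traces = mlflow.search_traces(experiment_ids=[experiment.experiment_id], max_results=5)
print(traces.head())  # one row per trace, including status and timing information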
Raw Inputs and Outputs
By default, MLflow renders a rich, chat-like view of the input and output messages in the Chat tab. To view the raw input and output payloads, including configuration parameters, click the Inputs / Outputs tab in the UI.
Note
The Chat panel is only supported for the converse and converse_stream APIs. For the other APIs, MLflow only displays the Inputs / Outputs tab.
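For instance, a call made through the lower-level invoke_model API is traced, but only the raw JSON payloads appear in the Inputs / Outputs tab. Below is a minimal sketch, assuming the same bedrock client as above and the Anthropic messages body format that Claude models expect on Bedrock.
import json

# Minimal sketch: invoke_model sends a raw JSON body, so the resulting trace
# shows only the Inputs / Outputs tab (no Chat panel).
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    body=json.dumps(
        {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 512,
            "messages": [
                {
                    "role": "user",
                    "content": "Describe the purpose of a 'hello world' program in one line.",
                }
            ],
        }
    ),
)
print(json.loads(response["body"].read()))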
Streaming
MLflow supports tracing streaming calls to Amazon Bedrock APIs. The generated trace shows the aggregated output message in the Chat tab, while the individual chunks are displayed in the Events tab.
response = bedrock.converse_stream(
    modelId="anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {
            "role": "user",
            "content": [
                {"text": "Describe the purpose of a 'hello world' program in one line."}
            ],
        }
    ],
    inferenceConfig={
        "maxTokens": 300,
        "temperature": 0.1,
        "topP": 0.9,
    },
)

for chunk in response["stream"]:
    print(chunk)
Attention
MLflow does not create a span immediately when the streaming response is returned. Instead, it creates the span when the streaming chunks are consumed, for example, by the for-loop in the code snippet above.
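If you want the aggregated text yourself while consuming the stream (which also triggers span creation), a minimal sketch is shown below. It assumes a fresh converse_stream response (a stream can only be iterated once) and the contentBlockDelta event shape of the Converse Stream API.
# Minimal sketch: iterate over the stream and collect the text deltas
# from contentBlockDelta events into a single string.
chunks = []
for event in response["stream"]:
    delta = event.get("contentBlockDelta", {}).get("delta", {})
    if "text" in delta:
        chunks.append(delta["text"])

print("".join(chunks))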
Function Calling Agent
MLflow Tracing automatically captures function calling metadata when calling Amazon Bedrock APIs. The function definitions and any tool-call instructions returned in the response are highlighted in the Chat tab of the trace UI.
Combining this with the manual tracing feature, you can define a function-calling agent (ReAct) and trace its execution. The entire agent implementation might look complicated, but the tracing part is straightforward: (1) add the @mlflow.trace decorator to the functions you want to trace and (2) enable auto-tracing for Amazon Bedrock with mlflow.bedrock.autolog(). MLflow takes care of complexities such as resolving call chains and recording execution metadata.
import boto3
import mlflow
from mlflow.entities import SpanType

# Enable auto-tracing for Amazon Bedrock
mlflow.bedrock.autolog()
mlflow.set_experiment("Bedrock")

# Create a boto3 client for invoking the Bedrock API
bedrock = boto3.client(
    service_name="bedrock-runtime",
    region_name="<REPLACE_WITH_YOUR_AWS_REGION>",
)

model_id = "anthropic.claude-3-5-sonnet-20241022-v2:0"


# Define the tool function. Decorate it with `@mlflow.trace` to create a span for its execution.
@mlflow.trace(span_type=SpanType.TOOL)
def get_weather(city: str) -> str:
    """Get the current weather in a given location"""
    return "sunny" if city == "San Francisco, CA" else "unknown"
# Define the tool configuration passed to Bedrock
tools = [
    {
        "toolSpec": {
            "name": "get_weather",
            "description": "Get the current weather in a given location",
            "inputSchema": {
                "json": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "The city and state, e.g., San Francisco, CA",
                        },
                    },
                    "required": ["city"],
                }
            },
        }
    }
]
tool_functions = {"get_weather": get_weather}
# Define a simple tool calling agent
@mlflow.trace(span_type=SpanType.AGENT)
def run_tool_agent(question: str) -> str:
    messages = [{"role": "user", "content": [{"text": question}]}]

    # Invoke the model with the given question and available tools
    response = bedrock.converse(
        modelId=model_id,
        messages=messages,
        toolConfig={"tools": tools},
    )
    assistant_message = response["output"]["message"]
    messages.append(assistant_message)

    # If the model requests a tool call, invoke the function with the specified arguments
    tool_use = next(
        (c["toolUse"] for c in assistant_message["content"] if "toolUse" in c), None
    )
    if tool_use:
        tool_func = tool_functions[tool_use["name"]]
        tool_result = tool_func(**tool_use["input"])
        messages.append(
            {
                "role": "user",
                "content": [
                    {
                        "toolResult": {
                            "toolUseId": tool_use["toolUseId"],
                            "content": [{"text": tool_result}],
                        }
                    }
                ],
            }
        )

        # Send the tool results to the model and get a new response
        response = bedrock.converse(
            modelId=model_id,
            messages=messages,
            toolConfig={"tools": tools},
        )

    return response["output"]["message"]["content"][0]["text"]
# Run the tool calling agent
question = "What's the weather like in San Francisco today?"
answer = run_tool_agent(question)
Executing the code above will create a single trace that includes all of the LLM invocations and tool calls.
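To verify this programmatically, the minimal sketch below prints the span hierarchy of the trace created by the agent run. It assumes a recent MLflow version that provides mlflow.get_last_active_trace_id() and mlflow.get_trace(); older versions expose a different helper for retrieving the last trace.
# Minimal sketch (API availability depends on your MLflow version):
# fetch the trace created by the agent run above and list its spans.
trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id)
for span in trace.data.spans:
    print(span.span_type, span.name)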