Practical AI Observability: Getting Started with MLflow Tracing

7 min read
Daniel Liden

MLflow Tracing: Observability for GenAI

GenAI providers and frameworks often respond with complicated and hard-to-read data structures or with simple responses that hide intermediate steps. Furthermore, it can be hard to keep track of and compare GenAI model/framework calls over time, especially if you are moving between frameworks and scripts.

MLflow's LLM tracing solves these issues by recording all of your GenAI calls, including both individual LLM calls and multi-step agentic workflows, and providing an easy-to-read interface for browsing and comparing them. You can enable this functionality for most GenAI providers with a single line of code: mlflow.<provider>.autolog().

This blog will show how to get started with MLflow tracing—in about five minutes. It assumes some familiarity with GenAI APIs (e.g. the OpenAI API), but does not assume any prior familiarity with MLflow.

Tracing a LangChain Application with MLflow Tracing

Quickstart

We'll start by showing how to use MLflow autologging to automatically trace calls to OpenAI models, though MLflow supports automatic tracing for an ever-growing number of providers and frameworks including Anthropic, Ollama, LangChain, LlamaIndex, and many others. To get started, install the MLflow and OpenAI Python packages with:

pip install mlflow openai

Collecting Traces with Autologging

In a Python script or notebook, import MLflow and the GenAI provider you're working with, and enable tracing with mlflow.<provider>.autolog(). Here's how to set up automatic tracing for OpenAI:

import mlflow
from openai import OpenAI

mlflow.openai.autolog()

Make sure to create and set your OpenAI API key! You can set it in your environment with

export OPENAI_API_KEY="your_api_key_here"
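
If you're working in a notebook, you can also set the key from Python before creating the client. Here's a minimal sketch (the placeholder value is not a real key; prefer an environment variable or a secrets manager in practice):

import os

# Set the API key for the current process only (placeholder value)
os.environ["OPENAI_API_KEY"] = "your_api_key_here"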

Now, when you use the OpenAI library, MLflow will capture traces of your model calls. For example, MLflow will log a trace of the following OpenAI invocation because we have enabled autologging.

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "What is an MLflow tracking server?"
        }
    ]
)

Viewing your LLM Traces

The MLflow UI provides an AI observability dashboard for viewing your traces. Start the MLflow UI from your terminal with:

mlflow ui

Navigate to the UI; the output of the mlflow ui command will tell you where to go (http://localhost:5000 by default). In the UI, open the "Traces" tab, which lists all of the collected traces. Click on a trace's Trace ID to open a new pane with more details.

Traces in the MLflow UI

tip

By default, the MLflow server will listen on http://localhost:5000. You can choose a different port with mlflow ui -p <port>. For example, to listen on port 5001, use mlflow ui -p 5001.

Starting the MLflow tracking server with mlflow ui also enables you to view traces right in a Jupyter notebook! You just have to set the tracking URI to the location specified above:

mlflow.set_tracking_uri("http://localhost:5000")

Then, when you invoke an AI model/framework with tracing enabled, the generated trace(s) will appear right in the notebook outputs.

Tracing in Jupyter Notebooks

You can disable this functionality with mlflow.tracing.disable_notebook_display().

Organizing your Traces

If you use tracing across multiple different projects and tasks, you might want to organize the traces into separate groups.

The easiest way to organize your traces is to separate them into experiments. Each experiment has its own traces tab, which displays the traces for that experiment.

You can create an experiment in the UI (with the "+" button next to "Experiments"), with the MLflow CLI, or with Python. Let's create a new experiment called "quickstart" and log a trace.

mlflow.set_experiment("quickstart")

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": "What is an MLflow tracking server?"
        }
    ]
)

We can now find this trace in the "Traces" tab in the "quickstart" experiment.

Trace in Experiment

The set_experiment function specifies which experiment traces should be logged to and creates the experiment if it does not already exist, so the code snippet above created a new "quickstart" experiment.
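
If you prefer to create an experiment explicitly before logging anything to it, the Python API and the MLflow CLI both support that. Here's a minimal sketch (the experiment name is just an example):

import mlflow

# Create the experiment explicitly; this raises an error if it already exists
experiment_id = mlflow.create_experiment("my-tracing-project")

# Direct subsequent traces to it
mlflow.set_experiment("my-tracing-project")

The rough CLI equivalent is mlflow experiments create --experiment-name my-tracing-project.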

You can also organize your traces with tags and runs.
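
For example, traces logged while a run is active are grouped under that run, and you can attach tags to a trace from within traced code. Here's a minimal sketch, assuming a recent MLflow version with the tracing fluent APIs (the function name, run name, and tag are just illustrations):

import mlflow
from openai import OpenAI

mlflow.openai.autolog()
client = OpenAI()

@mlflow.trace
def ask(question: str) -> str:
    # Tag the trace generated by this call
    mlflow.update_current_trace(tags={"project": "quickstart-demo"})
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return completion.choices[0].message.content

# Traces created while this run is active are associated with it
with mlflow.start_run(run_name="tracing-demo"):
    ask("What is an MLflow tracking server?")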

Tracing Other Providers

Our quickstart example focused on OpenAI, but MLflow supports automatic tracing of many different AI providers and frameworks. The approach is the same: just add the line mlflow.<provider>.autolog() to your notebook or script.

Here are a few examples. See the MLflow documentation for the full list of supported providers.

Enable automatic tracing for Anthropic model calls with mlflow.anthropic.autolog().

import anthropic
import mlflow

mlflow.anthropic.autolog()

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-3-7-sonnet-20250219",
    max_tokens=1000,
    temperature=1,
    messages=[
        {
            "role": "user",
            "content": "What is an MLflow tracking server?"
        }
    ]
)

This returns the following in the MLflow UI:

Anthropic tracing

Conclusion: Effective LLM tracing with one line of code

In this guide, you have learned how to use MLflow's autologging capabilities to get a complete AI observability solution with a single line of code. If you are using one of the many GenAI frameworks/providers for which MLflow offers automatic tracing—including any providers with OpenAI-compatible endpoints—automatic logging is the easiest way to visualize and debug your AI application behavior. All you need is mlflow.<provider>.autolog().

Next Steps

Autologging is a great place to start with MLflow tracing, but you may need more flexibility in how you collect and use traces as you develop more complex GenAI applications. Furthermore, MLflow includes many tools for working with GenAI applications beyond tracing.

  • For a longer conceptual introduction to tracing, read this guide on tracing concepts.
  • MLflow traces can provide an excellent source of data for evaluation, SME review, fine-tuning, and more. Learn about searching and retrieving trace data here.
  • MLflow provides LLM evaluation functionality for running structured experiments with your AI models and applications.
  • You can add tracing to your own AI applications with the tracing fluent APIs and client APIs. You can also add tracing to libraries and frameworks that do not (yet) have autologging support.
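
As a taste of the fluent APIs mentioned in the last point, the @mlflow.trace decorator and the mlflow.start_span context manager let you trace arbitrary Python code, not just supported providers. Here's a minimal sketch (the function and span names are just illustrations):

import mlflow

@mlflow.trace
def retrieve_documents(query: str) -> list[str]:
    # A real application would call a retriever or vector store here
    return ["MLflow Tracing documentation"]

@mlflow.trace
def answer(query: str) -> str:
    docs = retrieve_documents(query)
    # Nested spans appear as children within the same trace
    with mlflow.start_span(name="format_answer") as span:
        span.set_inputs({"query": query, "docs": docs})
        result = f"Found {len(docs)} relevant document(s)."
        span.set_outputs(result)
    return result

answer("What is an MLflow tracking server?")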