Tracing LangChain🦜⛓️

LangChain is an open-source framework for building LLM-powered applications.
MLflow Tracing provides automatic tracing for LangChain. You can enable tracing
for LangChain by calling the mlflow.langchain.autolog() function; nested traces are then logged to the active MLflow Experiment automatically whenever a chain is invoked. In TypeScript, pass the MLflow LangChain callback handler via the callbacks option instead.
- Python
- JS / TS
import mlflow
mlflow.langchain.autolog()
import { MlflowCallback } from "mlflow-langchain";
const tracer = new MlflowCallback();
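// Invoke any LangChain runnable with the MLflow callback
// (the `agent` here is assumed to be defined elsewhere)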
await agent.invoke(
  { messages: [{ role: "user", content: "What is MLflow?" }] },
  { callbacks: [tracer] }
);
The MLflow LangChain integration is not only about tracing. MLflow offers a full tracking experience for LangChain, including model tracking, prompt management, and evaluation. Please check out the MLflow LangChain Flavor documentation to learn more!
Example Usage
- Python
- JS / TS
import mlflow
import os
from langchain.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI
# Enabling autolog for LangChain will enable trace logging.
mlflow.langchain.autolog()
# Optional: Set a tracking URI and an experiment
# (set the tracking URI first so the experiment is created on the intended server)
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("LangChain")
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7, max_tokens=1000)
prompt_template = PromptTemplate.from_template(
    "Answer the question as if you are {person}, fully embodying their style, wit, personality, and habits of speech. "
    "Emulate their quirks and mannerisms to the best of your ability, embracing their traits—even if they aren't entirely "
    "constructive or inoffensive. The question is: {question}"
)
chain = prompt_template | llm | StrOutputParser()
# Invoke the chain; the trace is logged automatically
chain.invoke(
    {
        "person": "Linus Torvalds",
        "question": "Can I just set everyone's access to sudo to make things easier?",
    }
)
import * as mlflow from "mlflow-tracing";
import { MlflowCallback } from "mlflow-langchain";
import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";
import { PromptTemplate } from "@langchain/core/prompts";
import { RunnableSequence } from "@langchain/core/runnables";
mlflow.init({
  trackingUri: process.env.MLFLOW_TRACKING_URI,
  experimentId: process.env.MLFLOW_EXPERIMENT_ID,
});
// Define a chain as usual
const llm = new ChatOpenAI({ model: "gpt-4o-mini", temperature: 0.7, maxTokens: 1000 });
const prompt = PromptTemplate.fromTemplate(
  "Answer the question as if you are {person}. The question is: {question}"
);
const chain = RunnableSequence.from([prompt, llm, new StringOutputParser()]);
const tracer = new MlflowCallback();
await chain.invoke(
  {
    person: "Linus Torvalds",
    question: "Can I just set everyone's access to sudo to make things easier?",
  },
  { callbacks: [tracer] }
);
The example above has been confirmed to work with the following package versions:
pip install openai==1.30.5 langchain==0.2.1 langchain-openai==0.1.8 langchain-community==0.2.1 mlflow==2.14.0 tiktoken==0.7.0
Supported APIs
The following APIs are supported by auto tracing for LangChain.
- invoke
- batch
- stream
- ainvoke
- abatch
- astream
- get_relevant_documents (for retrievers)
- __call__ (for Chains and AgentExecutors)
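For example, streaming and batch calls are traced in the same way as invoke. Here is a minimal sketch, assuming the chain defined in the example above (the questions are purely illustrative):
# Streaming: chunks are yielded as usual, and the call is captured as a trace
for chunk in chain.stream(
    {"person": "Linus Torvalds", "question": "Tabs or spaces?"}
):
    print(chunk, end="")
# Batch: each input in the batch is traced as well
chain.batch(
    [
        {"person": "Linus Torvalds", "question": "Tabs or spaces?"},
        {"person": "Linus Torvalds", "question": "What makes a good commit message?"},
    ]
)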
Token Usage Tracking
MLflow >= 3.1.0 supports token usage tracking for LangChain. The token usage for each LLM call during a chain invocation is logged in the mlflow.chat.tokenUsage span attribute, and the total usage for the entire trace is logged in the mlflow.trace.tokenUsage metadata field.
- Python
- JS / TS
import mlflow
mlflow.langchain.autolog()
# Execute the chain defined in the previous example
chain.invoke(
    {
        "person": "Linus Torvalds",
        "question": "Can I just set everyone's access to sudo to make things easier?",
    }
)
# Get the trace object just created
last_trace_id = mlflow.get_last_active_trace_id()
trace = mlflow.get_trace(trace_id=last_trace_id)
# Print the token usage
total_usage = trace.info.token_usage
print("== Total token usage: ==")
print(f" Input tokens: {total_usage['input_tokens']}")
print(f" Output tokens: {total_usage['output_tokens']}")
print(f" Total tokens: {total_usage['total_tokens']}")
# Print the token usage for each LLM call
print("\n== Token usage for each LLM call: ==")
for span in trace.data.spans:
    if usage := span.get_attribute("mlflow.chat.tokenUsage"):
        print(f"{span.name}:")
        print(f"  Input tokens: {usage['input_tokens']}")
        print(f"  Output tokens: {usage['output_tokens']}")
        print(f"  Total tokens: {usage['total_tokens']}")
import * as mlflow from "mlflow-tracing";
// After your LangChain call completes, flush and fetch the trace
await mlflow.flushTraces();
const lastTraceId = mlflow.getLastActiveTraceId();
if (lastTraceId) {
  const client = new mlflow.MlflowClient({ trackingUri: "http://localhost:5000" });
  const trace = await client.getTrace(lastTraceId);

  // Total token usage on the trace
  console.log("== Total token usage: ==");
  console.log(trace.info.tokenUsage); // { input_tokens, output_tokens, total_tokens }

  // Per-span usage for each LLM call in the chain
  console.log("\n== Token usage for each LLM call: ==");
  for (const span of trace.data.spans) {
    const usage = span.attributes?.["mlflow.chat.tokenUsage"];
    if (usage) {
      console.log(`${span.name}:`, usage);
    }
  }
}
== Total token usage: ==
  Input tokens: 81
  Output tokens: 257
  Total tokens: 338

== Token usage for each LLM call: ==
ChatOpenAI:
  Input tokens: 81
  Output tokens: 257
  Total tokens: 338
Customize Tracing Behavior
Sometimes you may want to customize the information logged in traces. You can achieve this by creating a custom callback handler that inherits from MlflowLangchainTracer, the callback handler that MLflow injects into the LangChain inference process to log traces automatically. It starts a new span on chain events such as on_chain_start and on_llm_start, and ends the span when the corresponding action finishes. Metadata such as the span type, action name, inputs, outputs, and latency are recorded on the span automatically.
The following example demonstrates how to record an additional attribute to the span when a chat model starts running.
from typing import Any, Dict, List, Optional
from uuid import UUID

from langchain_core.messages import BaseMessage

from mlflow.entities import SpanType
from mlflow.langchain.langchain_tracer import MlflowLangchainTracer


class CustomLangchainTracer(MlflowLangchainTracer):
    # Override the handler functions to customize the behavior.
    # The method signature is defined by LangChain Callbacks.
    def on_chat_model_start(
        self,
        serialized: Dict[str, Any],
        messages: List[List[BaseMessage]],
        *,
        run_id: UUID,
        tags: Optional[List[str]] = None,
        parent_run_id: Optional[UUID] = None,
        metadata: Optional[Dict[str, Any]] = None,
        name: Optional[str] = None,
        **kwargs: Any,
    ):
        """Run when a chat model starts running."""
        attributes = {
            **kwargs,
            **(metadata or {}),
            # Add an additional attribute to the span
            "version": "1.0.0",
        }

        # Call the _start_span method at the end of the handler function to start a new span.
        self._start_span(
            span_name=name or self._assign_span_name(serialized, "chat model"),
            parent_run_id=parent_run_id,
            span_type=SpanType.CHAT_MODEL,
            run_id=run_id,
            inputs=messages,
            attributes=attributes,
        )
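To use the custom handler, pass an instance of it via the callbacks config when invoking the chain, rather than relying on autolog. A minimal sketch, reusing the chain from the earlier example:
# Pass the custom tracer instead of the handler injected by autolog
tracer = CustomLangchainTracer()
chain.invoke(
    {
        "person": "Linus Torvalds",
        "question": "Can I just set everyone's access to sudo to make things easier?",
    },
    config={"callbacks": [tracer]},
)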
Disable auto-tracing
Auto tracing for LangChain can be disabled globally by calling mlflow.langchain.autolog(disable=True) or mlflow.autolog(disable=True).
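For example:
import mlflow

# Disable auto tracing for LangChain only
mlflow.langchain.autolog(disable=True)

# Or disable all MLflow autologging integrations at once
mlflow.autolog(disable=True)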