MLflow DSPy Flavor
Attention
The dspy flavor is under active development and is marked as Experimental. Public APIs are subject to change and new features may be added as the flavor evolves.
Introduction
DSPy is a framework for algorithmically optimizing LM prompts and weights. It’s designed to improve the process of prompt engineering by replacing hand-crafted prompt strings with modular components. These modules are concise, well-defined, and maintain high quality and expressive power, making prompt creation more efficient and scalable. By parameterizing these modules and treating prompting as an optimization problem, DSPy can adapt better to different language models, potentially outperforming prompts crafted by experts. This modularity also enables easier exploration of complex pipelines, allowing for fine-tuning performance based on specific tasks or nuanced metrics.
Why use DSPy with MLflow?
The native integration of the DSPy library with MLflow helps users manage the development lifecycle with DSPy. The following are some of the key benefits of using DSPy with MLflow:
MLflow Tracking allows you to track your DSPy program’s training and execution. With the MLflow APIs, you can log a variety of artifacts and organize training runs, thereby increasing visibility into your model performance, as sketched below.
MLflow Model packages your compiled DSPy program along with its dependency versions, input and output interfaces and other essential metadata. This allows you to deploy your compiled DSPy program with ease, knowing that the environment is consistent across different stages of the ML lifecycle.
MLflow Evaluate provides native capabilities within MLflow to evaluate GenAI applications. This capability facilitates the efficient assessment of inference results from your compiled DSPy program, ensuring robust performance analytics and facilitating quick iterations.
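For example, assuming an MLflow version recent enough to provide DSPy autologging via mlflow.dspy.autolog(), a single call captures traces of your module executions for inspection in the MLflow UI. The model name and question below are illustrative:

import dspy
import mlflow

# Enable DSPy autologging so each module call is recorded as an MLflow trace.
# This assumes mlflow.dspy.autolog() is available in your installed MLflow.
mlflow.dspy.autolog()

# Configure a language model (illustrative model name).
lm = dspy.LM(model="openai/gpt-4o-mini", max_tokens=250)
dspy.settings.configure(lm=lm)

# Each call to this module is now traced.
qa = dspy.Predict("question -> answer")
qa(question="what is 2 + 2?")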
Getting Started
In this introductory tutorial, you will learn the most fundamental components of DSPy and how to leverage the integration with MLflow to store, retrieve, and use a DSPy program.
Concepts
Module
Modules are components that handle specific text transformations, like answering questions or summarizing. They replace traditional hand-written prompts and can learn from examples, making them more adaptable.
Signature
A signature is a natural language description of a module’s input and output behavior. For example, “question -> answer” specifies that the module should take a question as input and return an answer.
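As a minimal sketch, the same signature can also be written as a class, which lets you document each field explicitly. The class name and docstring below are illustrative and not part of the tutorial program:

import dspy


class BasicQA(dspy.Signature):
    """Answer the question concisely."""

    question = dspy.InputField()
    answer = dspy.OutputField()


# A signature alone does nothing; a module such as dspy.Predict executes it.
qa = dspy.Predict(BasicQA)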
Optimizer
An optimizer improves LM pipelines by adjusting modules to meet a performance metric, either by generating better prompts or fine-tuning models.
Usage
Saving and Loading DSPy Program in MLflow Experiment
Creating a DSPy Program
The Module object is the centerpiece of the DSPy and MLflow integration. With DSPy, you can create complex agentic logic via a module or set of modules.
pip install mlflow dspy -U
import dspy
# Define our language model
lm = dspy.LM(model="openai/gpt-4o-mini", max_tokens=250)
dspy.settings.configure(lm=lm)
# Define a Chain of Thought module
class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)


dspy_model = CoT()
Tip
Typically you’d want to leverage a compiled DSPy module. MLflow natively supports logging both compiled and uncompiled DSPy modules. Above we show an uncompiled version for simplicity, but in production you’d want to leverage an optimizer and log the resulting compiled module instead, as sketched below.
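As a sketch of that workflow, you could compile the CoT module with a simple optimizer such as BootstrapFewShot and log the compiled program. The training examples and metric below are purely illustrative:

import dspy
from dspy.teleprompt import BootstrapFewShot

import mlflow

# A tiny, illustrative training set; real programs need representative data.
trainset = [
    dspy.Example(question="What is 2 + 2?", answer="4").with_inputs("question"),
    dspy.Example(question="What is the capital of France?", answer="Paris").with_inputs("question"),
]


# An illustrative exact-match metric.
def exact_match(example, prediction, trace=None):
    return example.answer.lower() == prediction.answer.lower()


# Compile the module against the metric, then log the compiled program.
optimizer = BootstrapFewShot(metric=exact_match)
compiled_model = optimizer.compile(CoT(), trainset=trainset)

with mlflow.start_run():
    mlflow.dspy.log_model(
        compiled_model,
        artifact_path="compiled_model",
        input_example="what is 2 + 2?",
    )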
Logging the Program to MLflow
You can log the dspy.Module object to an MLflow run using the mlflow.dspy.log_model() function.
We will also specify an input example, from which MLflow infers a model signature. An MLflow model signature defines the expected schema for model inputs and outputs, ensuring consistency and correctness during model inference.
import mlflow
# Start an MLflow run
with mlflow.start_run():
    # Log the model
    model_info = mlflow.dspy.log_model(
        dspy_model,
        artifact_path="model",
        input_example="what is 2 + 2?",
    )
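If you prefer an explicit signature rather than relying on inference from the input example, you can construct one with mlflow.models.infer_signature and pass it to log_model. The sample input and output values below are illustrative, and this assumes the DSPy flavor accepts the standard signature argument:

import mlflow
from mlflow.models import infer_signature

# Build a signature from representative input and output values (illustrative).
signature = infer_signature(
    model_input="what is 2 + 2?",
    model_output={"reasoning": "2 + 2 equals 4.", "answer": "4"},
)

with mlflow.start_run():
    model_info = mlflow.dspy.log_model(
        dspy_model,
        artifact_path="model",
        signature=signature,
    )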
Loading the Module for inference
The saved module can be loaded back for inference using the mlflow.pyfunc.load_model() function. This function returns an MLflow Python model (PyFunc) backed by the DSPy module.
import mlflow
# Load the model as an MLflow PythonModel
model = mlflow.pyfunc.load_model(model_info.model_uri)
# Predict with the object
response = model.predict("What kind of bear is best?")
print(response)
{
    'reasoning': '''The question "What kind of bear is best?" is often associated with a
    humorous reference from the television show "The Office," where the character Jim
    Halpert jokingly states, "Bears, beets, Battlestar Galactica." However, if we consider
    the question seriously, it depends on the context. Different species of bears have
    different characteristics and adaptations that make them "best" in various ways.
    For example, the American black bear is known for its adaptability, while the polar bear is
    the largest land carnivore and is well adapted to its Arctic environment. Ultimately, the
    answer can vary based on personal preference or specific criteria such as strength,
    intelligence, or adaptability.''',
    'answer': '''There isn't a definitive answer, as it depends on the context. However, many
    people humorously refer to the American black bear or the polar bear when discussing
    "the best" kind of bear.'''
}
To load the DSPy program itself back instead of the PyFunc-wrapped model, use the mlflow.dspy.load_model() function.
model = mlflow.dspy.load_model(model_info.model_uri)
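Assuming the LM provider key used above (for example, OPENAI_API_KEY) is available in the environment, the restored object behaves like the original dspy.Module and can be called directly; the question below is purely illustrative:

# Call the restored DSPy module directly; ChainOfThought returns a
# dspy.Prediction carrying the signature's fields.
prediction = model(question="What kind of bear is best?")
print(prediction.reasoning)
print(prediction.answer)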
FAQ
How can I save a compiled vs. uncompiled model?
DSPy compiles models by updating various LM parameters, such as prompts, hyperparameters, and model weights, to optimize performance on a training metric. While MLflow allows logging both compiled and uncompiled models, it’s generally preferable to use a compiled model, as it is expected to perform better in practice.
What can be serialized by MLflow?
When using mlflow.dspy.log_model()
or mlflow.dspy.save_model()
in MLflow, the
DSPy program is serialized and saved to the tracking server as a .pkl
file. This enables easy
deployment. Under the hood, MLflow uses cloudpickle
to serialize the DSPy object, but some
DSPy artifacts are note serializable. Relevant examples are listed below.
API tokens. These should be managed separately and passed securely via environment variables.
The DSPy trace object, which is primarily used during training, not inference.
How do I manage secrets?
When serializing using the MLflow DSPy flavor, tokens are dropped from the settings objects. It is the user’s responsibility to securely pass the required secrets to the deployment environment.
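For example, if your program calls OpenAI, you could provide the key through an environment variable in the deployment environment before loading or serving the model. The variable name and in-process assignment below are illustrative; in production, source secrets from your platform’s secret manager:

import os

import mlflow

# Supply the provider key via the environment; it is not stored in the model.
os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"

model = mlflow.pyfunc.load_model(model_info.model_uri)
print(model.predict("what is 2 + 2?"))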
How is the DSPy settings object saved?
To ensure program reproducibility, the service context is converted to a Python dictionary and pickled with the model artifact. Service context is a concept that has been popularized in GenAI frameworks. Put simply, it stores a configuration that is global to your project. For DSPy specifically, we can set information such as the language model, reranker, adapter, etc.
DSPy stores this service context in a Settings singleton class. Sensitive API access keys that are set within the Settings object are not persisted when logging your model. When deploying your DSPy model, you must ensure that the deployment environment has these keys set so that your DSPy model can make remote calls to services that require access keys.