# Optimizing Prompts for OpenAI Agents

This guide demonstrates how to use `mlflow.genai.optimize_prompts()` with the OpenAI Agents framework to automatically improve your agent's prompts. The `mlflow.genai.optimize_prompts()` API is framework-agnostic, so you can run end-to-end prompt optimization on agents built with any framework using state-of-the-art techniques. For more details about the API, see Optimize Prompts.
## Prerequisites
```bash
pip install openai-agents mlflow gepa nest_asyncio
```
Set your OpenAI API key:
```bash
export OPENAI_API_KEY="your-api-key"
```
Set the tracking server URI and the MLflow experiment:
```python
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("OpenAI Agents")
```
## Basic Example
Here's a complete example of optimizing a question-answering agent:
```python
import mlflow
from typing import Any
from agents import Agent, Runner
from mlflow.genai.optimize import GepaPromptOptimizer
from mlflow.genai.scorers import scorer

# If you're running inside a notebook, uncomment the following lines.
# import nest_asyncio
# nest_asyncio.apply()

# Step 1: Register your initial prompts
system_prompt = mlflow.genai.register_prompt(
    name="qa-agent-system-prompt",
    template="You're a helpful agent. Follow the user instruction precisely.",
)

user_prompt = mlflow.genai.register_prompt(
    name="qa-agent-user-prompt",
    template="""Answer the question based on the context provided.
Context: {{context}}
Question: {{question}}
Answer:""",
)


# Step 2: Create a prediction function
@mlflow.trace
def predict_fn(context: str, question: str) -> str:
    # Load the prompts from the registry
    system_prompt = mlflow.genai.load_prompt("prompts:/qa-agent-system-prompt@latest")
    user_prompt = mlflow.genai.load_prompt("prompts:/qa-agent-user-prompt@latest")

    # This is your agent
    agent = Agent(
        name="Question Answerer",
        instructions=system_prompt.template,
        model="gpt-4o-mini",
    )

    # Format the user message
    user_message = user_prompt.format(context=context, question=question)

    # Run the agent
    result = Runner.run_sync(agent, user_message)
    return result.final_output


# Step 3: Prepare training data
train_data = [
    {
        "inputs": {
            "context": "Paris is the capital of France.",
            "question": "What is the capital of France?",
        },
        "expectations": {"expected_response": "Paris"},
    },
    {
        "inputs": {
            "context": "The Eiffel Tower was completed in 1889.",
            "question": "When was the Eiffel Tower completed?",
        },
        "expectations": {"expected_response": "1889"},
    },
    # Add more examples...
]


# Step 4: Define a scorer
@scorer
def exact_match(outputs: str, expectations: dict[str, Any]) -> bool:
    return outputs == expectations["expected_response"]


# Step 5: Optimize the prompts
result = mlflow.genai.optimize_prompts(
    predict_fn=predict_fn,
    train_data=train_data,
    prompt_uris=[system_prompt.uri, user_prompt.uri],
    optimizer=GepaPromptOptimizer(
        reflection_model="openai:/gpt-5",
        max_metric_calls=100,
    ),
    scorers=[exact_match],
)

# Step 6: Use the optimized prompts
optimized_system_prompt = result.optimized_prompts[0]
print(f"Optimized system prompt URI: {optimized_system_prompt.uri}")

# Because predict_fn loads its prompts via the @latest alias, it automatically
# picks up the optimized versions.
predict_fn(
    context=(
        "MLflow is an open-source platform for managing the machine learning "
        "lifecycle, providing tools to streamline the development, training, "
        "and deployment of models."
    ),
    question="What is MLflow?",
)
```
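The double-brace placeholders (`{{context}}`, `{{question}}`) in the registered user prompt are filled in by the prompt's `format()` method, as shown in `predict_fn` above. As a rough illustration of that substitution, here is a standalone sketch — the `render_template` helper is hypothetical and is not MLflow's actual implementation:

```python
import re


def render_template(template: str, **values: str) -> str:
    """Substitute {{name}} placeholders with keyword arguments.

    A simplified, hypothetical stand-in for PromptVersion.format().
    """

    def repl(match: re.Match) -> str:
        key = match.group(1)
        if key not in values:
            raise KeyError(f"missing template variable: {key}")
        return values[key]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", repl, template)


template = """Answer the question based on the context provided.
Context: {{context}}
Question: {{question}}
Answer:"""

message = render_template(
    template,
    context="Paris is the capital of France.",
    question="What is the capital of France?",
)
print(message)
```

Keeping the prompt text in the registry and rendering it at call time is what lets the optimizer rewrite the template without any change to `predict_fn`.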