
Safety Judge

The Safety judge evaluates content, whether generated by the application or provided by a user, for harmful, unethical, or otherwise inappropriate material. It returns a pass/fail assessment along with a detailed rationale explaining any safety concerns.

Prerequisites for running the examples

  1. Install MLflow and required packages

    bash
    pip install --upgrade mlflow
  2. Create an MLflow experiment by following the setup your environment quickstart (a minimal example follows this list).

  3. (Optional, if using OpenAI models) Use the native OpenAI SDK to connect to OpenAI-hosted models. Select a model from the available OpenAI models.

    python
    import mlflow
    import os
    import openai

    # Ensure your OPENAI_API_KEY is set in your environment
    # os.environ["OPENAI_API_KEY"] = "<YOUR_API_KEY>" # Uncomment and set if not globally configured

    # Enable auto-tracing for OpenAI
    mlflow.openai.autolog()

    # Create an OpenAI client
    client = openai.OpenAI()

    # Select an LLM
    model_name = "gpt-4o-mini"
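
For step 2, here is a minimal sketch of creating (or activating) an experiment programmatically; the experiment name "safety-judge-demo" is an illustrative placeholder:

python
import mlflow

# Create the experiment if it does not already exist and set it as the
# active experiment ("safety-judge-demo" is a placeholder name)
mlflow.set_experiment("safety-judge-demo")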

Usage examples

The Safety judge can be invoked directly for a single assessment or used with MLflow's evaluation framework for batch evaluation.

python
from mlflow.genai.scorers import Safety

# Assess the safety of a single output
assessment = Safety()(
    outputs="MLflow is an open-source platform for managing the ML lifecycle, including experimentation, reproducibility, and deployment."
)
print(assessment)
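
For batch evaluation, the judge can be passed as a scorer to mlflow.genai.evaluate(). A minimal sketch, assuming a small in-memory dataset of pre-generated outputs (the records and field values are illustrative):

python
import mlflow
from mlflow.genai.scorers import Safety

# Illustrative evaluation records; each "outputs" value is the text the
# Safety judge will assess
eval_dataset = [
    {
        "inputs": {"question": "What is MLflow?"},
        "outputs": "MLflow is an open-source platform for managing the ML lifecycle.",
    },
    {
        "inputs": {"question": "How do I track experiments?"},
        "outputs": "Use mlflow.start_run() to create a run, then log parameters and metrics.",
    },
]

# Evaluate all records with the Safety judge
results = mlflow.genai.evaluate(
    data=eval_dataset,
    scorers=[Safety()],
)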

Select the LLM that powers the judge

You can change the judge model by using the model argument in the judge definition. The model must be specified in the format <provider>:/<model-name>, where <provider> is a LiteLLM-compatible model provider.

For a list of supported models, see selecting judge models.
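
For example, assuming OpenAI as the provider (the specific model name below is an illustrative choice):

python
from mlflow.genai.scorers import Safety

# Configure the judge to use a specific LiteLLM-compatible provider and model
safety_judge = Safety(model="openai:/gpt-4o-mini")

assessment = safety_judge(
    outputs="MLflow Tracking lets you log parameters, metrics, and artifacts."
)
print(assessment)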

Next steps