MLflow

Model evaluation

Evaluate models with confidence

Automated evaluation tools for foundational ML techniques like classification and regression.

Built-in metrics and visualizations

MLflow automatically computes standard metrics and visualizations, including ROC curves, precision-recall curves, confusion matrices, and regression diagnostics. The results are logged to the run and surfaced directly in the MLflow UI, so you can explore, compare, and interpret model performance across runs.
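As a rough illustration, the built-in evaluation described above might be invoked as in the sketch below. It assumes a recent MLflow release where `mlflow.models.evaluate` is available (2.x also exposes it as `mlflow.evaluate`), a logged classifier, and a pandas DataFrame of held-out data. The helper name `evaluate_classifier` and its parameters `model_uri`, `eval_df`, and `target_col` are hypothetical, chosen only for this example.

```python
def evaluate_classifier(model_uri, eval_df, target_col="label"):
    """Run MLflow's default evaluator on a logged classifier (illustrative sketch).

    model_uri  -- hypothetical URI of a logged model, e.g. "runs:/<run_id>/model"
    eval_df    -- pandas DataFrame holding the features plus the target column
    target_col -- hypothetical name of the ground-truth column in eval_df
    """
    # Imported here so the helper can be defined without MLflow installed.
    import mlflow

    with mlflow.start_run():
        result = mlflow.models.evaluate(
            model=model_uri,
            data=eval_df,
            targets=target_col,
            model_type="classifier",   # use "regressor" for regression models
            evaluators=["default"],    # the built-in evaluator
        )

    # result.metrics maps metric names to values; result.artifacts holds
    # the logged plots and tables produced by the evaluator.
    return result.metrics, result.artifacts
```

The metrics and artifacts returned here are the same ones logged to the run, so they should also appear on the run's page in the MLflow UI.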

Custom evaluators

You can define your own evaluation logic using the custom evaluator interface. This is useful when standard metrics aren’t enough, for example when a model must be scored against specialized business KPIs or task-specific criteria.
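A lighter-weight way to add custom scoring, short of writing a full evaluator plugin, is a custom metric passed to the default evaluator. The sketch below uses that alternative, not the evaluator interface itself. It assumes `mlflow.models.make_metric` and the `extra_metrics` argument as found in recent MLflow 2.x releases (older releases used `custom_metrics` and an `(eval_df, builtin_metrics)` function signature). The metric name, the cost values, and the helper `build_cost_metric` are hypothetical.

```python
# Hypothetical business costs: a missed positive is assumed to be
# five times as expensive as a false alarm.
FALSE_NEGATIVE_COST = 5.0
FALSE_POSITIVE_COST = 1.0


def avg_misclassification_cost(predictions, targets, metrics):
    """Average cost per example for a binary classifier (labels 0/1).

    `metrics` carries the built-in metrics MLflow has already computed;
    it is unused here but kept to match the assumed eval_fn signature.
    """
    total, count = 0.0, 0
    for predicted, actual in zip(predictions, targets):
        if predicted == 0 and actual == 1:
            total += FALSE_NEGATIVE_COST
        elif predicted == 1 and actual == 0:
            total += FALSE_POSITIVE_COST
        count += 1
    return total / count if count else 0.0


def build_cost_metric():
    """Wrap the function above as an MLflow metric (lower is better)."""
    # Imported here so the pure-Python metric stays usable without MLflow.
    from mlflow.models import make_metric

    return make_metric(
        eval_fn=avg_misclassification_cost,
        greater_is_better=False,
        name="avg_misclassification_cost",
    )
```

Under these assumptions, the metric would be supplied as `extra_metrics=[build_cost_metric()]` in the evaluate call, and its value would be logged alongside the built-in metrics.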