MLflow 3.5.0
MLflow 3.5.0 includes several major features and improvements!
Major Features
- Job Execution Backend: Introduced a new job execution backend infrastructure for running asynchronous tasks with individual execution pools, job search capabilities, and transient error handling. (#17676, #18012, #18070, #18071, #18112, #18049, @WeichenXu123)
- Flexible Prompt Optimization API: Introduced a new flexible API for prompt optimization with support for model switching and the GEPA algorithm, enabling more efficient prompt tuning with fewer rollouts. See the documentation to get started. (#18183, #18031, @TomeHirata)
- Enhanced UI Onboarding: Improved in-product onboarding experience with trace quickstart drawer and updated homepage guidance to help users discover MLflow's latest features. (#18098, #18187, @B-Step62)
- Security Middleware for Tracking Server: Added a security middleware layer to protect against DNS rebinding, CORS attacks, and other security threats. Read the documentation for configuration details. (#17910, @BenWilson2)
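To illustrate the kind of check a DNS rebinding defense typically relies on, here is a minimal sketch of a WSGI middleware that rejects requests whose `Host` header is not on an allow-list. This is not MLflow's implementation: the class name, the helper functions, and the `ALLOWED_HOSTS` values are all hypothetical, and the port stripping only handles hostnames and IPv4 addresses.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical allow-list; a real deployment would configure its own hostnames.
ALLOWED_HOSTS = {"localhost", "127.0.0.1", "mlflow.example.internal"}


def host_only(host_header):
    """Drop an optional ":port" suffix (hostnames and IPv4 only, not IPv6)."""
    return host_header.split(":", 1)[0].lower()


class HostValidationMiddleware:
    """Illustrative middleware: reject requests with an unexpected Host header."""

    def __init__(self, app, allowed_hosts):
        self.app = app
        self.allowed_hosts = {h.lower() for h in allowed_hosts}

    def __call__(self, environ, start_response):
        host = host_only(environ.get("HTTP_HOST", ""))
        if host not in self.allowed_hosts:
            # A rebinding attack reaches the server under an attacker-controlled name.
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Invalid Host header"]
        return self.app(environ, start_response)


def demo_app(environ, start_response):
    """Stand-in for the wrapped application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


def call(app, host):
    """Invoke a WSGI app with the given Host header; return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["HTTP_HOST"] = host
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


protected = HostValidationMiddleware(demo_app, ALLOWED_HOSTS)
```

Calling `call(protected, "localhost:5000")` passes the request through to the wrapped app, while a request carrying an unlisted host name is answered with a 403 before the app runs. The configuration options MLflow actually exposes are described in its own documentation.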
Features
- [Tracing / Tracking] Add `unlink_traces_from_run` batch operation (#18316, @harupy)
- [Tracing] Add batch trace link/unlink operations to DatabricksTracingRestStore (#18295, @harupy)
- [Tracking] Claude Code SDK autologging support (#18022, @smoorjani)
- [Tracing] Add support for reading trace configuration from environment variables (#17792, @joelrobin18)
- [Tracking] Mistral tracing improvements (#16370, @joelrobin18)
- [Tracking] Gemini token count tracking (#16248, @joelrobin18)
- [Tracking] Gemini streaming support (#16249, @joelrobin18)
- [Tracking] CrewAI token count tracking with documentation updates (#16373, @joelrobin18)
- [Evaluation] Allow passing empty scorer list for manual result comparison (#18265, @B-Step62)
- [Evaluation] Log assessments to DSPy evaluation traces (#18136, @B-Step62)
- [Evaluation] Add support for trace inputs to built-in scorers (#17943, @BenWilson2)
- [Evaluation] Add synonym handling for built-in scorers (#17980, @BenWilson2)
- [Evaluation] Add span timing tool for Agent Judges (#17948, @BenWilson2)
- [Evaluation] Allow disabling evaluation sample check (#18032, @B-Step62)
- [Evaluation] Reduce verbosity of SIMBA optimizer logs when aligning judges (#17795, @BenWilson2)
- [Evaluation] Add `__repr__` method for Judges (#17794, @BenWilson2)
- [Prompts] Add prompt registry support to MLflow webhooks (#17640, @harupy)
- [Prompts] Prompt Registry Chat UI (#17334, @joelrobin18)
- [UI] Delete parent and child runs together (#18052, @joelrobin18)
- [UI] Added move to top, move to bottom for charts (#17742, @joelrobin18)
- [Tracking] Use sampling data for run comparison to improve performance (#17645, @lkuo)
- [Tracking] Add optional 'outputs' column for evaluation dataset records (#17735, @WeichenXu123)
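As a rough illustration of what reading trace configuration from environment variables can look like, the sketch below loads two settings into a small config object, falling back to defaults when a variable is unset. Everything here is hypothetical: the `DEMO_TRACE_*` variable names, the `TraceConfig` fields, and the parsing rules are placeholders, not the variables or options MLflow defines.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceConfig:
    """Hypothetical settings; not MLflow's actual configuration fields."""

    enabled: bool = True
    sampling_ratio: float = 1.0


def _parse_bool(value):
    """Interpret common truthy spellings; anything else counts as False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_trace_config(env=None):
    """Build a TraceConfig from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    defaults = TraceConfig()

    # DEMO_TRACE_ENABLED and DEMO_TRACE_SAMPLING_RATIO are placeholder names.
    if "DEMO_TRACE_ENABLED" in env:
        enabled = _parse_bool(env["DEMO_TRACE_ENABLED"])
    else:
        enabled = defaults.enabled

    ratio = float(env.get("DEMO_TRACE_SAMPLING_RATIO", defaults.sampling_ratio))
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("sampling ratio must be between 0.0 and 1.0")

    return TraceConfig(enabled=enabled, sampling_ratio=ratio)
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to test, for example `load_trace_config({"DEMO_TRACE_SAMPLING_RATIO": "0.25"})`.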
Bug Fixes
- [Tracing] Fix parent run resolution mechanism for LangChain (#17273, @B-Step62)
- [Tracing] Add client-side retry for `get_trace` to improve reliability (#18224, @B-Step62)
- [Tracing] Fix OpenTelemetry dual export (#18163, @B-Step62)
- [Tracing] Suppress false warnings from span logging (#18092, #18276, @B-Step62)
- [Tracing] Fix OpenTelemetry resource attributes not propagating correctly (#18019, @xiaosha007)
- [Tracing] Fix DSPy prompt display (#17988, @B-Step62)
- [Tracing] Fix usage aggregation to avoid ancestor duplication (#17921, @TomeHirata)
- [Tracing] Fix double counting in Strands tracing (#17855, @joelrobin18)
- [Tracing] Fix `to_predict_fn` to handle traces without tags field (#17784, @harupy)
- [Tracing] URL-encode trace tag keys in `delete_trace_tag` to prevent 404 errors (#18232, @copilot-swe-agent)
- [Tracking] Fix Claude Code autologging inputs not displaying (#17858, @smoorjani)
- [Tracking] Fix runs with 0-valued metrics not appearing in experiment list contour plots (#17916, @WeichenXu123)
- [Tracking] Fix DSPy run display (#18137, @B-Step62)
- [Tracking] Allow list of types in tools JSON Schema for OpenAI autolog (#17908, @fedem96)
- [Tracking] Set tracking URI environment variable for job runner (#18073, @WeichenXu123)
- [Evaluation] Add atomicity to `job_start` API (#18226, @BenWilson2)
- [Evaluation] Fix trace ingest for outputs in `merge_records()` API (#18047, @BenWilson2)
- [Evaluation] Fix judge regression (#18039, @B-Step62)
- [Evaluation] Fix judges to use non-empty user messages for Anthropic model compatibility (#17935, @dbczumar)
- [Evaluation] Fix endpoints error in judge (#18048, @joelrobin18)
- [Model Registry] Fix creating model versions from non-Databricks tracking to Databricks Unity Catalog registry (#18244, @austinwarner-8451)
- [Model Registry] Fix registry URI instantiation for artifact download (#17982, @arpitjasa-db)
- [Model Registry] Include original error details in Unity Catalog model copy failure messages (#17997, @harupy)
- [Model Registry] Fix webhook delivery to exit early for FileStore instances (#18015, @copilot-swe-agent)
- [Prompts] Fix error suppression during prompt alias resolution when `allow_missing` is set (#17541, @mr-brobot)
- [UI] General UI improvements (#18281, @joelrobin18)
- [Models] Fix dataset issue (#18081, @joelrobin18)
- [Models] Forward dataset name and digest to PolarsDataset's `to_evaluation_dataset` method (#17886, @sadelcarpio)
- [Build] Fix `mlflow server` exiting immediately when optional `huey` package is missing (#18016, @harupy)
- [Scoring] Fix chat completion arguments (#18248, @aravind-segu)
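The `delete_trace_tag` fix above concerns tag keys that contain characters with special meaning in a URL path. A minimal sketch of the general technique, using only the standard library, is shown below. The route layout and the `tag_delete_path` helper are hypothetical and do not reflect MLflow's actual REST endpoints.

```python
from urllib.parse import quote


def tag_delete_path(trace_id, key):
    """Build a path for deleting a tag, percent-encoding each path segment.

    The "/traces/<id>/tags/<key>" layout is a placeholder for illustration.
    """
    # safe="" makes quote() encode "/" as well, so a key such as "team/owner"
    # stays a single path segment instead of being split into two.
    return f"/traces/{quote(trace_id, safe='')}/tags/{quote(key, safe='')}"


raw_key = "team/owner name"
encoded_path = tag_delete_path("tr-123", raw_key)
```

Without encoding, the slash in `raw_key` would be treated as a path separator and the request would address a route that does not exist, which is one way an unencoded key can produce a 404.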
Documentation Updates
- [Docs] Add self-hosted documentation support (#17986, @B-Step62)
- [Docs] Add GitHub feature requests section to GenAI documentation (#18342, @TomeHirata)
- [Docs] Update Claude Code SDK tracing documentation (#18026, @smoorjani)
- [Docs] Add documentation for Analyze Experiment MCP/CLI command (#17978, @nsthorat)
- [Docs] Add deprecation notice for custom prompt judge (#18287, @smoorjani)
- [Docs] Overhaul scorer documentation (#17930, @B-Step62)
- [Docs] Add default optimizer documentation (#17814, @BenWilson2)
- [Docs] Update TypeScript SDK contribution documentation (#17995, @joelrobin18)
- [Docs] Fix Postgres 18+ mount path in documentation (#18192, @soyun11)
- [Docs] Fix typo: correct variable name from `max_few_show_examples` to `max_few_shot_examples` (#18246, @srinathmkce)
- [Docs] Replace single quotes with double quotes for Windows compatibility (#18266, @PavithraNelluri)
- [Docs] Fix typo in model registry documentation (#18038, @EddieMG)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.