MLflow 2.12.1 includes several major features and improvements
With this release, we're pleased to introduce several major new features that are focused on enhanced GenAI support, Deep Learning workflows involving images, expanded table logging functionality, and general usability enhancements within the UI and external integrations.
Major New Features
PromptFlow
Introducing the new PromptFlow flavor, designed to enrich the GenAI landscape within MLflow. This feature simplifies the creation and management of dynamic prompts, enhancing user interaction with AI models and streamlining prompt engineering processes. (#11311, #11385 @brynn-code)
Enhanced Metadata Sharing for Unity Catalog
MLflow now supports the ability to share metadata (and not model weights) within Databricks Unity Catalog. When logging a model, this functionality enables the automatic duplication of metadata into a dedicated subdirectory, distinct from the model’s actual storage location, allowing for different sharing permissions and access control limits. (#11357, #11720 @WeichenXu123)
Code Paths Unification and Standardization
We have unified and standardized the code_paths parameter across all MLflow flavors to ensure a cohesive and streamlined user experience. This change promotes consistency and reduces complexity in the model deployment lifecycle. (#11688, @BenWilson2)
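As a rough sketch of what the unified parameter looks like from the pyfunc flavor, the snippet below logs a custom model and bundles local source directories through code_paths. The EchoModel class, the existing_code_paths helper, and any directory names passed in are hypothetical; only mlflow.pyfunc.log_model and its code_paths argument come from MLflow.

```python
import os


def existing_code_paths(candidates):
    """Keep only the candidate paths that exist on disk (hypothetical helper)."""
    return [path for path in candidates if os.path.exists(path)]


def log_model_with_code(candidates):
    """Log a trivial pyfunc model and bundle local code alongside it."""
    # Imported lazily so the helper above can be used without MLflow installed.
    import mlflow

    class EchoModel(mlflow.pyfunc.PythonModel):
        # Hypothetical model that returns its input unchanged.
        def predict(self, context, model_input, params=None):
            return model_input

    with mlflow.start_run():
        return mlflow.pyfunc.log_model(
            artifact_path="model",
            python_model=EchoModel(),
            # code_paths is the standardized name; pyfunc's code_path is deprecated.
            code_paths=existing_code_paths(candidates) or None,
        )
```

Calling log_model_with_code(["src"]) would log the model with the src directory attached if it exists, and with no extra code otherwise.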
ChatOpenAI and AzureChatOpenAI Support
Support for the ChatOpenAI and AzureChatOpenAI interfaces has been integrated into the LangChain flavor, facilitating seamless deployment of conversational AI models. This development opens new doors for building sophisticated and responsive chat applications leveraging cutting-edge language models. (#11644, @B-Step62)
Custom Models in Sentence-Transformers
The sentence-transformers flavor now supports custom models, allowing greater flexibility in deploying tailored NLP solutions. (#11635, @B-Step62)
Native MLflow Image support in the log_image API
Added support for optimized image logging, including step-based iterative logging for images generated as part of a training run. This makes it easier to track image generation, classification, segmentation, enhancement, and object detection deep learning models. (#11243, #11404, @jessechancy)
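The step-based logging described above might be used roughly as follows. This is a minimal sketch, assuming numpy and Pillow are installed next to MLflow; the make_gradient helper, the "gradient" key, and the image size are illustrative choices rather than anything prescribed by the release.

```python
def make_gradient(height, width, shift):
    """Build a height x width RGB image as nested lists of 0-255 ints."""
    return [
        [[(x * 8 + shift) % 256, (y * 8 + shift) % 256, 128] for x in range(width)]
        for y in range(height)
    ]


def log_gradient_images(num_steps=3):
    """Log one synthetic image per step under a single key."""
    # Imported lazily; both packages must be installed to run this function.
    import mlflow
    import numpy as np

    with mlflow.start_run():
        for step in range(num_steps):
            pixels = np.array(make_gradient(32, 32, shift=step * 40), dtype=np.uint8)
            # Passing key and step (instead of artifact_file) associates the
            # image with a training step.
            mlflow.log_image(pixels, key="gradient", step=step)
```

In a real training loop, the synthetic gradient would be replaced by a sample produced by the model at that step.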
Image Support for Log Table
With the addition of image support in log_table, MLflow enhances its capabilities in handling rich media. This functionality allows for direct logging and visualization of images within the platform, improving the interpretability and analysis of visual data. (#11535, @jessechancy)
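A table with an image column could be logged along these lines. The sketch assumes numpy, pandas, and Pillow are available; the build_rows helper, the column names, and the predictions.json file name are hypothetical, while mlflow.Image and mlflow.log_table are the MLflow pieces this section refers to.

```python
def build_rows(labels):
    """Pair each label with a solid-shade 8x8 RGB image stored as nested lists."""
    rows = []
    for index, label in enumerate(labels):
        shade = (index * 60) % 256
        pixels = [[[shade, shade, shade] for _ in range(8)] for _ in range(8)]
        rows.append((label, pixels))
    return rows


def log_image_table(labels, artifact_file="predictions.json"):
    """Log a two-column table whose second column holds mlflow.Image objects."""
    # Imported lazily; these packages must be installed to run this function.
    import mlflow
    import numpy as np

    rows = build_rows(labels)
    table = {
        "label": [label for label, _ in rows],
        # Wrapping each array in mlflow.Image marks the cell as an image.
        "image": [mlflow.Image(np.array(pixels, dtype=np.uint8)) for _, pixels in rows],
    }
    with mlflow.start_run():
        mlflow.log_table(data=table, artifact_file=artifact_file)
```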
Streaming Support for LangChain
The newly introduced predict_stream API for LangChain models supports streaming outputs, enabling real-time output for chain invocation via pyfunc. This feature is pivotal for applications requiring continuous data processing and instant feedback. (#11490, #11580 @WeichenXu123)
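Consuming a streamed response through pyfunc might look like the sketch below. The model URI and the {"question": ...} input shape are assumptions that depend on how the chain was logged, and FakeStreamingModel is a hypothetical stand-in that lets the consumer loop run without a logged model; the type of each chunk depends on the chain itself.

```python
def collect_stream(model, inputs):
    """Consume model.predict_stream(inputs) and return the chunks in order."""
    chunks = []
    for chunk in model.predict_stream(inputs):
        # A real application would forward each chunk to the client here.
        chunks.append(chunk)
    return chunks


def stream_from_logged_model(model_uri, inputs):
    """Load a logged LangChain model via pyfunc and stream its output."""
    # Imported lazily so collect_stream can be exercised without MLflow.
    import mlflow

    model = mlflow.pyfunc.load_model(model_uri)
    return collect_stream(model, inputs)


class FakeStreamingModel:
    """Hypothetical stand-in that yields the question back word by word."""

    def predict_stream(self, inputs):
        for word in inputs["question"].split():
            yield word
```

Because predict_stream returns an iterator, the caller can act on each chunk as it arrives instead of waiting for the full response.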
Security Fixes
LFI Security Patch
Addressed a critical Local File Read/Path Traversal vulnerability within the Model Registry, ensuring robust protection against unauthorized access and securing user data integrity. (#11376, @WeichenXu123)
Detailed Release Notes
Features:
- [Models] Add the PromptFlow flavor (#11311, #11385 @brynn-code)
- [Models] Add a new predict_stream API for streamable output for LangChain models and the DatabricksDeploymentClient (#11490, #11580 @WeichenXu123)
- [Models] Deprecate code_path in pyfunc and add code_paths as its alias, standardizing it with other flavor implementations (#11688, @BenWilson2)
- [Models] Add support for custom models within the sentence-transformers flavor (#11635, @B-Step62)
- [Models] Enable Spark MapType support within model signatures when used with Spark udf inference (#11265, @WeichenXu123)
- [Models] Add support for metadata-only sharing within Unity Catalog through the use of a subdirectory (#11357, #11720 @WeichenXu123)
- [Models] Add support for the ChatOpenAI and AzureChatOpenAI LLM interfaces within the LangChain flavor (#11644, @B-Step62)
- [Artifacts] Add support for utilizing presigned URLs when uploading and downloading files when using Unity Catalog (#11534, @artjen)
- [Artifacts] Add a new Image object for handling the logging and optimized compression of images (#11404, @jessechancy)
- [Artifacts] Add time and step-based metadata to the logging of images (#11243, @jessechancy)
- [Artifacts] Add the ability to log a dataset to Unity Catalog by means of UCVolumeDatasetSource (#11301, @chenmoneygithub)
- [Tracking] Remove the requirement that logging a table in Delta format must run within a Databricks environment (#11521, @chenmoneygithub)
- [Tracking] Add support for logging mlflow.Image files within tables (#11535, @jessechancy)
- [Server-infra] Introduce override configurations for controlling how HTTP retries are handled (#11590, @BenWilson2)
- [Deployments] Implement chat & chat streaming for Anthropic within the MLflow deployments server (#11195, @gabrielfu)
Security fixes:
- [Model Registry] Fix Local File Read/Path Traversal (LFI) bypass vulnerability (#11376, @WeichenXu123)
Bug fixes:
- [Model Registry] Fix a registry configuration error that occurs within Databricks serverless clusters (#11719, @WeichenXu123)
- [Model Registry] Delete registered model permissions when deleting the underlying models (#11601, @B-Step62)
- [Model Registry] Disallow % in model names to prevent URL mangling within the UI (#11474, @daniellok-db)
- [Models] Fix an issue where critically important environment configurations were not being captured as langchain dependencies during model logging (#11679, @serena-ruan)
- [Models] Patch the LangChain loading functions to handle uncorrectable pickle-related exceptions that are thrown when loading a model in certain versions (#11582, @B-Step62)
- [Models] Fix a regression in the sklearn flavor to reintroduce support for custom prediction methods (#11577, @B-Step62)
- [Models] Fix an inconsistent and unreliable implementation for batch support within the langchain flavor (#11485, @WeichenXu123)
- [Models] Fix loading remote-code-dependent transformers models that contain custom code (#11412, @daniellok-db)
- [Models] Remove the legacy conversion logic within the transformers flavor that generates an inconsistent input example display within the MLflow UI (#11508, @B-Step62)
- [Models] Fix an issue with Keras autologging iteration input handling (#11394, @WeichenXu123)
- [Models] Fix an issue with keras autologging training dataset generator (#11383, @WeichenXu123)
- [Tracking] Fix an issue where a module would be imported multiple times when logging a langchain model (#11553, @sunishsheth2009)
- [Tracking] Fix the sampling logic within the GetSampledHistoryBulkInterval API to produce more consistent results when displayed within the UI (#11475, @daniellok-db)
- [Tracking] Fix import issues and properly resolve dependencies of langchain and langchain_community within langchain models when logging (#11450, @sunishsheth2009)
- [Tracking] Improve the performance of asynchronous logging (#11346, @chenmoneygithub)
- [Deployments] Add middle-of-name truncation to excessively long deployment names for Sagemaker image deployment (#11523, @BenWilson2)
Documentation updates:
- [Docs] Clarify the code_paths docstrings and make them consistent across the API documentation (#11675, @BenWilson2)
- [Docs] Add documentation guidance for sentence-transformers OpenAI-compatible API interfaces (#11373, @es94129)
For a comprehensive list of changes, see the release change log, and check out the latest documentation on mlflow.org.