mlflow.gateway

class mlflow.gateway.MlflowGatewayClient(gateway_uri: Optional[str] = None)[source]

Client for interacting with the MLflow Gateway API.

Parameters

gateway_uri – Optional URI of the gateway. If not provided, the client attempts to resolve it first from the value stored by set_gateway_uri(), then from the environment variable MLFLOW_GATEWAY_URI.
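The lookup order described for gateway_uri can be sketched as below. This is an illustrative stand-in, not MLflow's implementation: the function name `resolve_gateway_uri`, the `stored_uri` argument (standing in for whatever set_gateway_uri() stored), and the `ValueError` raised at the end are all assumptions made for the example.

```python
import os
from typing import Mapping, Optional


def resolve_gateway_uri(
    explicit_uri: Optional[str] = None,
    stored_uri: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper mirroring the documented lookup order."""
    env = os.environ if environ is None else environ
    # 1. An explicitly passed gateway_uri takes precedence.
    if explicit_uri:
        return explicit_uri
    # 2. Otherwise use the value stored by set_gateway_uri(), if any.
    if stored_uri:
        return stored_uri
    # 3. Finally, fall back to the MLFLOW_GATEWAY_URI environment variable.
    env_uri = env.get("MLFLOW_GATEWAY_URI")
    if env_uri:
        return env_uri
    # Assumption: the real client raises its own exception type here.
    raise ValueError("No gateway URI could be resolved")
```

Passing `environ` explicitly keeps the sketch easy to test without touching the real process environment.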

create_route(name: str, route_type: Optional[str] = None, model: Optional[Dict[str, Any]] = None) → mlflow.gateway.config.Route [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Create a new route in the Gateway.

Warning

This API is only available when running within Databricks. When running elsewhere, route configuration is handled via updates to the route configuration YAML file that is specified during Gateway server start.

Parameters
  • name – The name of the route. This parameter is required for all routes.

  • route_type – The type of the route (e.g., ‘llm/v1/chat’, ‘llm/v1/completions’, ‘llm/v1/embeddings’). This parameter is required for routes that are not managed by Databricks (the provider isn’t ‘databricks’).

  • model

    A dictionary representing the model details to be associated with the route. This parameter is required for all routes. This dictionary should define:

    • The model name (e.g., “gpt-4o-mini”)

    • The provider (e.g., “openai”, “anthropic”)

    • The configuration for the model used in the route

Returns

A serialized representation of the Route data structure, providing information about the name, type, and model details for the newly created route endpoint.

Raises

mlflow.MlflowException – If the function is not running within Databricks.

Note

See the official Databricks documentation for MLflow Gateway for examples of supported model configurations and how to dynamically create new routes within Databricks.

Example usage from within Databricks:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")

openai_api_key = ...

new_route = gateway_client.create_route(
    name="my-route",
    route_type="llm/v1/completions",
    model={
        "name": "question-answering-bot",
        "provider": "openai",
        "openai_config": {
            "openai_api_key": openai_api_key,
        },
    },
)
delete_route(name: str) → None [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Delete an existing route in the Gateway.

Warning

This API is only available when running within Databricks. When running elsewhere, route deletion is handled by removing the corresponding entry from the route configuration YAML file that is specified during Gateway server start.

Parameters

name – The name of the route to delete.

Raises

mlflow.MlflowException – If the function is not running within Databricks.

Example usage from within Databricks:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")
gateway_client.delete_route("my-existing-route")
property gateway_uri

Get the current value for the URI of the MLflow Gateway.

Returns

The gateway URI.

get_limits(route: str) → mlflow.gateway.config.LimitsConfig [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Get limits of an existing route in the Gateway.

Warning

This API is only available when connected to a Databricks-hosted AI Gateway.

Parameters

route – The name of the route to get limits of.

Returns

The returned data structure is a serialized representation of the Limit data structure, giving information about the renewal_period, key, and calls.

Example usage:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")

gateway_client.get_limits("my-new-route")
get_route(name: str)[source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Get a specific query route from the gateway. The routes that are available to retrieve are only those that have been configured through the MLflow Gateway Server configuration file (set during server start or through server update commands).

Parameters

name – The name of the route.

Returns

The returned data structure is a serialized representation of the Route data structure, giving information about the name, type, and model details (model name and provider) for the requested route endpoint.

query(route: str, data: Dict[str, Any])[source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Submit a query to a configured provider route.

Parameters
  • route – The name of the route to submit the query to.

  • data

    The data to send in the query. A dictionary representing the per-route specific structure required for a given provider.

    For chat, the structure should be:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-chat-route",
        {
            "messages": [
                {"role": "user", "content": "Tell me a joke about rabbits"},
            ]
        },
    )
    

    For completions, the structure should be:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-completions-route", {"prompt": "It's one small step for"}
    )
    

    For embeddings, the structure should be:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-embeddings-route",
        {"text": ["It was the best of times", "It was the worst of times"]},
    )
    

    Additional parameters that are valid for a given provider and route configuration can be included with the request as shown below, using an openai completions route request as an example:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-completions-route",
        {
            "prompt": "Give me an example of a properly formatted pytest unit test",
            "temperature": 0.3,
            "max_tokens": 500,
        },
    )
    

Returns

The route’s response as a dictionary, standardized to the route type.
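The payload shapes listed for the data parameter can be summarized in a small helper. This is a sketch only: `build_payload` is a hypothetical name, not part of MLflow, and it covers just the three route types and key names shown in the examples above.

```python
from typing import Any, Dict, List, Union


def build_payload(
    route_type: str, content: Union[str, List[str]], **extra: Any
) -> Dict[str, Any]:
    """Hypothetical helper: shape `content` for the given route type."""
    payload: Dict[str, Any]
    if route_type == "llm/v1/embeddings":
        # Embeddings routes take a list of strings under "text".
        texts = [content] if isinstance(content, str) else list(content)
        payload = {"text": texts}
    elif route_type in ("llm/v1/chat", "llm/v1/completions"):
        if not isinstance(content, str):
            raise TypeError("chat and completions content must be a string")
        if route_type == "llm/v1/chat":
            # Chat routes take a list of role/content messages.
            payload = {"messages": [{"role": "user", "content": content}]}
        else:
            # Completions routes take a single prompt string.
            payload = {"prompt": content}
    else:
        raise ValueError(f"unknown route type: {route_type!r}")
    # Provider-specific extras (e.g. temperature, max_tokens) ride along as-is.
    payload.update(extra)
    return payload
```

The resulting dictionary would then be passed as the second argument to query(), for example `gateway_client.query("my-completions-route", build_payload("llm/v1/completions", "Hello", temperature=0.3))`.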

search_routes(page_token: Optional[str] = None) → PagedList[mlflow.gateway.config.Route] [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Search for routes in the Gateway.

Parameters

page_token – Token specifying the next page of results. It should be obtained from a prior search_routes() call.

Returns

A list of all configured and initialized Route data for the MLflow Gateway Server, with each entry detailing the name, type, and model details of an active route endpoint.
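Collecting every route means following page tokens until none is returned. The sketch below shows that loop against a stand-in client so it runs without a Gateway server. `FakePage` and `FakeClient` are hypothetical, and the assumption that the returned page exposes the next-page token as a `token` attribute should be checked against the PagedList type in your MLflow version.

```python
from typing import List, Optional


class FakePage(list):
    """Stand-in for a paged result: a list carrying a next-page `token`."""

    def __init__(self, items: List[str], token: Optional[str]):
        super().__init__(items)
        self.token = token


class FakeClient:
    """Hypothetical client that serves route names two at a time."""

    def __init__(self, names: List[str]):
        self._names = names

    def search_routes(self, page_token: Optional[str] = None) -> FakePage:
        start = int(page_token) if page_token else 0
        end = start + 2
        next_token = str(end) if end < len(self._names) else None
        return FakePage(self._names[start:end], next_token)


def list_all_routes(client) -> List[str]:
    """Follow page tokens until the client reports no further page."""
    routes: List[str] = []
    token: Optional[str] = None
    while True:
        page = client.search_routes(page_token=token)
        routes.extend(page)
        token = page.token
        if not token:
            return routes
```

With a real client, `list_all_routes(gateway_client)` would follow the same loop, provided the token attribute assumption holds.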

set_limits(route: str, limits: List[Dict[str, Any]]) → mlflow.gateway.config.LimitsConfig [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Set limits on an existing route in the Gateway.

Warning

This API is only available when running within Databricks.

Parameters
  • route – The name of the route to set limits on.

  • limits

    Limits (a list of dictionaries) to set on the route. Each dictionary describes one limit and should define:

    • renewal_period: a string giving the length of the window over which the limit is enforced (currently only “minute” is supported).

    • calls: a non-negative integer giving the number of calls allowed per renewal_period (e.g., 10, 0, 55).

    • key: an optional string selecting the scope of the limit: “user” for a per-user limit or “route” for a per-route limit. If not supplied, the limit defaults to per-route.
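The constraints on each limit dictionary can be checked before calling set_limits(). The following is a minimal sketch under the rules stated above. `normalize_limits` is a hypothetical helper, not an MLflow function, and the server may apply additional validation of its own.

```python
from typing import Any, Dict, List


def normalize_limits(limits: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Hypothetical client-side check of the documented limit fields."""
    normalized = []
    for limit in limits:
        period = limit.get("renewal_period")
        # Only "minute" is documented as a supported window.
        if period != "minute":
            raise ValueError(f"unsupported renewal_period: {period!r}")
        calls = limit.get("calls")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(calls, bool) or not isinstance(calls, int) or calls < 0:
            raise ValueError(f"calls must be a non-negative integer: {calls!r}")
        # A missing (or None) key falls back to the per-route default.
        key = limit.get("key") or "route"
        if key not in ("user", "route"):
            raise ValueError(f"key must be 'user' or 'route': {key!r}")
        normalized.append({"key": key, "renewal_period": period, "calls": calls})
    return normalized
```

The normalized list has the same shape as the `limits` argument, so it can be passed straight to set_limits().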

Returns

The returned data structure is a serialized representation of the Limit data structure, giving information about the renewal_period, key, and calls.

Example usage:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")

gateway_client.set_limits(
    "my-new-route", [{"key": "user", "renewal_period": "minute", "calls": 50}]
)
mlflow.gateway.create_route(name: str, route_type: Optional[str] = None, model: Optional[Dict[str, Any]] = None) → mlflow.gateway.config.Route [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Create a new route in the Gateway.

Warning

This API is only available when running within Databricks. When running elsewhere, route configuration is handled via updates to the route configuration YAML file that is specified during Gateway server start.

Parameters
  • name – The name of the route. This parameter is required for all routes.

  • route_type – The type of the route (e.g., ‘llm/v1/chat’, ‘llm/v1/completions’, ‘llm/v1/embeddings’). This parameter is required for routes that are not managed by Databricks (the provider isn’t ‘databricks’).

  • model

    A dictionary representing the model details to be associated with the route. This parameter is required for all routes. This dictionary should define:

    • The model name (e.g., “gpt-4o-mini”)

    • The provider (e.g., “openai”, “anthropic”)

    • The configuration for the model used in the route

Returns

A serialized representation of the Route data structure, providing information about the name, type, and model details for the newly created route endpoint.

Note

See the official Databricks documentation for MLflow Gateway for examples of supported model configurations and how to dynamically create new routes within Databricks.

Example usage from within Databricks:

from mlflow.gateway import set_gateway_uri, create_route

set_gateway_uri(gateway_uri="databricks")

openai_api_key = ...

create_route(
    name="my-route",
    route_type="llm/v1/completions",
    model={
        "name": "question-answering-bot",
        "provider": "openai",
        "openai_config": {
            "openai_api_key": openai_api_key,
        },
    },
)
mlflow.gateway.delete_route(name: str) → None [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Delete an existing route in the Gateway.

Warning

This API is only available when running within Databricks. When running elsewhere, route deletion is handled by removing the corresponding entry from the route configuration YAML file that is specified during Gateway server start.

Parameters

name – The name of the route to delete.

Example usage from within Databricks:

from mlflow.gateway import set_gateway_uri, delete_route

set_gateway_uri(gateway_uri="databricks")

delete_route("my-new-route")
mlflow.gateway.get_gateway_uri() → str [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Returns the currently set MLflow AI Gateway server URI. If the gateway URI has not been set with set_gateway_uri, an MlflowException is raised.

mlflow.gateway.get_limits(route: str) → mlflow.gateway.config.LimitsConfig [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Get limits of an existing route in the Gateway.

Warning

This API is only available when connected to a Databricks-hosted AI Gateway.

Parameters

route – The name of the route to get limits of.

Example usage from within Databricks:

from mlflow.gateway import set_gateway_uri, get_limits

set_gateway_uri(gateway_uri="databricks")

get_limits("my-new-route")
mlflow.gateway.get_route(name: str) → mlflow.gateway.config.Route [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Retrieves a specific route from the MLflow Gateway service.

This function creates an instance of MlflowGatewayClient and uses it to fetch a route by its name from the Gateway service.

Parameters

name – The name of the route to fetch.

Returns

An instance of the Route class representing the fetched route.

mlflow.gateway.query(route: str, data)[source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Issues a query request to a configured service through a named route on the Gateway Server. This function will interface with a configured route name (examples below) and return the response from the provider in a standardized format.

Parameters
  • route – The name of the configured route. Route names can be obtained by running mlflow.gateway.search_routes().

  • data – The request payload to be submitted to the route. The exact configuration of the expected structure varies based on the route configuration.

Returns

The response from the configured route endpoint provider in a standardized format.

Chat example:

from mlflow.gateway import query, set_gateway_uri

set_gateway_uri(gateway_uri="http://my.gateway:9000")
response = query(
    "my_chat_route",
    {"messages": [{"role": "user", "content": "What is the best day of the week?"}]},
)

Completions example:

from mlflow.gateway import query, set_gateway_uri

set_gateway_uri(gateway_uri="http://my.gateway:9000")
response = query("a_completions_route", {"prompt": "Where do we go from"})

Embeddings example:

from mlflow.gateway import query, set_gateway_uri

set_gateway_uri(gateway_uri="http://my.gateway:9000")
response = query(
    "embeddings_route", {"text": ["I like spaghetti", "and sushi", "but not together"]}
)

Additional parameters that are valid for a given provider and route configuration can be included with the request as shown below, using an openai completions route request as an example:

from mlflow.gateway import query, set_gateway_uri

set_gateway_uri(gateway_uri="http://my.gateway:9000")
response = query(
    "a_completions_route",
    {
        "prompt": "Give me an example of a properly formatted pytest unit test",
        "temperature": 0.6,
        "max_tokens": 1000,
    },
)
mlflow.gateway.search_routes() → List[mlflow.gateway.config.Route] [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Searches for routes in the MLflow Gateway service.

This function creates an instance of MlflowGatewayClient and uses it to fetch a list of routes from the Gateway service.

Returns

A list of Route instances.

mlflow.gateway.set_gateway_uri(gateway_uri: str)[source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Sets the URI of a configured and running MLflow AI Gateway server in a global context.

Calling this function with a valid URI is required before using the MLflow AI Gateway fluent APIs.

Parameters

gateway_uri – The full URI of a running MLflow AI Gateway server or, if running on Databricks, “databricks”.

mlflow.gateway.set_limits(route: str, limits: List[Dict[str, Any]]) → mlflow.gateway.config.LimitsConfig [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Set limits on an existing route in the Gateway.

Warning

This API is only available when running within Databricks.

Parameters
  • route – The name of the route to set limits on.

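  • limits – Limits to set on the route: a list of dictionaries, each defining renewal_period, calls, and optionally key (see MlflowGatewayClient.set_limits()).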

Example usage from within Databricks:

from mlflow.gateway import set_gateway_uri, set_limits

set_gateway_uri(gateway_uri="databricks")

set_limits("my-new-route", [{"key": "user", "renewal_period": "minute", "calls": 50}])
class mlflow.gateway.base_models.ConfigModel[source]

A pydantic model representing Gateway configuration data, such as an OpenAI completions route definition including route name, model name, API keys, etc.

class mlflow.gateway.client.MlflowGatewayClient(gateway_uri: Optional[str] = None)[source]

Client for interacting with the MLflow Gateway API.

Parameters

gateway_uri – Optional URI of the gateway. If not provided, the client attempts to resolve it first from the value stored by set_gateway_uri(), then from the environment variable MLFLOW_GATEWAY_URI.

create_route(name: str, route_type: Optional[str] = None, model: Optional[Dict[str, Any]] = None) → mlflow.gateway.config.Route [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Create a new route in the Gateway.

Warning

This API is only available when running within Databricks. When running elsewhere, route configuration is handled via updates to the route configuration YAML file that is specified during Gateway server start.

Parameters
  • name – The name of the route. This parameter is required for all routes.

  • route_type – The type of the route (e.g., ‘llm/v1/chat’, ‘llm/v1/completions’, ‘llm/v1/embeddings’). This parameter is required for routes that are not managed by Databricks (the provider isn’t ‘databricks’).

  • model

    A dictionary representing the model details to be associated with the route. This parameter is required for all routes. This dictionary should define:

    • The model name (e.g., “gpt-4o-mini”)

    • The provider (e.g., “openai”, “anthropic”)

    • The configuration for the model used in the route

Returns

A serialized representation of the Route data structure, providing information about the name, type, and model details for the newly created route endpoint.

Raises

mlflow.MlflowException – If the function is not running within Databricks.

Note

See the official Databricks documentation for MLflow Gateway for examples of supported model configurations and how to dynamically create new routes within Databricks.

Example usage from within Databricks:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")

openai_api_key = ...

new_route = gateway_client.create_route(
    name="my-route",
    route_type="llm/v1/completions",
    model={
        "name": "question-answering-bot",
        "provider": "openai",
        "openai_config": {
            "openai_api_key": openai_api_key,
        },
    },
)
delete_route(name: str) → None [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Delete an existing route in the Gateway.

Warning

This API is only available when running within Databricks. When running elsewhere, route deletion is handled by removing the corresponding entry from the route configuration YAML file that is specified during Gateway server start.

Parameters

name – The name of the route to delete.

Raises

mlflow.MlflowException – If the function is not running within Databricks.

Example usage from within Databricks:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")
gateway_client.delete_route("my-existing-route")
property gateway_uri

Get the current value for the URI of the MLflow Gateway.

Returns

The gateway URI.

get_limits(route: str) → mlflow.gateway.config.LimitsConfig [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Get limits of an existing route in the Gateway.

Warning

This API is only available when connected to a Databricks-hosted AI Gateway.

Parameters

route – The name of the route to get limits of.

Returns

The returned data structure is a serialized representation of the Limit data structure, giving information about the renewal_period, key, and calls.

Example usage:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")

gateway_client.get_limits("my-new-route")
get_route(name: str)[source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Get a specific query route from the gateway. The routes that are available to retrieve are only those that have been configured through the MLflow Gateway Server configuration file (set during server start or through server update commands).

Parameters

name – The name of the route.

Returns

The returned data structure is a serialized representation of the Route data structure, giving information about the name, type, and model details (model name and provider) for the requested route endpoint.

query(route: str, data: Dict[str, Any])[source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Submit a query to a configured provider route.

Parameters
  • route – The name of the route to submit the query to.

  • data

    The data to send in the query. A dictionary representing the per-route specific structure required for a given provider.

    For chat, the structure should be:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-chat-route",
        {
            "messages": [
                {"role": "user", "content": "Tell me a joke about rabbits"},
            ]
        },
    )
    

    For completions, the structure should be:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-completions-route", {"prompt": "It's one small step for"}
    )
    

    For embeddings, the structure should be:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-embeddings-route",
        {"text": ["It was the best of times", "It was the worst of times"]},
    )
    

    Additional parameters that are valid for a given provider and route configuration can be included with the request as shown below, using an openai completions route request as an example:

    from mlflow.gateway import MlflowGatewayClient
    
    gateway_client = MlflowGatewayClient("http://my.gateway:8888")
    
    response = gateway_client.query(
        "my-completions-route",
        {
            "prompt": "Give me an example of a properly formatted pytest unit test",
            "temperature": 0.3,
            "max_tokens": 500,
        },
    )
    

Returns

The route’s response as a dictionary, standardized to the route type.

search_routes(page_token: Optional[str] = None) → PagedList[mlflow.gateway.config.Route] [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Search for routes in the Gateway.

Parameters

page_token – Token specifying the next page of results. It should be obtained from a prior search_routes() call.

Returns

A list of all configured and initialized Route data for the MLflow Gateway Server, with each entry detailing the name, type, and model details of an active route endpoint.

set_limits(route: str, limits: List[Dict[str, Any]]) → mlflow.gateway.config.LimitsConfig [source]

Warning

MLflow AI gateway is deprecated and has been replaced by the deployments API for generative AI. See https://mlflow.org/docs/latest/llms/gateway/migration.html for migration.

Set limits on an existing route in the Gateway.

Warning

This API is only available when running within Databricks.

Parameters
  • route – The name of the route to set limits on.

  • limits

    Limits (a list of dictionaries) to set on the route. Each dictionary describes one limit and should define:

    • renewal_period: a string giving the length of the window over which the limit is enforced (currently only “minute” is supported).

    • calls: a non-negative integer giving the number of calls allowed per renewal_period (e.g., 10, 0, 55).

    • key: an optional string selecting the scope of the limit: “user” for a per-user limit or “route” for a per-route limit. If not supplied, the limit defaults to per-route.

Returns

The returned data structure is a serialized representation of the Limit data structure, giving information about the renewal_period, key, and calls.

Example usage:

from mlflow.gateway import MlflowGatewayClient

gateway_client = MlflowGatewayClient("databricks")

gateway_client.set_limits(
    "my-new-route", [{"key": "user", "renewal_period": "minute", "calls": 50}]
)
class mlflow.gateway.config.AI21LabsConfig(*, ai21labs_api_key: str)[source]
ai21labs_api_key: str
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'ai21labs_api_key': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

classmethod validate_ai21labs_api_key(value)[source]
class mlflow.gateway.config.AWSBaseConfig(*, aws_region: Optional[str] = None)[source]
aws_region: Optional[str]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'aws_region': FieldInfo(annotation=Union[str, NoneType], required=False, default=None)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class mlflow.gateway.config.AWSIdAndKey(*, aws_region: Optional[str] = None, aws_access_key_id: str, aws_secret_access_key: str, aws_session_token: Optional[str] = None)[source]
aws_access_key_id: str
aws_secret_access_key: str
aws_session_token: Optional[str]
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'aws_access_key_id': FieldInfo(annotation=str, required=True), 'aws_region': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'aws_secret_access_key': FieldInfo(annotation=str, required=True), 'aws_session_token': FieldInfo(annotation=Union[str, NoneType], required=False, default=None)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class mlflow.gateway.config.AWSRole(*, aws_region: Optional[str] = None, aws_role_arn: str, session_length_seconds: int = 900)[source]
aws_role_arn: str
model_config: ClassVar[ConfigDict] = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'aws_region': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'aws_role_arn': FieldInfo(annotation=str, required=True), 'session_length_seconds': FieldInfo(annotation=int, required=False, default=900)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

session_length_seconds: int
class mlflow.gateway.config.AliasedConfigModel[source]

Enables use of field aliases in a configuration model for backwards compatibility.

model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class mlflow.gateway.config.AmazonBedrockConfig(*, aws_config: Union[mlflow.gateway.config.AWSRole, mlflow.gateway.config.AWSIdAndKey, mlflow.gateway.config.AWSBaseConfig])[source]
aws_config: Union[mlflow.gateway.config.AWSRole, mlflow.gateway.config.AWSIdAndKey, mlflow.gateway.config.AWSBaseConfig]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'aws_config': FieldInfo(annotation=Union[AWSRole, AWSIdAndKey, AWSBaseConfig], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class mlflow.gateway.config.AnthropicConfig(*, anthropic_api_key: str, anthropic_version: str = '2023-06-01')[source]
anthropic_api_key: str
anthropic_version: str
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'anthropic_api_key': FieldInfo(annotation=str, required=True), 'anthropic_version': FieldInfo(annotation=str, required=False, default='2023-06-01')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

classmethod validate_anthropic_api_key(value)[source]
class mlflow.gateway.config.CohereConfig(*, cohere_api_key: str)[source]
cohere_api_key: str
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'cohere_api_key': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

classmethod validate_cohere_api_key(value)[source]
class mlflow.gateway.config.GatewayConfig(*, endpoints: List[mlflow.gateway.config.RouteConfig])[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'routes': FieldInfo(annotation=List[mlflow.gateway.config.RouteConfig], required=True, alias='endpoints', alias_priority=2)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

routes: List[mlflow.gateway.config.RouteConfig]
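
The model_fields entry above shows that the routes field is populated from the alias 'endpoints', and 'populate_by_name' allows the field name itself to be used as well. The following is a stdlib-only sketch of that lookup. resolve_routes is a hypothetical helper, not an MLflow function, and checking the alias before the field name is an assumption.

```python
from typing import Any, Dict, List


def resolve_routes(raw: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the route list from a raw gateway config mapping.

    Hypothetical helper: mimics a field named ``routes`` with the alias
    ``endpoints`` and ``populate_by_name=True``. The alias is checked first
    (assumed precedence).
    """
    for key in ("endpoints", "routes"):
        if key in raw:
            return list(raw[key])
    raise KeyError("expected an 'endpoints' (or 'routes') entry")


# Route entry values reuse the example shown for Route on this page.
config = {
    "endpoints": [
        {
            "name": "openai-completions",
            "endpoint_type": "llm/v1/completions",
            "model": {"name": "gpt-4o-mini", "provider": "openai"},
        }
    ]
}
routes = resolve_routes(config)
```
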
class mlflow.gateway.config.HuggingFaceTextGenerationInferenceConfig(*, hf_server_url: str)[source]
hf_server_url: str
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'hf_server_url': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class mlflow.gateway.config.Limit(*, calls: int, key: Optional[str] = None, renewal_period: str)[source]
calls: int
key: Optional[str]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'calls': FieldInfo(annotation=int, required=True), 'key': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'renewal_period': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

renewal_period: str
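
Per the signature above, calls and renewal_period are required and key is optional. The following stdlib-only sketch shows the shape of a limit entry. LimitSketch and as_rate are hypothetical stand-ins, and "minute" is an assumed renewal_period value, since this page does not list the accepted strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LimitSketch:
    """Stand-in mirroring the documented fields of Limit (not the real class)."""

    calls: int
    renewal_period: str
    key: Optional[str] = None

    def as_rate(self) -> str:
        # Hypothetical helper: render the limit as "<calls>/<renewal_period>".
        return f"{self.calls}/{self.renewal_period}"


# "minute" is an assumed value for renewal_period.
limit = LimitSketch(calls=10, renewal_period="minute")
```
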
class mlflow.gateway.config.LimitsConfig(*, limits: Optional[List[mlflow.gateway.config.Limit]] = [])[source]
limits: Optional[List[mlflow.gateway.config.Limit]]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'limits': FieldInfo(annotation=Union[List[mlflow.gateway.config.Limit], NoneType], required=False, default=[])}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

class mlflow.gateway.config.MistralConfig(*, mistral_api_key: str)[source]
mistral_api_key: str
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'mistral_api_key': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

classmethod validate_mistral_api_key(value)[source]
class mlflow.gateway.config.MlflowModelServingConfig(*, model_server_url: str)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'protected_namespaces': ()}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'model_server_url': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

model_server_url: str
class mlflow.gateway.config.Model(*, name: Optional[str] = None, provider: Union[str, mlflow.gateway.config.Provider], config: Optional[mlflow.gateway.base_models.ConfigModel[mlflow.gateway.base_models.ConfigModel]] = None)[source]
config: Optional[mlflow.gateway.base_models.ConfigModel[mlflow.gateway.base_models.ConfigModel]]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'config': FieldInfo(annotation=Union[Annotated[mlflow.gateway.base_models.ConfigModel, SerializeAsAny()], NoneType], required=False, default=None), 'name': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'provider': FieldInfo(annotation=Union[str, Provider], required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: Optional[str]
provider: Union[str, mlflow.gateway.config.Provider]
classmethod validate_config(info, values)[source]
classmethod validate_provider(value)[source]
class mlflow.gateway.config.ModelInfo(*, name: Optional[str] = None, provider: mlflow.gateway.config.Provider)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'name': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'provider': FieldInfo(annotation=Provider, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: Optional[str]
provider: mlflow.gateway.config.Provider
class mlflow.gateway.config.MosaicMLConfig(*, mosaicml_api_key: str, mosaicml_api_base: Optional[str] = None)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'mosaicml_api_base': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'mosaicml_api_key': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

mosaicml_api_base: Optional[str]
mosaicml_api_key: str
classmethod validate_mosaicml_api_key(value)[source]
class mlflow.gateway.config.OpenAIAPIType(value)[source]

An enumeration.

AZURE = 'azure'
AZUREAD = 'azuread'
OPENAI = 'openai'
class mlflow.gateway.config.OpenAIConfig(*, openai_api_key: str, openai_api_type: mlflow.gateway.config.OpenAIAPIType = <OpenAIAPIType.OPENAI: 'openai'>, openai_api_base: Optional[str] = None, openai_api_version: Optional[str] = None, openai_deployment_name: Optional[str] = None, openai_organization: Optional[str] = None)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'openai_api_base': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'openai_api_key': FieldInfo(annotation=str, required=True), 'openai_api_type': FieldInfo(annotation=OpenAIAPIType, required=False, default=<OpenAIAPIType.OPENAI: 'openai'>), 'openai_api_version': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'openai_deployment_name': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'openai_organization': FieldInfo(annotation=Union[str, NoneType], required=False, default=None)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

openai_api_base: Optional[str]
openai_api_key: str
openai_api_type: mlflow.gateway.config.OpenAIAPIType
openai_api_version: Optional[str]
openai_deployment_name: Optional[str]
openai_organization: Optional[str]
classmethod validate_field_compatibility(info: Dict[str, Any])[source]
classmethod validate_openai_api_key(value)[source]
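
This page names validate_field_compatibility but does not state which field combinations it accepts. The stdlib-only sketch below encodes one plausible rule set, and every rule in it is an assumption rather than documented behavior: the Azure API types need a base URL, an API version, and a deployment name, while the plain 'openai' type must not set a deployment name. check_openai_fields is a hypothetical stand-in, not the MLflow validator.

```python
from typing import Any, Dict, Optional

# Values of OpenAIAPIType listed above.
AZURE_TYPES = {"azure", "azuread"}


def check_openai_fields(info: Dict[str, Optional[Any]]) -> Dict[str, Optional[Any]]:
    """Hypothetical field-compatibility check (assumed rules, not MLflow's)."""
    # The signature above shows 'openai' as the default API type.
    api_type = str(info.get("openai_api_type") or "openai").lower()
    if api_type in AZURE_TYPES:
        required = ("openai_api_base", "openai_api_version", "openai_deployment_name")
        missing = [name for name in required if info.get(name) is None]
        if missing:
            raise ValueError(f"{api_type} requires: {', '.join(missing)}")
    elif api_type == "openai":
        if info.get("openai_deployment_name") is not None:
            raise ValueError("openai_deployment_name only applies to Azure API types")
    else:
        raise ValueError(f"unknown openai_api_type: {api_type!r}")
    return info


# The key below is a placeholder, not a real credential.
ok = check_openai_fields({"openai_api_key": "placeholder-key", "openai_api_type": "openai"})
```
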
class mlflow.gateway.config.PaLMConfig(*, palm_api_key: str)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'palm_api_key': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

palm_api_key: str
classmethod validate_palm_api_key(value)[source]
class mlflow.gateway.config.Provider(value)[source]

An enumeration.

AI21LABS = 'ai21labs'
AMAZON_BEDROCK = 'amazon-bedrock'
ANTHROPIC = 'anthropic'
BEDROCK = 'bedrock'
COHERE = 'cohere'
DATABRICKS = 'databricks'
DATABRICKS_MODEL_SERVING = 'databricks-model-serving'
HUGGINGFACE_TEXT_GENERATION_INFERENCE = 'huggingface-text-generation-inference'
MISTRAL = 'mistral'
MLFLOW_MODEL_SERVING = 'mlflow-model-serving'
MOSAICML = 'mosaicml'
OPENAI = 'openai'
PALM = 'palm'
TOGETHERAI = 'togetherai'
classmethod values()[source]
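
The member values above are the strings a configuration supplies in its provider field. The following stdlib-only sketch checks a string against such an enumeration. ProviderSketch holds only a subset of the listed members, is_supported is a hypothetical helper, and the behavior of values() (returning the set of member strings) is an assumption, since this page gives only its name.

```python
from enum import Enum
from typing import Set


class ProviderSketch(str, Enum):
    """Stand-in holding a subset of the members listed above."""

    OPENAI = "openai"
    ANTHROPIC = "anthropic"
    BEDROCK = "bedrock"
    AMAZON_BEDROCK = "amazon-bedrock"

    @classmethod
    def values(cls) -> Set[str]:
        # Assumed behavior: collect every member's string value.
        return {member.value for member in cls}


def is_supported(provider: str) -> bool:
    """Hypothetical helper: True if the string names a known provider."""
    return provider in ProviderSketch.values()


# Looking a member up by value returns the enum member itself.
bedrock = ProviderSketch("amazon-bedrock")
```
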
class mlflow.gateway.config.Route(*, name: str, route_type: str, model: mlflow.gateway.config.RouteModelInfo, route_url: str, limit: Optional[mlflow.gateway.config.Limit] = None)[source]
class Config[source]
json_schema_extra = {'example': {'model': {'name': 'gpt-4o-mini', 'provider': 'openai'}, 'name': 'openai-completions', 'route_type': 'llm/v1/completions', 'route_url': '/gateway/routes/completions/invocations'}}
limit: Optional[mlflow.gateway.config.Limit]
model: mlflow.gateway.config.RouteModelInfo
model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'json_schema_extra': {'example': {'model': {'name': 'gpt-4o-mini', 'provider': 'openai'}, 'name': 'openai-completions', 'route_type': 'llm/v1/completions', 'route_url': '/gateway/routes/completions/invocations'}}}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'limit': FieldInfo(annotation=Union[Limit, NoneType], required=False, default=None), 'model': FieldInfo(annotation=RouteModelInfo, required=True), 'name': FieldInfo(annotation=str, required=True), 'route_type': FieldInfo(annotation=str, required=True), 'route_url': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: str
route_type: str
route_url: str
to_endpoint()[source]
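
The json_schema_extra example above shows the shape of a serialized Route. The following stdlib-only sketch loads that payload into stand-in dataclasses that mirror the documented fields. RouteSketch and RouteModelInfoSketch are hypothetical names, not the real pydantic models, and they do no validation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class RouteModelInfoSketch:
    provider: str
    name: Optional[str] = None


@dataclass
class RouteSketch:
    name: str
    route_type: str
    model: RouteModelInfoSketch
    route_url: str
    limit: Optional[Dict[str, Any]] = None  # optional, defaults to None as documented


# The example payload from json_schema_extra above.
example = {
    "name": "openai-completions",
    "route_type": "llm/v1/completions",
    "model": {"name": "gpt-4o-mini", "provider": "openai"},
    "route_url": "/gateway/routes/completions/invocations",
}

route = RouteSketch(
    name=example["name"],
    route_type=example["route_type"],
    model=RouteModelInfoSketch(**example["model"]),
    route_url=example["route_url"],
)
```
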
class mlflow.gateway.config.RouteConfig(*, name: str, endpoint_type: mlflow.gateway.config.RouteType, model: mlflow.gateway.config.Model, limit: Optional[mlflow.gateway.config.Limit] = None)[source]
limit: Optional[mlflow.gateway.config.Limit]
model: mlflow.gateway.config.Model
model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'limit': FieldInfo(annotation=Union[Limit, NoneType], required=False, default=None), 'model': FieldInfo(annotation=Model, required=True), 'name': FieldInfo(annotation=str, required=True), 'route_type': FieldInfo(annotation=RouteType, required=True, alias='endpoint_type', alias_priority=2)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: str
route_type: mlflow.gateway.config.RouteType
to_route()mlflow.gateway.config.Route[source]
classmethod validate_endpoint_name(route_name)[source]
classmethod validate_limit(value)[source]
classmethod validate_model(model)[source]
classmethod validate_route_type(value)[source]
classmethod validate_route_type_and_model_name(values)[source]
class mlflow.gateway.config.RouteModelInfo(*, name: Optional[str] = None, provider: str)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'name': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'provider': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

name: Optional[str]
provider: str
class mlflow.gateway.config.RouteType(value)[source]

An enumeration.

LLM_V1_CHAT = 'llm/v1/chat'
LLM_V1_COMPLETIONS = 'llm/v1/completions'
LLM_V1_EMBEDDINGS = 'llm/v1/embeddings'
class mlflow.gateway.config.TogetherAIConfig(*, togetherai_api_key: str)[source]
model_config: ClassVar[ConfigDict] = {'extra': 'ignore'}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'togetherai_api_key': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

togetherai_api_key: str
classmethod validate_togetherai_api_key(value)[source]