mlflow.sagemaker

The mlflow.sagemaker module provides an API for deploying MLflow models to Amazon SageMaker.

class mlflow.sagemaker.SageMakerDeploymentClient(target_uri)[source]

Bases: mlflow.deployments.base.BaseDeploymentClient

Initialize a deployment client for SageMaker. The default region and assumed role ARN will be set according to the value of the target_uri.

This class is meant to supersede the other mlflow.sagemaker real-time serving APIs. It is also designed to be used through the mlflow.deployments module, so you can deploy to SageMaker with the mlflow deployments CLI and obtain a client through the mlflow.deployments.get_deploy_client function.

Parameters

target_uri

A URI that follows one of the following formats:

  • sagemaker: This will set the default region to us-west-2 and the default assumed role ARN to None.

  • sagemaker:/region_name: This will set the default region to region_name and the default assumed role ARN to None.

  • sagemaker:/region_name/assumed_role_arn: This will set the default region to region_name and the default assumed role ARN to assumed_role_arn.

When an assumed_role_arn is provided without a region_name, an MlflowException will be raised.
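
The three URI formats above can be illustrated with a small parser. This is a minimal sketch for illustration only, not MLflow's own implementation: the helper name parse_sagemaker_target_uri is hypothetical, and it raises ValueError as a stand-in where the client would raise an MlflowException.

```python
from typing import Optional, Tuple

DEFAULT_REGION = "us-west-2"  # default region stated in the format list above


def parse_sagemaker_target_uri(target_uri: str) -> Tuple[str, Optional[str]]:
    """Hypothetical helper: split a target URI into (region_name, assumed_role_arn)."""
    scheme, _, rest = target_uri.partition(":")
    if scheme != "sagemaker":
        raise ValueError(f"Not a SageMaker target URI: {target_uri!r}")
    rest = rest.lstrip("/")
    if not rest:
        # Plain "sagemaker": default region, no assumed role.
        return DEFAULT_REGION, None
    # Split on the first "/" only, because a role ARN itself contains "/".
    region_name, _, assumed_role_arn = rest.partition("/")
    if region_name.startswith("arn:"):
        # A role ARN was given where the region name is expected.
        raise ValueError("A role ARN requires a region name before it")
    return region_name, (assumed_role_arn or None)


print(parse_sagemaker_target_uri("sagemaker"))
print(parse_sagemaker_target_uri("sagemaker:/us-east-1"))
print(parse_sagemaker_target_uri("sagemaker:/us-east-1/arn:aws:123:role/assumed_role"))
```

Splitting on the first slash only keeps the role ARN intact, since the ARN in the third format contains a slash of its own.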

create_deployment(name, model_uri, flavor=None, config=None, endpoint=None)[source]

Deploy an MLflow model on AWS SageMaker. The currently active AWS account must have correct permissions set up.

This function creates a SageMaker endpoint. For more information about the input data formats accepted by this endpoint, see the MLflow deployment tools documentation.

Parameters
  • name – Name of the deployed application.

  • model_uri

    The location, in URI format, of the MLflow model to deploy to SageMaker. For example:

    • /Users/me/path/to/local/model

    • relative/path/to/local/model

    • s3://my_bucket/path/to/model

    • runs:/<mlflow_run_id>/run-relative/path/to/model

    • models:/<model_name>/<model_version>

    • models:/<model_name>/<stage>

    For more information about supported URI schemes, see Referencing Artifacts.

  • flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.

  • config

    Configuration parameters. The supported parameters are:

    • assume_role_arn: The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If this parameter is not specified, the role given in the target_uri will be used. If the role is not given in the target_uri, defaults to None.

    • execution_role_arn: The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently-assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.

    • bucket: S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.

    • image_url: URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.

    • region_name: Name of the AWS region to which to deploy the application. If unspecified, use the region name given in the target_uri. If it is also not specified in the target_uri, defaults to us-west-2.

    • archive: If True, any pre-existing SageMaker application resources that become inactive (i.e. as a result of deploying in mlflow.sagemaker.DEPLOYMENT_MODE_REPLACE mode) are preserved. These resources may include unused SageMaker models and endpoint configurations that were associated with a prior version of the application endpoint. If False, these resources are deleted. In order to use archive=False, create_deployment() must be executed synchronously with synchronous=True. Defaults to False.

    • instance_type: The type of SageMaker ML instance on which to deploy the model. For a list of supported instance types, see https://aws.amazon.com/sagemaker/pricing/instance-types/. Defaults to ml.m4.xlarge.

    • instance_count: The number of SageMaker ML instances on which to deploy the model. Defaults to 1.

    • synchronous: If True, this function will block until the deployment process succeeds or encounters an irrecoverable failure. If False, this function will return immediately after starting the deployment process. It will not wait for the deployment process to complete; in this case, the caller is responsible for monitoring the health and status of the pending deployment via native SageMaker APIs or the AWS console. Defaults to True.

    • timeout_seconds: If synchronous is True, the deployment process will return after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the health and status of the pending deployment using native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored. Defaults to 300.

    • vpc_config: A dictionary specifying the VPC configuration to use when creating the new SageMaker model associated with this application. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html. Defaults to None.

    • data_capture_config: A dictionary specifying the data capture configuration to use when creating the new SageMaker model associated with this application. For more information, see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html. Defaults to None.

    • variant_name: A string specifying the desired name when creating a production variant. Defaults to None.

    • async_inference_config: A dictionary specifying the asynchronous inference configuration.

    • serverless_config: A dictionary specifying the serverless configuration.

    • env: A dictionary specifying environment variables as key-value pairs to be set for the deployed model. Defaults to None.

    • tags: A dictionary of key-value pairs representing additional tags to be set for the deployed model. Defaults to None.

  • endpoint – (optional) Endpoint to create the deployment under. Currently unsupported.
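
When synchronous is False, or when timeout_seconds elapses first, the caller has to monitor the deployment. The following is a minimal polling sketch under stated assumptions: wait_for_endpoint is a hypothetical helper, and it assumes the status is read from the EndpointStatus field of the SageMaker DescribeEndpoint response, treating InService and Failed as the terminal states.

```python
import time
from typing import Callable

# Assumed terminal values of EndpointStatus in the DescribeEndpoint response.
TERMINAL_STATUSES = {"InService", "Failed"}


def wait_for_endpoint(
    get_status: Callable[[], str],
    timeout_seconds: float = 300,
    poll_seconds: float = 10,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


# Possible usage with boto3 (requires boto3 and AWS credentials):
#   import boto3
#   sm = boto3.client("sagemaker", region_name="us-east-1")
#   wait_for_endpoint(
#       lambda: sm.describe_endpoint(EndpointName="my-deployment")["EndpointStatus"]
#   )
```

Passing the status lookup as a callable keeps the loop independent of any particular AWS client, so it can be exercised without network access.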

Python example
from mlflow.deployments import get_deploy_client

vpc_config = {
    "SecurityGroupIds": [
        "sg-123456abc",
    ],
    "Subnets": [
        "subnet-123456abc",
    ],
}
config = dict(
    assume_role_arn="arn:aws:123:role/assumed_role",
    execution_role_arn="arn:aws:456:role/execution_role",
    bucket="my-s3-bucket",
    image_url="1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1",
    region_name="us-east-1",
    archive=False,
    instance_type="ml.m5.4xlarge",
    instance_count=1,
    synchronous=True,
    timeout_seconds=300,
    vpc_config=vpc_config,
    variant_name="prod-variant-1",
    env={"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": '"--timeout 60"'},
    tags={"training_timestamp": "2022-11-01T05:12:26"},
)
client = get_deploy_client("sagemaker")
client.create_deployment(
    "my-deployment",
    model_uri="/mlruns/0/abc/model",
    flavor="python_function",
    config=config,
)
Command-line example
mlflow deployments create --target sagemaker:/us-east-1/arn:aws:123:role/assumed_role \
        --name my-deployment \
        --model-uri /mlruns/0/abc/model \
        --flavor python_function \
        -C execution_role_arn=arn:aws:456:role/execution_role \
        -C bucket=my-s3-bucket \
        -C image_url=1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1 \
        -C region_name=us-east-1 \
        -C archive=False \
        -C instance_type=ml.m5.4xlarge \
        -C instance_count=1 \
        -C synchronous=True \
        -C timeout_seconds=300 \
        -C variant_name=prod-variant-1 \
        -C vpc_config='{"SecurityGroupIds": ["sg-123456abc"], "Subnets": ["subnet-123456abc"]}' \
        -C data_capture_config='{"EnableCapture": true, "InitialSamplingPercentage": 100, "DestinationS3Uri": "s3://my-bucket/path", "CaptureOptions": [{"CaptureMode": "Output"}]}' \
        -C env='{"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": "\"--timeout 60\""}' \
        -C tags='{"training_timestamp": "2022-11-01T05:12:26"}'
create_endpoint(name, config=None)[source]

Create an endpoint with the specified target. By default, this method should block until creation completes (i.e. until it’s possible to create a deployment within the endpoint). In the case of conflicts (e.g. if it’s not possible to create the specified endpoint due to conflict with an existing endpoint), raises a mlflow.exceptions.MlflowException. See target-specific plugin documentation for additional detail on support for asynchronous creation and other configuration.

Parameters
  • name – Unique name to use for endpoint. If another endpoint exists with the same name, raises a mlflow.exceptions.MlflowException.

  • config – (optional) Dict containing target-specific configuration for the endpoint.

Returns

Dict corresponding to created endpoint, which must contain the ‘name’ key.

delete_deployment(name, config=None, endpoint=None)[source]

Delete a SageMaker application.

Parameters
  • name – Name of the deployed application.

  • config

    Configuration parameters. The supported parameters are:

    • assume_role_arn: The name of an IAM role to be assumed to delete the SageMaker deployment.

    • region_name: Name of the AWS region in which the application is deployed. Defaults to us-west-2 or the region provided in the target_uri.

    • archive: If True, resources associated with the specified application, such as its associated models and endpoint configuration, are preserved. If False, these resources are deleted. In order to use archive=False, delete_deployment() must be executed synchronously with synchronous=True. Defaults to False.

    • synchronous: If True, this function blocks until the deletion process succeeds or encounters an irrecoverable failure. If False, this function returns immediately after starting the deletion process. It will not wait for the deletion process to complete; in this case, the caller is responsible for monitoring the status of the deletion process via native SageMaker APIs or the AWS console. Defaults to True.

    • timeout_seconds: If synchronous is True, the deletion process returns after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the status of the deletion process via native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored. Defaults to 300.

  • endpoint – (optional) Endpoint containing the deployment to delete. Currently unsupported.

Python example
from mlflow.deployments import get_deploy_client

config = dict(
    assume_role_arn="arn:aws:123:role/assumed_role",
    region_name="us-east-1",
    archive=False,
    synchronous=True,
    timeout_seconds=300,
)
client = get_deploy_client("sagemaker")
client.delete_deployment("my-deployment", config=config)
Command-line example
mlflow deployments delete --target sagemaker \
        --name my-deployment \
        -C assume_role_arn=arn:aws:123:role/assumed_role \
        -C region_name=us-east-1 \
        -C archive=False \
        -C synchronous=True \
        -C timeout_seconds=300
delete_endpoint(endpoint)[source]

Delete the endpoint from the specified target. Deletion should be idempotent (i.e. deletion should not fail if retried on a non-existent endpoint).

Parameters

endpoint – Name of endpoint to delete

explain(deployment_name=None, df=None, endpoint=None)[source]

This function has not been implemented and will be coming in the future.

get_deployment(name, endpoint=None)[source]

Returns a dictionary describing the specified deployment.

If a region name needs to be specified, the plugin must be initialized with the AWS region in the target_uri such as sagemaker:/us-east-1.

To assume an IAM role, the plugin must be initialized with the AWS region and the role ARN in the target_uri such as sagemaker:/us-east-1/arn:aws:1234:role/assumed_role.

A mlflow.exceptions.MlflowException is raised if an error occurs while retrieving the deployment.

Parameters
  • name – Name of deployment to retrieve

  • endpoint – (optional) Endpoint containing the deployment to get. Currently unsupported.

Returns

A dictionary that describes the specified deployment

Python example
from mlflow.deployments import get_deploy_client

client = get_deploy_client("sagemaker:/us-east-1/arn:aws:123:role/assumed_role")
client.get_deployment("my-deployment")
Command-line example
mlflow deployments get --target sagemaker:/us-east-1/arn:aws:1234:role/assumed_role \
    --name my-deployment
get_endpoint(endpoint)[source]

Returns a dictionary describing the specified endpoint, raising a mlflow.exceptions.MlflowException if no endpoint exists with the provided name. The dict is guaranteed to contain a ‘name’ key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.

Parameters

endpoint – Name of endpoint to fetch

list_deployments(endpoint=None)[source]

List deployments. This method returns a list of dictionaries that describes each deployment.

If a region name needs to be specified, the plugin must be initialized with the AWS region in the target_uri such as sagemaker:/us-east-1.

To assume an IAM role, the plugin must be initialized with the AWS region and the role ARN in the target_uri such as sagemaker:/us-east-1/arn:aws:1234:role/assumed_role.

Parameters

endpoint – (optional) List deployments in the specified endpoint. Currently unsupported.

Returns

A list of dictionaries corresponding to deployments.

Python example
from mlflow.deployments import get_deploy_client

client = get_deploy_client("sagemaker:/us-east-1/arn:aws:123:role/assumed_role")
client.list_deployments()
Command-line example
mlflow deployments list --target sagemaker:/us-east-1/arn:aws:1234:role/assumed_role
list_endpoints()[source]

List endpoints in the specified target. This method is expected to return an unpaginated list of all endpoints (an alternative would be to return a dict with an ‘endpoints’ field containing the actual endpoints, with plugins able to specify other fields, e.g. a next_page_token field, in the returned dictionary for pagination, and to accept a pagination_args argument to this method for passing pagination-related args).

Returns

A list of dicts corresponding to endpoints. Each dict is guaranteed to contain a ‘name’ key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.

predict(deployment_name=None, inputs=None, endpoint=None, params: Optional[Dict[str, Any]] = None)[source]

Compute predictions from the specified deployment using the provided PyFunc input.

The input/output types of this method match the MLflow PyFunc prediction interface.

If a region name needs to be specified, the plugin must be initialized with the AWS region in the target_uri such as sagemaker:/us-east-1.

To assume an IAM role, the plugin must be initialized with the AWS region and the role ARN in the target_uri such as sagemaker:/us-east-1/arn:aws:1234:role/assumed_role.

Parameters
  • deployment_name – Name of the deployment to predict against.

  • inputs – Input data (or arguments) to pass to the deployment or model endpoint for inference. For a complete list of supported input types, see Inference API.

  • endpoint – Endpoint to predict against. Currently unsupported.

  • params – Optional parameters to invoke the endpoint with.

Returns

A PyFunc output, such as a Pandas DataFrame, Pandas Series, or NumPy array. For a complete list of supported output types, see Inference API.

Python example
import pandas as pd
from mlflow.deployments import get_deploy_client

df = pd.DataFrame(data=[[1, 2, 3]], columns=["feat1", "feat2", "feat3"])
client = get_deploy_client("sagemaker:/us-east-1/arn:aws:123:role/assumed_role")
client.predict("my-deployment", df)
Command-line example
cat > ./input.json <<- input
{"feat1": {"0": 1}, "feat2": {"0": 2}, "feat3": {"0": 3}}
input

mlflow deployments predict \
    --target sagemaker:/us-east-1/arn:aws:1234:role/assumed_role \
    --name my-deployment \
    --input-path ./input.json
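
The input.json file in the command-line example uses a column-oriented layout, {column: {row_index: value}}, which is the same layout pandas produces by default from DataFrame.to_json(). As a minimal sketch, the file content can also be built from plain row dictionaries with the standard library; rows_to_column_json is a hypothetical helper, not part of MLflow.

```python
import json
from typing import Any, Dict, List


def rows_to_column_json(rows: List[Dict[str, Any]]) -> str:
    """Serialize row dicts as {column: {row_index: value}}."""
    columns: Dict[str, Dict[str, Any]] = {}
    for index, row in enumerate(rows):
        for column, value in row.items():
            # Row indices are string keys, as in the example file.
            columns.setdefault(column, {})[str(index)] = value
    return json.dumps(columns)


payload = rows_to_column_json([{"feat1": 1, "feat2": 2, "feat3": 3}])
print(payload)
# {"feat1": {"0": 1}, "feat2": {"0": 2}, "feat3": {"0": 3}}
```
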
update_deployment(name, model_uri, flavor=None, config=None, endpoint=None)[source]

Update a deployment on AWS SageMaker. This function can replace or add a new model to an existing SageMaker endpoint. By default, this function replaces the existing model with the new one. The currently active AWS account must have correct permissions set up.

Parameters
  • name – Name of the deployed application.

  • model_uri

    The location, in URI format, of the MLflow model to deploy to SageMaker. For example:

    • /Users/me/path/to/local/model

    • relative/path/to/local/model

    • s3://my_bucket/path/to/model

    • runs:/<mlflow_run_id>/run-relative/path/to/model

    • models:/<model_name>/<model_version>

    • models:/<model_name>/<stage>

    For more information about supported URI schemes, see Referencing Artifacts.

  • flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.

  • config

    Configuration parameters. The supported parameters are:

    • assume_role_arn: The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If this parameter is not specified, the role given in the target_uri will be used. If the role is not given in the target_uri, defaults to None.

    • execution_role_arn: The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently-assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.

    • bucket: S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.

    • image_url: URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.

    • region_name: Name of the AWS region to which to deploy the application. If unspecified, use the region name given in the target_uri. If it is also not specified in the target_uri, defaults to us-west-2.

    • mode: The mode in which to deploy the application. Must be one of the following:

      mlflow.sagemaker.DEPLOYMENT_MODE_REPLACE

      If an application of the specified name exists, its model(s) is replaced with the specified model. If no such application exists, it is created with the specified name and model. This is the default mode.

      mlflow.sagemaker.DEPLOYMENT_MODE_ADD

      Add the specified model to a pre-existing application with the specified name, if one exists. If the application does not exist, a new application is created with the specified name and model. NOTE: If the application already exists, the specified model is added to the application’s corresponding SageMaker endpoint with an initial weight of zero (0). To route traffic to the model, update the application’s associated endpoint configuration using either the AWS console or the UpdateEndpointWeightsAndCapacities function defined in https://docs.aws.amazon.com/sagemaker/latest/dg/API_UpdateEndpointWeightsAndCapacities.html.

    • archive: If True, any pre-existing SageMaker application resources that become inactive (i.e. as a result of deploying in mlflow.sagemaker.DEPLOYMENT_MODE_REPLACE mode) are preserved. These resources may include unused SageMaker models and endpoint configurations that were associated with a prior version of the application endpoint. If False, these resources are deleted. In order to use archive=False, update_deployment() must be executed synchronously with synchronous=True. Defaults to False.

    • instance_type: The type of SageMaker ML instance on which to deploy the model. For a list of supported instance types, see https://aws.amazon.com/sagemaker/pricing/instance-types/. Defaults to ml.m4.xlarge.

    • instance_count: The number of SageMaker ML instances on which to deploy the model. Defaults to 1.

    • synchronous: If True, this function will block until the deployment process succeeds or encounters an irrecoverable failure. If False, this function will return immediately after starting the deployment process. It will not wait for the deployment process to complete; in this case, the caller is responsible for monitoring the health and status of the pending deployment via native SageMaker APIs or the AWS console. Defaults to True.

    • timeout_seconds: If synchronous is True, the deployment process will return after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the health and status of the pending deployment using native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored. Defaults to 300.

    • variant_name: A string specifying the desired name when creating a production variant. Defaults to None.

    • vpc_config: A dictionary specifying the VPC configuration to use when creating the new SageMaker model associated with this application. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html. Defaults to None.

    • data_capture_config: A dictionary specifying the data capture configuration to use when creating the new SageMaker model associated with this application. For more information, see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html. Defaults to None.

    • async_inference_config: A dictionary specifying the asynchronous inference configuration. Defaults to None.

    • env: A dictionary specifying environment variables as key-value pairs to be set for the deployed model. Defaults to None.

    • tags: A dictionary of key-value pairs representing additional tags to be set for the deployed model. Defaults to None.

  • endpoint – (optional) Endpoint containing the deployment to update. Currently unsupported.
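
A model added in mlflow.sagemaker.DEPLOYMENT_MODE_ADD mode starts with a weight of zero, so traffic has to be routed to it explicitly. The sketch below builds the request for the SageMaker UpdateEndpointWeightsAndCapacities call as exposed by the boto3 SageMaker client. The helper build_weight_update and the variant names are hypothetical; the real variant names must be read from the endpoint configuration.

```python
from typing import Any, Dict


def build_weight_update(endpoint_name: str, weights: Dict[str, float]) -> Dict[str, Any]:
    """Build keyword arguments for boto3's update_endpoint_weights_and_capacities."""
    if not weights:
        raise ValueError("At least one variant weight is required")
    if any(weight < 0 for weight in weights.values()):
        raise ValueError("Variant weights must be non-negative")
    return {
        "EndpointName": endpoint_name,
        "DesiredWeightsAndCapacities": [
            {"VariantName": name, "DesiredWeight": float(weight)}
            for name, weight in weights.items()
        ],
    }


# Placeholder variant names; send 90% of traffic to the first, 10% to the second.
request = build_weight_update(
    "my-deployment", {"prod-variant-1": 0.9, "prod-variant-2": 0.1}
)

# To apply the request (requires boto3 and AWS credentials):
#   import boto3
#   sm = boto3.client("sagemaker", region_name="us-east-1")
#   sm.update_endpoint_weights_and_capacities(**request)
```
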

Python example
from mlflow.deployments import get_deploy_client

vpc_config = {
    "SecurityGroupIds": [
        "sg-123456abc",
    ],
    "Subnets": [
        "subnet-123456abc",
    ],
}
data_capture_config = {
    "EnableCapture": True,
    "InitialSamplingPercentage": 100,
    "DestinationS3Uri": "s3://my-bucket/path",
    "CaptureOptions": [{"CaptureMode": "Output"}],
}
config = dict(
    assume_role_arn="arn:aws:123:role/assumed_role",
    execution_role_arn="arn:aws:456:role/execution_role",
    bucket="my-s3-bucket",
    image_url="1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1",
    region_name="us-east-1",
    mode="replace",
    archive=False,
    instance_type="ml.m5.4xlarge",
    instance_count=1,
    synchronous=True,
    timeout_seconds=300,
    variant_name="prod-variant-1",
    vpc_config=vpc_config,
    data_capture_config=data_capture_config,
    env={"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": '"--timeout 60"'},
    tags={"training_timestamp": "2022-11-01T05:12:26"},
)
client = get_deploy_client("sagemaker")
client.update_deployment(
    "my-deployment",
    model_uri="/mlruns/0/abc/model",
    flavor="python_function",
    config=config,
)
Command-line example
mlflow deployments update --target sagemaker:/us-east-1/arn:aws:123:role/assumed_role \
        --name my-deployment \
        --model-uri /mlruns/0/abc/model \
        --flavor python_function \
        -C execution_role_arn=arn:aws:456:role/execution_role \
        -C bucket=my-s3-bucket \
        -C image_url=1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1 \
        -C region_name=us-east-1 \
        -C mode=replace \
        -C archive=False \
        -C instance_type=ml.m5.4xlarge \
        -C instance_count=1 \
        -C synchronous=True \
        -C timeout_seconds=300 \
        -C variant_name=prod-variant-1 \
        -C vpc_config='{"SecurityGroupIds": ["sg-123456abc"], "Subnets": ["subnet-123456abc"]}' \
        -C data_capture_config='{"EnableCapture": true, "InitialSamplingPercentage": 100, "DestinationS3Uri": "s3://my-bucket/path", "CaptureOptions": [{"CaptureMode": "Output"}]}' \
        -C env='{"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": "\"--timeout 60\""}' \
        -C tags='{"training_timestamp": "2022-11-01T05:12:26"}'
update_endpoint(endpoint, config=None)[source]

Update the endpoint with the specified name. You can update any target-specific attributes of the endpoint (via config). By default, this method should block until the update completes (i.e. until it’s possible to create a deployment within the endpoint). See target-specific plugin documentation for additional detail on support for asynchronous update and other configuration.

Parameters
  • endpoint – Unique name of endpoint to update

  • config – (optional) dict containing target-specific configuration for the endpoint

mlflow.sagemaker.deploy_transform_job(job_name, model_uri, s3_input_data_type, s3_input_uri, content_type, s3_output_path, compression_type='None', split_type='Line', accept='text/csv', assemble_with='Line', input_filter='$', output_filter='$', join_resource='None', execution_role_arn=None, assume_role_arn=None, bucket=None, image_url=None, region_name='us-west-2', instance_type='ml.m4.xlarge', instance_count=1, vpc_config=None, flavor=None, archive=False, synchronous=True, timeout_seconds=1200)[source]

Deploy an MLflow model on AWS SageMaker and create the corresponding batch transform job. The currently active AWS account must have correct permissions set up.

Parameters
  • job_name – Name of the deployed SageMaker batch transform job.

  • model_uri

    The location, in URI format, of the MLflow model to deploy to SageMaker. For example:

    • /Users/me/path/to/local/model

    • relative/path/to/local/model

    • s3://my_bucket/path/to/model

    • runs:/<mlflow_run_id>/run-relative/path/to/model

    • models:/<model_name>/<model_version>

    • models:/<model_name>/<stage>

    For more information about supported URI schemes, see Referencing Artifacts.

  • s3_input_data_type – Input data type for the transform job.

  • s3_input_uri – S3 key name prefix or a manifest of the input data.

  • content_type – The multipurpose internet mail extension (MIME) type of the data.

  • s3_output_path – The S3 path to store the output results of the SageMaker transform job.

  • compression_type – The compression type of the transform data.

  • split_type – The method to split the transform job’s data files into smaller batches.

  • accept – The multipurpose internet mail extension (MIME) type of the output data.

  • assemble_with – The method to assemble the results of the transform job as a single S3 object.

  • input_filter – A JSONPath expression used to select a portion of the input data for the transform job.

  • output_filter – A JSONPath expression used to select a portion of the output data from the transform job.

  • join_resource – The source of the data to join with the transformed data.

  • execution_role_arn

    The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently-assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.

  • assume_role_arn – The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If unspecified, SageMaker will be deployed to the currently active AWS account.

  • bucket – S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.

  • image_url – URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.

  • region_name – Name of the AWS region to which to deploy the application.

  • instance_type – The type of SageMaker ML instance on which to deploy the model. For a list of supported instance types, see https://aws.amazon.com/sagemaker/pricing/instance-types/.

  • instance_count – The number of SageMaker ML instances on which to deploy the model.

  • vpc_config

    A dictionary specifying the VPC configuration to use when creating the new SageMaker model associated with this batch transform job. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html.

    Example
    import mlflow.sagemaker as mfs
    
    vpc_config = {
        "SecurityGroupIds": [
            "sg-123456abc",
        ],
        "Subnets": [
            "subnet-123456abc",
        ],
    }
    mfs.deploy_transform_job(..., vpc_config=vpc_config)
    

  • flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.

  • archive – If True, resources like SageMaker models and model artifacts in S3 are preserved after the batch transform job finishes. If False, these resources are deleted. In order to use archive=False, deploy_transform_job() must be executed synchronously with synchronous=True.

  • synchronous – If True, this function will block until the deployment process succeeds or encounters an irrecoverable failure. If False, this function will return immediately after starting the deployment process. It will not wait for the deployment process to complete; in this case, the caller is responsible for monitoring the health and status of the pending deployment via native SageMaker APIs or the AWS console.

  • timeout_seconds – If synchronous is True, the deployment process will return after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the health and status of the pending deployment using native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored.
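The synchronous and timeout_seconds descriptions above leave status monitoring to the caller once the function has returned. The following is a minimal polling sketch, not part of mlflow.sagemaker: the helper name wait_for_transform_job and its poll_seconds and sleep parameters are hypothetical, and it assumes a describe callable shaped like the boto3 SageMaker client's describe_transform_job method, which reports the job state under the TransformJobStatus key.

```python
import time

# Terminal values of TransformJobStatus in the DescribeTransformJob response.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_transform_job(describe, job_name, timeout_seconds=1200,
                           poll_seconds=30, sleep=time.sleep):
    """Poll a batch transform job until it reaches a terminal status.

    `describe` is any callable accepting TransformJobName=<name> and returning
    a dict with a "TransformJobStatus" key, for example
    boto3.client("sagemaker").describe_transform_job (assumed, not called here).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = describe(TransformJobName=job_name)["TransformJobStatus"]
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job {job_name!r} still {status!r} after {timeout_seconds}s"
            )
        # Wait before asking SageMaker for the status again.
        sleep(poll_seconds)
```

With boto3 installed and AWS credentials configured, the describe argument could be boto3.client("sagemaker", region_name="us-west-2").describe_transform_job. Passing the callable in, rather than creating the client inside the helper, keeps the sketch usable with a stub during testing.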

mlflow.sagemaker.push_image_to_ecr(image='mlflow-pyfunc')[source]

Push a local Docker image to AWS ECR.

The image is pushed under the currently active AWS account and to the currently active AWS region.

Parameters

image – Docker image name.

mlflow.sagemaker.push_model_to_sagemaker(model_name, model_uri, execution_role_arn=None, assume_role_arn=None, bucket=None, image_url=None, region_name='us-west-2', vpc_config=None, flavor=None)[source]

Create a SageMaker Model from an MLflow model artifact. The currently active AWS account must have correct permissions set up.

Parameters
  • model_name – Name of the SageMaker model.

  • model_uri

    The location, in URI format, of the MLflow model to deploy to SageMaker. For example:

    • /Users/me/path/to/local/model

    • relative/path/to/local/model

    • s3://my_bucket/path/to/model

    • runs:/<mlflow_run_id>/run-relative/path/to/model

    • models:/<model_name>/<model_version>

    • models:/<model_name>/<stage>

    For more information about supported URI schemes, see Referencing Artifacts.

  • execution_role_arn

    The ARN of an IAM role granting the SageMaker service permissions to access the specified Docker image and the S3 bucket containing MLflow model artifacts. If unspecified, the currently assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.

  • assume_role_arn – The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If unspecified, SageMaker will be deployed to the currently active AWS account.

  • bucket – S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.

  • image_url – URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.

  • region_name – Name of the AWS region to which to deploy the application.

  • vpc_config

    A dictionary specifying the VPC configuration to use when creating the new SageMaker model. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html.

    Example
    import mlflow.sagemaker as mfs
    
    vpc_config = {
        "SecurityGroupIds": [
            "sg-123456abc",
        ],
        "Subnets": [
            "subnet-123456abc",
        ],
    }
    mfs.push_model_to_sagemaker(..., vpc_config=vpc_config)
    

  • flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.
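The image_url description above names two sources for the image URL: the argument itself and the MLFLOW_SAGEMAKER_DEPLOY_IMG_URL environment variable. The sketch below shows one way a caller might resolve the value before calling push_model_to_sagemaker. The helper name resolve_image_url is hypothetical, and the precedence it applies (explicit argument first, then the environment variable) is an assumption of this sketch rather than documented library behavior.

```python
import os

IMAGE_URL_ENV_VAR = "MLFLOW_SAGEMAKER_DEPLOY_IMG_URL"


def resolve_image_url(image_url=None, environ=os.environ):
    """Return the image URL to pass on, or None if neither source is set.

    Assumed precedence: an explicit argument wins over the environment
    variable. Returning None leaves any further defaulting to the library.
    """
    if image_url:
        return image_url
    # An unset or empty variable is treated as "not provided".
    return environ.get(IMAGE_URL_ENV_VAR) or None
```

The result could then be passed along as mfs.push_model_to_sagemaker(..., image_url=resolve_image_url()).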

mlflow.sagemaker.run_local(name, model_uri, flavor=None, config=None)[source]

Serve the model locally in a SageMaker-compatible Docker container.

Note that models deployed locally cannot be managed by other deployment APIs (for example, update_deployment or delete_deployment).

Parameters
  • name – Name of the local serving application.

  • model_uri

    The location, in URI format, of the MLflow model to deploy locally. For example:

    • /Users/me/path/to/local/model

    • relative/path/to/local/model

    • s3://my_bucket/path/to/model

    • runs:/<mlflow_run_id>/run-relative/path/to/model

    • models:/<model_name>/<model_version>

    • models:/<model_name>/<stage>

    For more information about supported URI schemes, see Referencing Artifacts.

  • flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.

  • config

    Configuration parameters. The supported parameters are:

    • image: The name of the Docker image to use for model serving. Defaults to "mlflow-pyfunc".

    • port: The port at which to expose the model server on the local host. Defaults to 5000.
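The config keys and defaults listed above can be collected in a small helper so that a call site only spells out what it changes. This is an illustrative sketch: build_run_local_config and DEFAULT_RUN_LOCAL_CONFIG are hypothetical names, not part of MLflow, and rejecting unknown keys is a choice made here, not something the library is documented to do.

```python
# Defaults as documented for the run_local config parameter.
DEFAULT_RUN_LOCAL_CONFIG = {"image": "mlflow-pyfunc", "port": 5000}


def build_run_local_config(**overrides):
    """Merge caller overrides into the documented run_local defaults."""
    unknown = set(overrides) - set(DEFAULT_RUN_LOCAL_CONFIG)
    if unknown:
        # Fail early on keys that the documentation does not list.
        raise ValueError(f"Unsupported config keys: {sorted(unknown)}")
    config = dict(DEFAULT_RUN_LOCAL_CONFIG)
    config.update(overrides)
    return config
```

For instance, build_run_local_config(port=8080) yields a dictionary that keeps the default image and changes only the port; the result can be passed as the config argument.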

Python example
from mlflow.models import build_docker
from mlflow.deployments import get_deploy_client

build_docker(name="mlflow-pyfunc")

client = get_deploy_client("sagemaker")
client.run_local(
    name="my-local-deployment",
    model_uri="/mlruns/0/abc/model",
    flavor="python_function",
    config={
        "port": 5000,
        "image": "mlflow-pyfunc",
    },
)

Command-line example
mlflow models build-docker --name "mlflow-pyfunc"
mlflow deployments run-local --target sagemaker \
        --name my-local-deployment \
        --model-uri "/mlruns/0/abc/model" \
        --flavor python_function \
        -C port=5000 \
        -C image="mlflow-pyfunc"

mlflow.sagemaker.target_help()[source]

Provide help information for the SageMaker deployment client.

mlflow.sagemaker.terminate_transform_job(job_name, region_name='us-west-2', assume_role_arn=None, archive=False, synchronous=True, timeout_seconds=300)[source]

Terminate a SageMaker batch transform job.

Parameters
  • job_name – Name of the deployed SageMaker batch transform job.

  • region_name – Name of the AWS region in which the batch transform job is deployed.

  • assume_role_arn – The name of an IAM cross-account role to be assumed in order to terminate a batch transform job in another AWS account. If unspecified, the currently active AWS account is used.

  • archive – If True, resources associated with the specified batch transform job, such as its associated models and model artifacts, are preserved. If False, these resources are deleted. In order to use archive=False, terminate_transform_job() must be executed synchronously with synchronous=True.

  • synchronous – If True, this function blocks until the termination process succeeds or encounters an irrecoverable failure. If False, this function returns immediately after starting the termination process. It will not wait for the termination process to complete; in this case, the caller is responsible for monitoring the status of the termination process via native SageMaker APIs or the AWS console.

  • timeout_seconds – If synchronous is True, the termination process returns after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the status of the termination process via native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored.
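The archive description above ties archive=False to synchronous=True. A caller can check that combination before making any AWS call, as in the sketch below. The helper name build_terminate_kwargs is hypothetical; its defaults mirror the terminate_transform_job signature shown above, and the library may well perform its own validation in addition to this.

```python
def build_terminate_kwargs(job_name, region_name="us-west-2", archive=False,
                           synchronous=True, timeout_seconds=300):
    """Assemble keyword arguments for terminate_transform_job.

    Raises ValueError for the documented invalid combination: deleting
    resources (archive=False) without waiting (synchronous=False).
    """
    if not archive and not synchronous:
        raise ValueError("archive=False requires synchronous=True")
    return {
        "job_name": job_name,
        "region_name": region_name,
        "archive": archive,
        "synchronous": synchronous,
        "timeout_seconds": timeout_seconds,
    }


# Usage (requires mlflow and AWS credentials, so it is left commented out):
# import mlflow.sagemaker as mfs
# mfs.terminate_transform_job(**build_terminate_kwargs("my-transform-job"))
```

Here "my-transform-job" is a placeholder job name.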