mlflow.sagemaker
The mlflow.sagemaker
module provides an API for deploying MLflow models to Amazon SageMaker.
-
class
mlflow.sagemaker.
SageMakerDeploymentClient
(target_uri)[source] Bases:
mlflow.deployments.base.BaseDeploymentClient
Initialize a deployment client for SageMaker. The default region and assumed role ARN will be set according to the value of the target_uri.
This class is meant to supersede the other mlflow.sagemaker real-time serving APIs. It is also designed to be used through the mlflow.deployments module. This means that you can deploy to SageMaker using the mlflow deployments CLI and get a client through the mlflow.deployments.get_deploy_client function.
- Parameters
target_uri –
A URI that follows one of the following formats:
sagemaker: This will set the default region to us-west-2 and the default assumed role ARN to None.
sagemaker:/region_name: This will set the default region to region_name and the default assumed role ARN to None.
sagemaker:/region_name/assumed_role_arn: This will set the default region to region_name and the default assumed role ARN to assumed_role_arn.
When an assumed_role_arn is provided without a region_name, an MlflowException will be raised.
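The three target_uri formats above can be illustrated with a small parser. This is a minimal sketch and not MLflow's actual implementation: the helper name parse_target_uri, the SageMakerTarget tuple, the use of ValueError in place of MlflowException, and the rule that a first path segment starting with "arn:" means the region is missing are all assumptions made for illustration.

```python
from typing import NamedTuple, Optional

DEFAULT_REGION = "us-west-2"  # default region stated in the documentation above


class SageMakerTarget(NamedTuple):
    """Hypothetical container for the values derived from a target_uri."""

    region_name: str
    assumed_role_arn: Optional[str]


def parse_target_uri(target_uri: str) -> SageMakerTarget:
    """Split a SageMaker target_uri into a region and an optional role ARN."""
    scheme, separator, rest = target_uri.partition(":/")
    if scheme != "sagemaker":
        raise ValueError(f"Not a SageMaker target URI: {target_uri!r}")
    if not separator:
        # Bare "sagemaker": default region, no assumed role.
        return SageMakerTarget(DEFAULT_REGION, None)
    # Split on the first "/" only, because role ARNs contain "/" themselves.
    region, _, role = rest.partition("/")
    if region.startswith("arn:"):
        # Assumption: an ARN in the region position means the region was omitted.
        raise ValueError("An assumed role ARN requires a region name")
    return SageMakerTarget(region or DEFAULT_REGION, role or None)
```

For example, parse_target_uri("sagemaker:/us-east-1/arn:aws:123:role/assumed_role") yields the region us-east-1 and the full ARN as the role, because only the first slash after the region is treated as a separator.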
-
create_deployment
(name, model_uri, flavor=None, config=None, endpoint=None)[source] Deploy an MLflow model on AWS SageMaker. The currently active AWS account must have correct permissions set up.
This function creates a SageMaker endpoint. For more information about the input data formats accepted by this endpoint, see the MLflow deployment tools documentation.
- Parameters
name – Name of the deployed application.
model_uri –
The location, in URI format, of the MLflow model to deploy to SageMaker. For example:
/Users/me/path/to/local/model
relative/path/to/local/model
s3://my_bucket/path/to/model
runs:/<mlflow_run_id>/run-relative/path/to/model
models:/<model_name>/<model_version>
models:/<model_name>/<stage>
For more information about supported URI schemes, see Referencing Artifacts.
flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.
config –
Configuration parameters. The supported parameters are:
assume_role_arn: The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If this parameter is not specified, the role given in the target_uri will be used. If the role is not given in the target_uri, defaults to None.
execution_role_arn: The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently-assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.
bucket: S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.
image_url: URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.
region_name: Name of the AWS region to which to deploy the application. If unspecified, the region name given in the target_uri is used. If it is also not specified in the target_uri, defaults to us-west-2.
archive: If True, any pre-existing SageMaker application resources that become inactive (i.e. as a result of deploying in mlflow.sagemaker.DEPLOYMENT_MODE_REPLACE mode) are preserved. These resources may include unused SageMaker models and endpoint configurations that were associated with a prior version of the application endpoint. If False, these resources are deleted. In order to use archive=False, create_deployment() must be executed synchronously with synchronous=True. Defaults to False.
instance_type: The type of SageMaker ML instance on which to deploy the model. For a list of supported instance types, see https://aws.amazon.com/sagemaker/pricing/instance-types/. Defaults to ml.m4.xlarge.
instance_count: The number of SageMaker ML instances on which to deploy the model. Defaults to 1.
synchronous: If True, this function will block until the deployment process succeeds or encounters an irrecoverable failure. If False, this function will return immediately after starting the deployment process. It will not wait for the deployment process to complete; in this case, the caller is responsible for monitoring the health and status of the pending deployment via native SageMaker APIs or the AWS console. Defaults to True.
timeout_seconds: If synchronous is True, the deployment process will return after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the health and status of the pending deployment using native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored. Defaults to 300.
vpc_config: A dictionary specifying the VPC configuration to use when creating the new SageMaker model associated with this application. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html. Defaults to None.
data_capture_config: A dictionary specifying the data capture configuration to use when creating the new SageMaker model associated with this application. For more information, see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html. Defaults to None.
variant_name: A string specifying the desired name when creating a production variant. Defaults to None.
async_inference_config: A dictionary specifying the asynchronous inference configuration.
serverless_config: A dictionary specifying the serverless configuration.
env: A dictionary specifying environment variables as key-value pairs to be set for the deployed model. Defaults to None.
tags: A dictionary of key-value pairs representing additional tags to be set for the deployed model. Defaults to None.
endpoint – (optional) Endpoint to create the deployment under. Currently unsupported.
from mlflow.deployments import get_deploy_client

vpc_config = {
    "SecurityGroupIds": [
        "sg-123456abc",
    ],
    "Subnets": [
        "subnet-123456abc",
    ],
}
config = dict(
    assume_role_arn="arn:aws:123:role/assumed_role",
    execution_role_arn="arn:aws:456:role/execution_role",
    bucket_name="my-s3-bucket",
    image_url="1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1",
    region_name="us-east-1",
    archive=False,
    instance_type="ml.m5.4xlarge",
    instance_count=1,
    synchronous=True,
    timeout_seconds=300,
    vpc_config=vpc_config,
    variant_name="prod-variant-1",
    env={"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": '"--timeout 60"'},
    tags={"training_timestamp": "2022-11-01T05:12:26"},
)
client = get_deploy_client("sagemaker")
client.create_deployment(
    "my-deployment",
    model_uri="/mlruns/0/abc/model",
    flavor="python_function",
    config=config,
)
mlflow deployments create --target sagemaker:/us-east-1/arn:aws:123:role/assumed_role \
    --name my-deployment \
    --model-uri /mlruns/0/abc/model \
    --flavor python_function \
    -C execution_role_arn=arn:aws:456:role/execution_role \
    -C bucket_name=my-s3-bucket \
    -C image_url=1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1 \
    -C region_name=us-east-1 \
    -C archive=False \
    -C instance_type=ml.m5.4xlarge \
    -C instance_count=1 \
    -C synchronous=True \
    -C timeout_seconds=300 \
    -C variant_name=prod-variant-1 \
    -C vpc_config='{"SecurityGroupIds": ["sg-123456abc"], "Subnets": ["subnet-123456abc"]}' \
    -C data_capture_config='{"EnableCapture": true, "InitialSamplingPercentage": 100, "DestinationS3Uri": "s3://my-bucket/path", "CaptureOptions": [{"CaptureMode": "Output"}]}' \
    -C env='{"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": "\"--timeout 60\""}' \
    -C tags='{"training_timestamp": "2022-11-01T05:12:26"}'
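The fallback order for region_name and assume_role_arn (config first, then target_uri, then the documented default) and the archive/synchronous constraint can be sketched as follows. This is an illustrative sketch only: the function resolve_deployment_config and its arguments uri_region and uri_role are hypothetical names, the listed defaults are copied from the parameter descriptions above, and ValueError stands in for whatever exception MLflow actually raises.

```python
from typing import Any, Dict, Optional


def resolve_deployment_config(
    config: Optional[Dict[str, Any]] = None,
    uri_region: Optional[str] = None,  # region taken from the target_uri, if any
    uri_role: Optional[str] = None,  # role ARN taken from the target_uri, if any
) -> Dict[str, Any]:
    """Merge user config over target_uri values over documented defaults."""
    resolved: Dict[str, Any] = {
        "region_name": uri_region or "us-west-2",
        "assume_role_arn": uri_role,
        "archive": False,
        "synchronous": True,
        "timeout_seconds": 300,
        "instance_type": "ml.m4.xlarge",
        "instance_count": 1,
    }
    # Explicit config values take precedence; None means "not specified".
    for key, value in (config or {}).items():
        if value is not None:
            resolved[key] = value
    # Deleting inactive resources requires waiting for the deployment to finish.
    if not resolved["archive"] and not resolved["synchronous"]:
        raise ValueError("archive=False requires synchronous=True")
    return resolved
```

With this sketch, an asynchronous deployment has to pass archive=True explicitly, which mirrors the rule that archive=False is only usable together with synchronous=True.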
-
create_endpoint
(name, config=None)[source] Create an endpoint with the specified target. By default, this method should block until creation completes (i.e. until it’s possible to create a deployment within the endpoint). In the case of conflicts (e.g. if it’s not possible to create the specified endpoint due to conflict with an existing endpoint), raises a mlflow.exceptions.MlflowException. See target-specific plugin documentation for additional detail on support for asynchronous creation and other configuration.
- Parameters
name – Unique name to use for endpoint. If another endpoint exists with the same name, raises a mlflow.exceptions.MlflowException.
config – (optional) Dict containing target-specific configuration for the endpoint.
- Returns
Dict corresponding to created endpoint, which must contain the ‘name’ key.
-
delete_deployment
(name, config=None, endpoint=None)[source] Delete a SageMaker application.
- Parameters
name – Name of the deployed application.
config –
Configuration parameters. The supported parameters are:
assume_role_arn: The name of an IAM role to be assumed to delete the SageMaker deployment.
region_name: Name of the AWS region in which the application is deployed. Defaults to us-west-2 or the region provided in the target_uri.
archive: If True, resources associated with the specified application, such as its associated models and endpoint configuration, are preserved. If False, these resources are deleted. In order to use archive=False, delete_deployment() must be executed synchronously with synchronous=True. Defaults to False.
synchronous: If True, this function blocks until the deletion process succeeds or encounters an irrecoverable failure. If False, this function returns immediately after starting the deletion process. It will not wait for the deletion process to complete; in this case, the caller is responsible for monitoring the status of the deletion process via native SageMaker APIs or the AWS console. Defaults to True.
timeout_seconds: If synchronous is True, the deletion process returns after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the status of the deletion process via native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored. Defaults to 300.
endpoint – (optional) Endpoint containing the deployment to delete. Currently unsupported.
from mlflow.deployments import get_deploy_client

config = dict(
    assume_role_arn="arn:aws:123:role/assumed_role",
    region_name="us-east-1",
    archive=False,
    synchronous=True,
    timeout_seconds=300,
)
client = get_deploy_client("sagemaker")
client.delete_deployment("my-deployment", config=config)

mlflow deployments delete --target sagemaker \
    --name my-deployment \
    -C assume_role_arn=arn:aws:123:role/assumed_role \
    -C region_name=us-east-1 \
    -C archive=False \
    -C synchronous=True \
    -C timeout_seconds=300
-
delete_endpoint
(endpoint)[source] Delete the endpoint from the specified target. Deletion should be idempotent (i.e. deletion should not fail if retried on a non-existent deployment).
- Parameters
endpoint – Name of endpoint to delete
-
explain
(deployment_name=None, df=None, endpoint=None)[source] This function has not been implemented and will be coming in the future.
-
get_deployment
(name, endpoint=None)[source] Returns a dictionary describing the specified deployment.
If a region name needs to be specified, the plugin must be initialized with the AWS region in the target_uri, such as sagemaker:/us-east-1.
To assume an IAM role, the plugin must be initialized with the AWS region and the role ARN in the target_uri, such as sagemaker:/us-east-1/arn:aws:1234:role/assumed_role.
A mlflow.exceptions.MlflowException will be thrown when an error occurs while retrieving the deployment.
- Parameters
name – Name of deployment to retrieve.
endpoint – (optional) Endpoint containing the deployment to get. Currently unsupported.
- Returns
A dictionary that describes the specified deployment
from mlflow.deployments import get_deploy_client

client = get_deploy_client("sagemaker:/us-east-1/arn:aws:123:role/assumed_role")
client.get_deployment("my-deployment")

mlflow deployments get --target sagemaker:/us-east-1/arn:aws:1234:role/assumed_role \
    --name my-deployment
-
get_endpoint
(endpoint)[source] Returns a dictionary describing the specified endpoint, throwing a mlflow.exceptions.MlflowException if no endpoint exists with the provided name. The dict is guaranteed to contain a ‘name’ key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.
- Parameters
endpoint – Name of endpoint to fetch
-
list_deployments
(endpoint=None)[source] List deployments. This method returns a list of dictionaries that describe each deployment.
If a region name needs to be specified, the plugin must be initialized with the AWS region in the target_uri, such as sagemaker:/us-east-1.
To assume an IAM role, the plugin must be initialized with the AWS region and the role ARN in the target_uri, such as sagemaker:/us-east-1/arn:aws:1234:role/assumed_role.
- Parameters
endpoint – (optional) List deployments in the specified endpoint. Currently unsupported.
- Returns
A list of dictionaries corresponding to deployments.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("sagemaker:/us-east-1/arn:aws:123:role/assumed_role")
client.list_deployments()

mlflow deployments list --target sagemaker:/us-east-1/arn:aws:1234:role/assumed_role
-
list_endpoints
()[source] List endpoints in the specified target. This method is expected to return an unpaginated list of all endpoints (an alternative would be to return a dict with an ‘endpoints’ field containing the actual endpoints, with plugins able to specify other fields, e.g. a next_page_token field, in the returned dictionary for pagination, and to accept a pagination_args argument to this method for passing pagination-related args).
- Returns
A list of dicts corresponding to endpoints. Each dict is guaranteed to contain a ‘name’ key containing the endpoint name. The other fields of the returned dictionary and their types may vary across targets.
-
predict
(deployment_name=None, inputs=None, endpoint=None, params: Optional[Dict[str, Any]] = None)[source] Compute predictions from the specified deployment using the provided PyFunc input.
The input/output types of this method match the MLflow PyFunc prediction interface.
If a region name needs to be specified, the plugin must be initialized with the AWS region in the target_uri, such as sagemaker:/us-east-1.
To assume an IAM role, the plugin must be initialized with the AWS region and the role ARN in the target_uri, such as sagemaker:/us-east-1/arn:aws:1234:role/assumed_role.
- Parameters
deployment_name – Name of the deployment to predict against.
inputs – Input data (or arguments) to pass to the deployment or model endpoint for inference. For a complete list of supported input types, see Inference API.
endpoint – Endpoint to predict against. Currently unsupported.
params – Optional parameters to invoke the endpoint with.
- Returns
A PyFunc output, such as a Pandas DataFrame, Pandas Series, or NumPy array. For a complete list of supported output types, see Inference API.
import pandas as pd
from mlflow.deployments import get_deploy_client

df = pd.DataFrame(data=[[1, 2, 3]], columns=["feat1", "feat2", "feat3"])
client = get_deploy_client("sagemaker:/us-east-1/arn:aws:123:role/assumed_role")
client.predict("my-deployment", df)

cat > ./input.json <<- input
{"feat1": {"0": 1}, "feat2": {"0": 2}, "feat3": {"0": 3}}
input

mlflow deployments predict \
    --target sagemaker:/us-east-1/arn:aws:1234:role/assumed_role \
    --name my-deployment \
    --input-path ./input.json
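The input.json file in the CLI example holds the same single-row table as the DataFrame in the Python example, written column by column and keyed by row index. A minimal sketch of producing that layout with only the standard library is shown below; the helper name rows_to_column_json is hypothetical and not part of MLflow.

```python
import json
from typing import Any, Sequence


def rows_to_column_json(columns: Sequence[str], rows: Sequence[Sequence[Any]]) -> str:
    """Serialize tabular rows as {column: {row_index: value}} JSON."""
    payload = {
        column: {str(row_index): row[col_index] for row_index, row in enumerate(rows)}
        for col_index, column in enumerate(columns)
    }
    return json.dumps(payload)


# Reproduces the contents of input.json from the CLI example above.
print(rows_to_column_json(["feat1", "feat2", "feat3"], [[1, 2, 3]]))
```

When pandas is available, calling to_json() on the DataFrame with its default orientation should give an equivalent column-oriented document, so the helper is only useful where pandas is not installed.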
-
update_deployment
(name, model_uri, flavor=None, config=None, endpoint=None)[source] Update a deployment on AWS SageMaker. This function can replace or add a new model to an existing SageMaker endpoint. By default, this function replaces the existing model with the new one. The currently active AWS account must have correct permissions set up.
- Parameters
name – Name of the deployed application.
model_uri –
The location, in URI format, of the MLflow model to deploy to SageMaker. For example:
/Users/me/path/to/local/model
relative/path/to/local/model
s3://my_bucket/path/to/model
runs:/<mlflow_run_id>/run-relative/path/to/model
models:/<model_name>/<model_version>
models:/<model_name>/<stage>
For more information about supported URI schemes, see Referencing Artifacts.
flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.
config –
Configuration parameters. The supported parameters are:
assume_role_arn: The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If this parameter is not specified, the role given in the target_uri will be used. If the role is not given in the target_uri, defaults to None.
execution_role_arn: The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently-assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.
bucket: S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.
image_url: URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.
region_name: Name of the AWS region to which to deploy the application. If unspecified, the region name given in the target_uri is used. If it is also not specified in the target_uri, defaults to us-west-2.
mode: The mode in which to deploy the application. Must be one of the following:
mlflow.sagemaker.DEPLOYMENT_MODE_REPLACE
If an application of the specified name exists, its model(s) is replaced with the specified model. If no such application exists, it is created with the specified name and model. This is the default mode.
mlflow.sagemaker.DEPLOYMENT_MODE_ADD
Add the specified model to a pre-existing application with the specified name, if one exists. If the application does not exist, a new application is created with the specified name and model. NOTE: If the application already exists, the specified model is added to the application’s corresponding SageMaker endpoint with an initial weight of zero (0). To route traffic to the model, update the application’s associated endpoint configuration using either the AWS console or the UpdateEndpointWeightsAndCapacities function defined in https://docs.aws.amazon.com/sagemaker/latest/dg/API_UpdateEndpointWeightsAndCapacities.html.
archive: If True, any pre-existing SageMaker application resources that become inactive (i.e. as a result of deploying in mlflow.sagemaker.DEPLOYMENT_MODE_REPLACE mode) are preserved. These resources may include unused SageMaker models and endpoint configurations that were associated with a prior version of the application endpoint. If False, these resources are deleted. In order to use archive=False, update_deployment() must be executed synchronously with synchronous=True. Defaults to False.
instance_type: The type of SageMaker ML instance on which to deploy the model. For a list of supported instance types, see https://aws.amazon.com/sagemaker/pricing/instance-types/. Defaults to ml.m4.xlarge.
instance_count: The number of SageMaker ML instances on which to deploy the model. Defaults to 1.
synchronous: If True, this function will block until the deployment process succeeds or encounters an irrecoverable failure. If False, this function will return immediately after starting the deployment process. It will not wait for the deployment process to complete; in this case, the caller is responsible for monitoring the health and status of the pending deployment via native SageMaker APIs or the AWS console. Defaults to True.
timeout_seconds: If synchronous is True, the deployment process will return after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the health and status of the pending deployment using native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored. Defaults to 300.
variant_name: A string specifying the desired name when creating a production variant. Defaults to None.
vpc_config: A dictionary specifying the VPC configuration to use when creating the new SageMaker model associated with this application. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html. Defaults to None.
data_capture_config: A dictionary specifying the data capture configuration to use when creating the new SageMaker model associated with this application. For more information, see https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_DataCaptureConfig.html. Defaults to None.
async_inference_config: A dictionary specifying the asynchronous inference configuration. Defaults to None.
env: A dictionary specifying environment variables as key-value pairs to be set for the deployed model. Defaults to None.
tags: A dictionary of key-value pairs representing additional tags to be set for the deployed model. Defaults to None.
endpoint – (optional) Endpoint containing the deployment to update. Currently unsupported.
from mlflow.deployments import get_deploy_client

vpc_config = {
    "SecurityGroupIds": [
        "sg-123456abc",
    ],
    "Subnets": [
        "subnet-123456abc",
    ],
}
data_capture_config = {
    "EnableCapture": True,
    "InitialSamplingPercentage": 100,
    "DestinationS3Uri": "s3://my-bucket/path",
    "CaptureOptions": [{"CaptureMode": "Output"}],
}
config = dict(
    assume_role_arn="arn:aws:123:role/assumed_role",
    execution_role_arn="arn:aws:456:role/execution_role",
    bucket_name="my-s3-bucket",
    image_url="1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1",
    region_name="us-east-1",
    mode="replace",
    archive=False,
    instance_type="ml.m5.4xlarge",
    instance_count=1,
    synchronous=True,
    timeout_seconds=300,
    variant_name="prod-variant-1",
    vpc_config=vpc_config,
    data_capture_config=data_capture_config,
    env={"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": '"--timeout 60"'},
    tags={"training_timestamp": "2022-11-01T05:12:26"},
)
client = get_deploy_client("sagemaker")
client.update_deployment(
    "my-deployment",
    model_uri="/mlruns/0/abc/model",
    flavor="python_function",
    config=config,
)
mlflow deployments update --target sagemaker:/us-east-1/arn:aws:123:role/assumed_role \
    --name my-deployment \
    --model-uri /mlruns/0/abc/model \
    --flavor python_function \
    -C execution_role_arn=arn:aws:456:role/execution_role \
    -C bucket_name=my-s3-bucket \
    -C image_url=1234.dkr.ecr.us-east-1.amazonaws.com/mlflow-test:1.23.1 \
    -C region_name=us-east-1 \
    -C mode=replace \
    -C archive=False \
    -C instance_type=ml.m5.4xlarge \
    -C instance_count=1 \
    -C synchronous=True \
    -C timeout_seconds=300 \
    -C variant_name=prod-variant-1 \
    -C vpc_config='{"SecurityGroupIds": ["sg-123456abc"], "Subnets": ["subnet-123456abc"]}' \
    -C data_capture_config='{"EnableCapture": true, "InitialSamplingPercentage": 100, "DestinationS3Uri": "s3://my-bucket/path", "CaptureOptions": [{"CaptureMode": "Output"}]}' \
    -C env='{"DISABLE_NGINX": "true", "GUNICORN_CMD_ARGS": "\"--timeout 60\""}' \
    -C tags='{"training_timestamp": "2022-11-01T05:12:26"}'
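A model added in mlflow.sagemaker.DEPLOYMENT_MODE_ADD mode starts with a weight of zero, so traffic has to be routed to it separately. The sketch below shows one way this might be done with the boto3 SageMaker client's update_endpoint_weights_and_capacities call. The helper names, the variant names, and the assumption that the endpoint is named after the deployment are all illustrative, and route_traffic needs boto3 plus valid AWS credentials to run.

```python
from typing import Any, Dict, List


def build_desired_weights(weights: Dict[str, float]) -> List[Dict[str, Any]]:
    """Build the DesiredWeightsAndCapacities list for the SageMaker API."""
    for variant_name, weight in weights.items():
        if weight < 0:
            raise ValueError(f"Weight for {variant_name!r} must be non-negative")
    return [
        {"VariantName": variant_name, "DesiredWeight": float(weight)}
        for variant_name, weight in weights.items()
    ]


def route_traffic(endpoint_name: str, weights: Dict[str, float], region_name: str) -> Dict[str, Any]:
    """Apply new variant weights to an existing endpoint (calls AWS)."""
    import boto3  # imported lazily so the payload helper works without boto3

    client = boto3.client("sagemaker", region_name=region_name)
    return client.update_endpoint_weights_and_capacities(
        EndpointName=endpoint_name,
        DesiredWeightsAndCapacities=build_desired_weights(weights),
    )


# Hypothetical variant names: split traffic evenly between old and new models.
payload = build_desired_weights({"prod-variant-1": 1, "prod-variant-2": 1})
```

SageMaker distributes traffic in proportion to each variant's weight relative to the sum of all weights, so equal weights give an even split.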
-
update_endpoint
(endpoint, config=None)[source] Update the endpoint with the specified name. You can update any target-specific attributes of the endpoint (via config). By default, this method should block until the update completes (i.e. until it’s possible to create a deployment within the endpoint). See target-specific plugin documentation for additional detail on support for asynchronous update and other configuration.
- Parameters
endpoint – Unique name of endpoint to update
config – (optional) dict containing target-specific configuration for the endpoint
-
mlflow.sagemaker.
deploy_transform_job
(job_name, model_uri, s3_input_data_type, s3_input_uri, content_type, s3_output_path, compression_type='None', split_type='Line', accept='text/csv', assemble_with='Line', input_filter='$', output_filter='$', join_resource='None', execution_role_arn=None, assume_role_arn=None, bucket=None, image_url=None, region_name='us-west-2', instance_type='ml.m4.xlarge', instance_count=1, vpc_config=None, flavor=None, archive=False, synchronous=True, timeout_seconds=1200)[source] Deploy an MLflow model on AWS SageMaker and create the corresponding batch transform job. The currently active AWS account must have correct permissions set up.
- Parameters
job_name – Name of the deployed SageMaker batch transform job.
model_uri –
The location, in URI format, of the MLflow model to deploy to SageMaker. For example:
/Users/me/path/to/local/model
relative/path/to/local/model
s3://my_bucket/path/to/model
runs:/<mlflow_run_id>/run-relative/path/to/model
models:/<model_name>/<model_version>
models:/<model_name>/<stage>
For more information about supported URI schemes, see Referencing Artifacts.
s3_input_data_type – Input data type for the transform job.
s3_input_uri – S3 key name prefix or a manifest of the input data.
content_type – The multipurpose internet mail extension (MIME) type of the data.
s3_output_path – The S3 path to store the output results of the SageMaker transform job.
compression_type – The compression type of the transform data.
split_type – The method to split the transform job’s data files into smaller batches.
accept – The multipurpose internet mail extension (MIME) type of the output data.
assemble_with – The method to assemble the results of the transform job as a single S3 object.
input_filter – A JSONPath expression used to select a portion of the input data for the transform job.
output_filter – A JSONPath expression used to select a portion of the output data from the transform job.
join_resource – The source of the data to join with the transformed data.
execution_role_arn –
The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently-assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.
assume_role_arn – The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If unspecified, SageMaker will be deployed to the currently active AWS account.
bucket – S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.
image_url – URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.
region_name – Name of the AWS region to which to deploy the application.
instance_type – The type of SageMaker ML instance on which to deploy the model. For a list of supported instance types, see https://aws.amazon.com/sagemaker/pricing/instance-types/.
instance_count – The number of SageMaker ML instances on which to deploy the model.
vpc_config –
A dictionary specifying the VPC configuration to use when creating the new SageMaker model associated with this batch transform job. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html.

import mlflow.sagemaker as mfs

vpc_config = {
    "SecurityGroupIds": [
        "sg-123456abc",
    ],
    "Subnets": [
        "subnet-123456abc",
    ],
}
mfs.deploy_transform_job(..., vpc_config=vpc_config)

flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.
archive – If True, resources like SageMaker models and model artifacts in S3 are preserved after the batch transform job finishes. If False, these resources are deleted. In order to use archive=False, deploy_transform_job() must be executed synchronously with synchronous=True.
synchronous – If True, this function will block until the deployment process succeeds or encounters an irrecoverable failure. If False, this function will return immediately after starting the deployment process. It will not wait for the deployment process to complete; in this case, the caller is responsible for monitoring the health and status of the pending deployment via native SageMaker APIs or the AWS console.
timeout_seconds – If synchronous is True, the deployment process will return after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the health and status of the pending deployment using native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored.
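The interaction between synchronous and timeout_seconds amounts to polling until a definitive result or a deadline, whichever comes first. The following is a generic sketch of that pattern and not MLflow's code: wait_for_completion, the get_status callable, the poll interval, and the status strings are hypothetical stand-ins for a real status lookup against SageMaker.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("Completed", "Failed")  # illustrative status names


def wait_for_completion(
    get_status: Callable[[], str],
    timeout_seconds: float = 1200.0,
    poll_interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status until it is terminal or timeout_seconds elapse."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status  # definitive success or failure
        if time.monotonic() >= deadline:
            # No definitive result in time: the caller must keep monitoring.
            return "TimedOut"
        sleep(poll_interval)
```

A "TimedOut" result here corresponds to the case described above where the function returns without a definitive outcome and the caller takes over monitoring through native SageMaker APIs or the AWS console.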
-
mlflow.sagemaker.
push_image_to_ecr
(image='mlflow-pyfunc')[source] Push local Docker image to AWS ECR.
The image is pushed under the currently active AWS account and to the currently active AWS region.
- Parameters
image – Docker image name.
-
mlflow.sagemaker.
push_model_to_sagemaker
(model_name, model_uri, execution_role_arn=None, assume_role_arn=None, bucket=None, image_url=None, region_name='us-west-2', vpc_config=None, flavor=None)[source] Create a SageMaker Model from an MLflow model artifact. The currently active AWS account must have correct permissions set up.
- Parameters
model_name – Name of the SageMaker model.
model_uri –
The location, in URI format, of the MLflow model to deploy to SageMaker. For example:
/Users/me/path/to/local/model
relative/path/to/local/model
s3://my_bucket/path/to/model
runs:/<mlflow_run_id>/run-relative/path/to/model
models:/<model_name>/<model_version>
models:/<model_name>/<stage>
For more information about supported URI schemes, see Referencing Artifacts.
execution_role_arn –
The name of an IAM role granting the SageMaker service permissions to access the specified Docker image and S3 bucket containing MLflow model artifacts. If unspecified, the currently assumed role will be used. This execution role is passed to the SageMaker service when creating a SageMaker model from the specified MLflow model. It is passed as the ExecutionRoleArn parameter of the SageMaker CreateModel API call. This role is not assumed for any other call. For more information about SageMaker execution roles for model creation, see https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html.
assume_role_arn – The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If unspecified, SageMaker will be deployed to the currently active AWS account.
bucket – S3 bucket where model artifacts will be stored. Defaults to a SageMaker-compatible bucket name.
image_url – URL of the ECR-hosted Docker image the model should be deployed into, produced by mlflow sagemaker build-and-push-container. This parameter can also be specified by the environment variable MLFLOW_SAGEMAKER_DEPLOY_IMG_URL.
region_name – Name of the AWS region to which to deploy the application.
vpc_config –
A dictionary specifying the VPC configuration to use when creating the new SageMaker model. The acceptable values for this parameter are identical to those of the VpcConfig parameter in the SageMaker boto3 client’s create_model method. For more information, see https://docs.aws.amazon.com/sagemaker/latest/dg/API_VpcConfig.html.
    import mlflow.sagemaker as mfs

    vpc_config = {
        "SecurityGroupIds": [
            "sg-123456abc",
        ],
        "Subnets": [
            "subnet-123456abc",
        ],
    }
    mfs.push_model_to_sagemaker(..., vpc_config=vpc_config)
flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.
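The flavor rules described above can be sketched roughly as follows. This is only an illustration of the documented behavior, not MLflow's actual implementation: the helper select_flavor is hypothetical, and SUPPORTED_FLAVORS is an assumed stand-in for mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS, whose real contents depend on the installed MLflow version.

```python
from typing import Optional, Sequence

# Assumption: illustrative stand-in for mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS;
# the real list depends on the installed MLflow version.
SUPPORTED_FLAVORS = ["python_function"]


def select_flavor(available: Sequence[str], requested: Optional[str] = None) -> str:
    """Pick a deployment flavor (hypothetical helper mirroring the documented rules)."""
    if requested is None:
        # Automatic selection: first supported flavor that the model provides.
        for name in SUPPORTED_FLAVORS:
            if name in available:
                return name
        raise ValueError("model has no flavor supported for deployment")
    if requested not in SUPPORTED_FLAVORS:
        raise ValueError(f"flavor {requested!r} is not supported for deployment")
    if requested not in available:
        raise ValueError(f"flavor {requested!r} is not present in the model")
    return requested


# Automatic selection for a model that carries two flavors.
print(select_flavor(["sklearn", "python_function"]))
```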
-
mlflow.sagemaker.run_local(name, model_uri, flavor=None, config=None)
Serve the model locally in a SageMaker-compatible Docker container.
Note that models deployed locally cannot be managed by other deployment APIs (e.g. update_deployment, delete_deployment, etc.).
- Parameters
name – Name of the local serving application.
model_uri –
The location, in URI format, of the MLflow model to deploy locally. For example:
/Users/me/path/to/local/model
relative/path/to/local/model
s3://my_bucket/path/to/model
runs:/<mlflow_run_id>/run-relative/path/to/model
models:/<model_name>/<model_version>
models:/<model_name>/<stage>
For more information about supported URI schemes, see Referencing Artifacts.
flavor – The name of the flavor of the model to use for deployment. Must be either None or one of mlflow.sagemaker.SUPPORTED_DEPLOYMENT_FLAVORS. If None, a flavor is automatically selected from the model’s available flavors. If the specified flavor is not present or not supported for deployment, an exception will be thrown.
config –
Configuration parameters. The supported parameters are:
image: The name of the Docker image to use for model serving. Defaults to "mlflow-pyfunc".
port: The port at which to expose the model server on the local host. Defaults to 5000.
    from mlflow.models import build_docker
    from mlflow.deployments import get_deploy_client

    build_docker(name="mlflow-pyfunc")

    client = get_deploy_client("sagemaker")

    client.run_local(
        name="my-local-deployment",
        model_uri="/mlruns/0/abc/model",
        flavor="python_function",
        config={
            "port": 5000,
            "image": "mlflow-pyfunc",
        },
    )

    mlflow models build-docker --name "mlflow-pyfunc"
    mlflow deployments run-local --target sagemaker \
        --name my-local-deployment \
        --model-uri "/mlruns/0/abc/model" \
        --flavor python_function \
        -C port=5000 \
        -C image="mlflow-pyfunc"
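To make the documented defaults for config concrete, here is a minimal sketch of how user-supplied settings might be layered over them. The helper resolve_local_config is hypothetical and not part of MLflow. It also assumes that values supplied through -C key=value may arrive as strings, so the port is normalized to an integer.

```python
from typing import Dict, Optional

# Defaults as documented for the run_local config parameter.
DEFAULT_CONFIG = {"image": "mlflow-pyfunc", "port": 5000}


def resolve_local_config(config: Optional[Dict[str, object]] = None) -> Dict[str, object]:
    """Merge user-supplied settings over the documented defaults (hypothetical helper)."""
    resolved = dict(DEFAULT_CONFIG)
    resolved.update(config or {})
    # Assumption: command-line values may arrive as strings, so coerce the port.
    resolved["port"] = int(resolved["port"])
    return resolved


# Override only the port; the image falls back to its default.
print(resolve_local_config({"port": "8080"}))
```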
-
mlflow.sagemaker.target_help()
Provide help information for the SageMaker deployment client.
-
mlflow.sagemaker.terminate_transform_job(job_name, region_name='us-west-2', assume_role_arn=None, archive=False, synchronous=True, timeout_seconds=300)
Terminate a SageMaker batch transform job.
- Parameters
job_name – Name of the deployed SageMaker batch transform job.
region_name – Name of the AWS region in which the batch transform job is deployed.
assume_role_arn – The name of an IAM cross-account role to be assumed to deploy SageMaker to another AWS account. If unspecified, SageMaker will be deployed to the currently active AWS account.
archive – If True, resources associated with the specified batch transform job, such as its associated models and model artifacts, are preserved. If False, these resources are deleted. In order to use archive=False, terminate_transform_job() must be executed synchronously with synchronous=True.
synchronous – If True, this function blocks until the termination process succeeds or encounters an irrecoverable failure. If False, this function returns immediately after starting the termination process. It will not wait for the termination process to complete; in this case, the caller is responsible for monitoring the status of the termination process via native SageMaker APIs or the AWS console.
timeout_seconds – If synchronous is True, the termination process returns after the specified number of seconds if no definitive result (success or failure) is achieved. Once the function returns, the caller is responsible for monitoring the status of the termination process via native SageMaker APIs or the AWS console. If synchronous is False, this parameter is ignored.
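When the function returns before a definitive result, the caller has to monitor the job. One possible way to do that is a simple polling loop, sketched below. The helper wait_for_transform_job and its parameters are hypothetical. The status source is injected as a callable so the loop can be tried without AWS access, and the terminal values are assumed to match the TransformJobStatus values reported by SageMaker's DescribeTransformJob call.

```python
import time
from typing import Callable

# Assumed terminal TransformJobStatus values from SageMaker's DescribeTransformJob.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_transform_job(
    get_status: Callable[[], str],
    timeout_seconds: float = 300,
    poll_interval: float = 5.0,
) -> str:
    """Poll until the job reaches a terminal status or the timeout elapses.

    Hypothetical helper: get_status is any zero-argument callable that
    returns the current status string.
    """
    deadline = time.monotonic() + timeout_seconds
    status = get_status()
    while status not in TERMINAL_STATUSES and time.monotonic() < deadline:
        time.sleep(poll_interval)
        status = get_status()
    # May still be non-terminal if the timeout was reached first.
    return status


# Stand-in status source for demonstration; replace with a real lookup.
fake_statuses = iter(["Stopping", "Stopping", "Stopped"])
final = wait_for_transform_job(
    lambda: next(fake_statuses), timeout_seconds=10, poll_interval=0.01
)
print(final)
```

With boto3, the callable could wrap the SageMaker client's describe_transform_job(TransformJobName=job_name) call and read the "TransformJobStatus" field of the response.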