Monitor SageMaker models in MLOps

This topic outlines how to monitor an AWS SageMaker model developed and deployed on AWS for real-time API scoring. DataRobot can monitor models through a remote agent architecture that does not require a direct connection between an AWS model and DataRobot. This topic explains how to add data to a monitoring queue. See the Monitor with serverless MLOps agents topic to learn how to consume data from a queue.

Technical architecture

You can construct this deployment architecture by following the steps in this article. Details of each component are reviewed in the list below:

  1. An API client assembles a single-line JSON request of raw data input for scoring, which is posted to an API Gateway-exposed endpoint.
  2. The API Gateway acts as a pass-through and submits the request to an associated Lambda function for handling.
  3. Logic in the Lambda processes the raw input data and parses it into the format required to score through the SageMaker endpoint. This example parses the data into a headerless CSV for an XGBoost model. The SageMaker endpoint is then invoked.
  4. The SageMaker endpoint satisfies the request by passing it to a standing deployed EC2 instance hosting the real-time model. The model deployment line in the AWS code from the Community AI Engineering GitHub repo (xgb.deploy) handles standing up this machine and loading the trained model, hosted in AWS ECR, onto it.
  5. The raw score is processed by the Lambda; in this example, a threshold is applied to select a binary classification label.
  6. Timing, input data, and model results are written to an SQS queue.
  7. The processed response is sent back to the API Gateway.
  8. The processed response is passed back to the client.
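The client side of this flow (steps 1 and 8) can be sketched as a plain HTTP POST; the invoke URL below is a placeholder for the API Gateway endpoint created later in this topic, and the payload fields are abbreviated:

```python
import json
import urllib.request

# Placeholder for the API Gateway invoke URL created later in this topic.
API_URL = "https://abc123.execute-api.us-east-1.amazonaws.com/test/predict"

# Single-line JSON request of raw (unprocessed) input data.
payload = {"data": {"age": 56, "job": "housemaid", "pdays": 999}}
request = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# The gateway passes the request through to the Lambda and returns the
# processed label, e.g. "no". Uncomment to call a live deployment:
# print(urllib.request.urlopen(request).read().decode())
```

Note that the client submits raw fields only; all preprocessing happens server-side in the Lambda.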

Create a custom SageMaker model

This article is based on the SageMaker notebook example in the AWS GitHub repo.

The use case aims to predict which customers will respond positively to a direct marketing campaign. The code has been updated to conform to the v2 version of the SageMaker SDK and can be found in the DataRobot Community AI Engineering GitHub repo.

Completing the notebook in AWS SageMaker will result in a deployed model at a SageMaker endpoint named xgboost-direct-marketing hosted on a standing ml.m4.xlarge instance.


The endpoint expects fully prepared and preprocessed data (one-hot encoding applied, for example) in the same order it was provided in during training.
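As a sketch of what that preparation involves (the column names below are illustrative, not the model's full feature list), raw categoricals must be one-hot encoded and then reindexed to the exact training-time column order:

```python
import pandas as pd

# A raw record with a few illustrative columns (not the full feature set).
record = pd.DataFrame([{"age": 29, "job": "technician", "marital": "single"}])

# One-hot encode categoricals, as was done in the training notebook.
encoded = pd.get_dummies(record, dtype=int)

# Reindex against the training-time column list so the order matches exactly
# and any dummy columns absent from this record are filled with 0.
training_columns = ["age", "job_student", "job_technician",
                    "marital_married", "marital_single"]
encoded = encoded.reindex(columns=training_columns, fill_value=0)

# Headerless CSV row, the format the XGBoost endpoint expects.
csv_row = encoded.to_csv(header=False, index=False).strip()
print(csv_row)  # 29,0,1,0,1
```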

There are several ways to test the SageMaker endpoint; the snippet below is a short Python script that scores a record from the validation set (the target column has been dropped).

import boto3
import json

runtime = boto3.Session().client('sagemaker-runtime', use_ssl=True)
endpoint_name = 'xgboost-direct-marketing'

payload = '29,2,999,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0'

response = runtime.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType='text/csv',
                                   Body=payload)
result = json.loads(response['Body'].read())
print(result)

Create an external deployment in DataRobot

Inside DataRobot, a deployment entry must be created to monitor the SageMaker model. Data and statistics are reported to this deployment for processing, visualization, and analysis. To create one:

  1. Navigate to Model Registry > Model Packages tab.
  2. Click Add New Package.
  3. Select New external model package.

  4. Provide the required information as shown below.

  5. Download the bank-additional-full.csv file from Community GitHub.

  6. Upload bank-additional-full.csv as the training dataset. Note that the model in this example was not retrained on 100% of the data before full deployment; in real-world machine learning, doing so is a good practice to consider.
  7. Click Create package to complete creation of the model package and add it to the Model Registry.
  8. Locate the package in the registry and click on the Actions menu for the package.

  9. Toggle on Enable target monitoring and Enable feature drift tracking and select Deploy model.

    You can configure additional prediction environment metadata if needed, providing the details of where the external model resides (AWS in this case).

    Upon completion of deployment creation, some ID values need to be retrieved. These will be associated with the model in SageMaker.

  10. Under the deployment, navigate to the Predictions > Monitoring tab, and view the Monitoring Code. Copy the values for MLOPS_DEPLOYMENT_ID and MLOPS_MODEL_ID.


Note that the MLOPS_DEPLOYMENT_ID is associated with the entry within model monitoring, while the MLOPS_MODEL_ID is an identifier provided for the actual scoring model behind it. The MLOPS_DEPLOYMENT_ID should stay static. However, you may replace the SageMaker model at some point. If so, one of two actions should be taken:

  • Create a completely new external deployment in DataRobot following the same steps as above.
  • Register a new model package and replace the model currently hosted at this deployment with that new package.

With the second option, a new MLOPS_MODEL_ID is assigned and must be used to update the Lambda environment variables. The same MLOPS_DEPLOYMENT_ID entry in DataRobot will then show statistics for both models under one entry and note when the change occurred.
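Swapping in a new MLOPS_MODEL_ID can be scripted against the Lambda configuration with boto3; a minimal sketch, where the function name and the new ID value are placeholders:

```python
def updated_variables(current_vars, new_model_id):
    """Merge a replacement MLOPS_MODEL_ID into an existing variable map."""
    merged = dict(current_vars)
    merged["MLOPS_MODEL_ID"] = new_model_id
    return merged

def swap_model_id(function_name, new_model_id):
    """Point the Lambda at the new model package without touching its code."""
    import boto3
    client = boto3.client("lambda")
    config = client.get_function_configuration(FunctionName=function_name)
    merged = updated_variables(config["Environment"]["Variables"], new_model_id)
    client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": merged},
    )

# swap_model_id("lambda-direct-marketing", "<new MLOPS_MODEL_ID>")
```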

Create an IAM role

Lambda functions use a role to score data through the SageMaker endpoint. To create a role:

  1. Navigate to IAM Service within the AWS Console.
  2. Click Create Role and choose Lambda for the use case. Then, click Next: Permissions.
  3. Select Create Policy, choose the JSON tab, and paste the following snippet:

        {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "VisualEditor0",
                    "Effect": "Allow",
                    "Action": "sagemaker:InvokeEndpoint",
                    "Resource": "*"
                }
            ]
        }
  4. Select Review policy and name the policy lambda_sagemaker_execution_policy.

  5. Click Create policy.
  6. Return to the role-creation tab to attach the new policy to the role. Select the refresh button and filter on the string sagemaker.
  7. Select the policy from the list.
  8. Click Next: Tags, set any desired tags, and select Next: Review.
  9. Name the role lambda_sagemaker_execution_role and click Create role.
  10. This role requires additional resources so that the Lambda can send reporting data to an SQS queue. See Monitor with serverless MLOps Agents to create a queue to receive reporting data. The queue created in that topic, sqs_mlops_data_queue, is also used here.
  11. To add the additional resources, view the IAM role lambda_sagemaker_execution_role.
  12. Select Add inline policy, and perform a search for the SQS service.
  13. Select List, Read, and Write access levels.
  14. Optionally, deselect ReceiveMessage under the Read heading so that this role does not move items off the queue.
  15. Expand Resources to limit the role to only use the specific data queue and populate the ARN of the queue.

  16. Click Review policy and name it lambda_agent_sqs_write_policy.

  17. To complete the policy, select Create policy.


    Note that additional privileges are required to allow Lambda to write log entries to CloudWatch.

  18. Select Attach policies.

  19. Filter on AWSLambdaBasicExecutionRole, select that privilege, and click Attach policy. The completed permissions for the role should look similar to the example below.
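Under the access-level selections described above, the resulting lambda_agent_sqs_write_policy might look similar to the following JSON; the account ID, region, and exact action list are assumptions based on the console selections (ReceiveMessage deliberately omitted):

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
                "sqs:ListQueueTags",
                "sqs:SendMessage"
            ],
            "Resource": "arn:aws:sqs:us-east-1:123456789012:sqs_mlops_data_queue"
        }
    ]
}
```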

Create a layer for the Lambda

Lambda allows for layers: additional libraries that can be used by Lambda at runtime.

  1. Download the MLOps agent library from the DataRobot application in the Profile > Developer Tools menu.


    The package used in this example is datarobot_mlops_package-6.3.3-488. It includes numpy and pandas as well. These packages were also used in data prep for the model, and the same code is used in the Lambda function.

  2. The Lambda environment is Python 3.7 on Amazon Linux. To ensure a layer will work with the Lambda, you can first create one on an Amazon Linux EC2 instance. Instructions to install Python 3 on Amazon Linux are available here.

  3. Once the model package is on the server, perform the following steps.

    gunzip datarobot_mlops_package-6.3.3-488.tar.gz
    tar -xvf datarobot_mlops_package-6.3.3-488.tar
    cd datarobot_mlops_package-6.3.3
    python3 -m venv my_agent/env
    source my_agent/env/bin/activate
    pip install lib/datarobot_mlops-*-py2.py3-none-any.whl
    cd my_agent/env
    mkdir -p python/lib/python3.7/site-packages
    cp -r lib/python3.7/site-packages/* python/lib/python3.7/site-packages/.
    zip -r9 ../python37_agent633_488.zip python
    cd ..
    aws s3 cp python37_agent633_488.zip s3://some-bucket/layers/
  4. In AWS, navigate to Lambda > Additional resources > Layers.

  5. Select Create layer.
  6. Name the layer python37_agent633_488 and optionally choose a Python 3.7 runtime.
  7. Select Upload a file from S3 and provide the S3 address of the uploaded file in s3://some-bucket/layers/.
  8. Select Create layer to save the configuration.

Create a Lambda

The following steps outline how to create a Lambda that calls the SageMaker runtime invoke endpoint. The endpoint accepts only fully prepared, ready-to-score data, which is not friendly to API clients. The following example shows a record that is ready to be scored:


Another example can be found on AWS here. Performing this data prep in the client application is one option; this topic instead moves it into the Lambda so that clients can submit raw data.

Create a Lambda to process the actual data used by a client to make it ready for scoring. The returned score will be decoded as well, making it much friendlier for calling applications. To do so:

  1. Navigate to the AWS Lambda service in the console.
  2. Click Create function.
  3. Select Author from scratch.
  4. Choose the Python 3.7 runtime.
  5. Under Permissions, choose the default execution role to be lambda_sagemaker_execution_role.
  6. Name the function lambda-direct-marketing and select Create function.
  7. On the next screen, edit the environment variables as desired.
  8. Replace the values as appropriate for the MLOPS_DEPLOYMENT_ID and MLOPS_MODEL_ID variables.
  9. Provide the URL for the AWS SQS queue to use as a reporting channel.

    Name                  Value
    ENDPOINT_NAME         xgboost-direct-marketing
    MLOPS_DEPLOYMENT_ID   1234567890
    MLOPS_MODEL_ID        12345

    The Lambda designer window also has a location for selecting layers.

  10. Choose this box and then select Add a layer from the layers form.

  11. Select Custom layers and choose the created layer.


    Only layers that have a runtime matching the Lambda runtime show up in this list, although a layer can be explicitly chosen by the Amazon Resource Name if you opt to specify one.

  12. Use the following code for the Lambda body:

import os
import io
import boto3
import json
import csv
import time
import pandas as pd
import numpy as np
from datarobot.mlops.mlops import MLOps

# grab environment variables
ENDPOINT_NAME = os.environ['ENDPOINT_NAME']
runtime = boto3.client('runtime.sagemaker')

def lambda_handler(event, context):
    # this is designed to work with only one record, supplied as json

    # start the clock
    start_time = time.time()

    # parse input data
    print("Received event: " + json.dumps(event, indent=2))
    parsed_event = json.loads(json.dumps(event))
    payload_data = parsed_event['data']
    data = pd.DataFrame(payload_data, index=[0])
    input_data = data

    # repeat data steps from training notebook
    data['no_previous_contact'] = np.where(data['pdays'] == 999, 1, 0)                                 # Indicator variable to capture when pdays takes a value of 999
    data['not_working'] = np.where(np.in1d(data['job'], ['student', 'retired', 'unemployed']), 1, 0)   # Indicator for individuals not actively employed
    model_data = pd.get_dummies(data)     
    model_data = model_data.drop(['duration', 'emp.var.rate', 'cons.price.idx', 'cons.conf.idx', 'euribor3m', 'nr.employed'], axis=1)

    # xgb sagemaker endpoint features
    # order/type required as was deployed in sagemaker notebook
    model_features = ['age', 'campaign', 'pdays', 'previous', 'no_previous_contact',
       'not_working', 'job_admin.', 'job_blue-collar', 'job_entrepreneur',
       'job_housemaid', 'job_management', 'job_retired', 'job_self-employed',
       'job_services', 'job_student', 'job_technician', 'job_unemployed',
       'job_unknown', 'marital_divorced', 'marital_married', 'marital_single',
       'marital_unknown', 'education_basic.4y', 'education_basic.6y',
       'education_basic.9y', 'education_high.school', 'education_illiterate',
       'education_professional.course', 'education_university.degree',
       'education_unknown', 'default_no', 'default_unknown', 'default_yes',
       'housing_no', 'housing_unknown', 'housing_yes', 'loan_no',
       'loan_unknown', 'loan_yes', 'contact_cellular', 'contact_telephone',
       'month_apr', 'month_aug', 'month_dec', 'month_jul', 'month_jun',
       'month_mar', 'month_may', 'month_nov', 'month_oct', 'month_sep',
       'day_of_week_fri', 'day_of_week_mon', 'day_of_week_thu',
       'day_of_week_tue', 'day_of_week_wed', 'poutcome_failure',
       'poutcome_nonexistent', 'poutcome_success']

    # create base generic single row to score with defaults
    feature_dict = { i : 0 for i in model_features }
    feature_dict['pdays'] = 999

    # get column values from received and processed data
    input_features = model_data.columns

    # replace value in to be scored record, if input data provided a value
    for feature in input_features:
        if feature in feature_dict:
            feature_dict[feature]  = model_data[feature]

    # make a csv string to score
    payload = pd.DataFrame(feature_dict).to_csv(header=None, index=False).strip('\n').split('\n')[0]
    print("payload is:" + str(payload))

    # stamp for data prep
    prep_time = time.time()
    print('data prep took: ' + str(round((prep_time - start_time) * 1000, 1)) + 'ms')

    response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                       ContentType='text/csv',
                                       Body=payload)

    # process returned data
    pred = json.loads(response['Body'].read().decode())
    #pred = int(result['predictions'][0]['score'])

    # if scored value is > 0.5, then return a 'yes' that the client will subscribe to a term deposit
    predicted_label = 'yes' if pred >= 0.5 else 'no'

    # initialize mlops monitor
    m = MLOps().init()

    # MLOPS: report features and predictions for the deployment
    m.report_predictions_data(features_df = input_data
        , class_names = ['yes', 'no']
        , predictions = [[pred, 1-pred]])    # yes, no

    # report lambda timings (excluding lambda startup and imports...)
    # MLOPS: report deployment metrics: number of predictions and execution time
    end_time = time.time()
    m.report_deployment_stats(1, (end_time - start_time) * 1000)

    print("pred is: " + str(pred))
    print("label is: " + str(predicted_label))

    return predicted_label

Test the Lambda

  1. Click Configure test events in the upper right corner of the Lambda screen to configure a test JSON record.

  2. Use the following JSON record format:

      {
        "data": {
          "age": 56,
          "job": "housemaid",
          "marital": "married",
          "education": "basic.4y",
          "default": "no",
          "housing": "no",
          "loan": "no",
          "contact": "telephone",
          "month": "may",
          "day_of_week": "mon",
          "duration": 261,
          "campaign": 1,
          "pdays": 999,
          "previous": 0,
          "poutcome": "nonexistent",
          "emp.var.rate": 1.1,
          "cons.price.idx": 93.994,
          "cons.conf.idx": -36.4,
          "euribor3m": 4.857,
          "nr.employed": 5191
        }
      }
  3. Select Test to score a record through the Lambda service and SageMaker endpoint.

Resource settings and performance considerations

Serverless computational resources can be allocated from 128MB to 10240MB, which you can change on the Lambda console under Basic settings. This results in the allocation of anywhere from a partial vCPU to six full vCPUs during each Lambda run. Lambda cold and warm starts, as well as the EC2 host sizing and scaling for the SageMaker endpoint, are beyond the scope of this topic, but the resources allocated to the Lambda itself impact pre- and post-scoring processing and overall Lambda performance.

Allocating 128MB for this code produces noticeably slower processing times, although diminishing returns are to be expected as RAM and CPU are upsized. For this example, 1706MB (and one full vCPU) provided good results.

Expose the Lambda via API Gateway

  1. Navigate to the API Gateway service in AWS and click Create API.
  2. Choose to build a REST API and name it lambda-direct-marketing-api.
  3. Click Create API again.
  4. Under the Resources section of the entry, choose Actions > Create Resource.
  5. Name it predict and select Create Resource.
  6. Highlight the resource, choose Actions > Create Method, and select a POST method.
  7. Choose the Integration Type Lambda Function and the Lambda Function lambda-direct-marketing, then click Save.


    You can verify the integration by selecting the TEST button and supplying the same payload used in the Lambda test event (see Test the Lambda).

  8. Next, choose Actions > Deploy API, choose a Stage name (e.g., "test"), and click Deploy.

    The model is now deployed and available via the Invoke URL provided after deployment.

Test the exposed API

The same test record used above (for Test the Lambda) can be used to score the model via an HTTP request. Below is an example of doing so using curl and an inline JSON record.

Expected result: no

curl -X POST " <> " --data '{"data": {"age": 56,
"job": "housemaid", "marital": "married", "education": "basic.4y", "default": "no", "housing": "no", "loan": "no",
"contact": "telephone", "month": "may", "day_of_week": "mon", "duration": 261, "campaign": 1, "pdays": 999,
"previous": 0, "poutcome": "nonexistent", "emp.var.rate": 1.1, "cons.price.idx": 93.994, "cons.conf.idx": -36.4,
"euribor3m": 4.857, "nr.employed": 5191}}'

Expected result: yes

curl -X POST " <> " --data '{"data": {"age": 34,
"job": "blue-collar", "marital": "married", "education": "", "default": "no", "housing": "yes", "loan": "no",
"contact": "cellular", "month": "may", "day_of_week": "tue", "duration": 863, "campaign": 1, "pdays": 3, "previous":
2, "poutcome": "success", "emp.var.rate": -1.8, "cons.price.idx": 92.893, "cons.conf.idx": -46.2, "euribor3m": 1.344,
"nr.employed": 5099.1}}'

Review and monitor the deployment in DataRobot

Once data from the queue reports back to DataRobot, the external model deployment contains metrics relevant to the model and its predictions. You can select the deployment in the DataRobot UI to view operational service health:

You can also view data drift metrics that compare data in scoring requests to that of the original training set.

Not only can DataRobot be used to build, host, and monitor its own models (with its own resources or deployed elsewhere), but, as shown here, it can also be used to monitor completely custom models created and hosted on external architecture. In addition to service health and drift tracking statistics of unprocessed features, models with association IDs and actual results can be used to track model accuracy as well.

Updated May 11, 2023