Deploy and monitor DataRobot models in Azure Kubernetes Service
Availability information
The MLOps model package export feature is off by default. Contact your DataRobot representative or administrator for information on enabling this feature for DataRobot MLOps.
Feature flag: Enable MMM model package export
This page shows how to deploy machine learning models on Azure Kubernetes Service (AKS) to create production scoring pipelines with DataRobot's MLOps Portable Prediction Server (PPS).
DataRobot Automated Machine Learning provides a dedicated prediction server as a low-latency, synchronous REST API suitable for real-time predictions. The DataRobot MLOps PPS extends this functionality to serve ML models in container images, giving you portability and control over your ML model deployment architecture.
A containerized PPS is well-suited to deployment in a Kubernetes cluster, allowing you to take advantage of the architecture's autoscaling and high availability. The combination of PPS and Kubernetes is ideal for volatile, irregular workloads such as those found in IoT use cases.
To build a PPS image that includes your model, create a Dockerfile that starts from the PPS base image and copies your exported model package (.mlpkg) into it. For more information on how to structure the Dockerfile, see the Docker build documentation.
For the COPY instruction to work, the .mlpkg file must be in the same directory as the Dockerfile. After creating your Dockerfile, run docker build to create a new image that includes the model:
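A minimal sketch of such a Dockerfile, assuming the PPS base image is tagged datarobot/datarobot-portable-prediction-api, the model package is named model.mlpkg, and /opt/ml/model is the directory the PPS reads packages from (all three are placeholders to verify against your environment):

```dockerfile
# Start from the Portable Prediction Server base image
# (placeholder tag; use the PPS image you obtained from DataRobot).
FROM datarobot/datarobot-portable-prediction-api:latest

# Bake the exported model package into the image so the container
# can serve the model without mounting external volumes.
COPY model.mlpkg /opt/ml/model/
```

Then build the image from the directory that contains both files (the image name portablepredictionserver is an example):

```
docker build -t portablepredictionserver .
```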
To push your new image to Azure Container Registry (ACR), log in with the following command (replace <DOCKER_USERNAME> with your previously selected repository name):

```
docker login <DOCKER_USERNAME>.azurecr.io
```

The password is the administrator password you created for the ACR.
Once logged in, make sure your Docker image is correctly tagged, and then push it to the repo with the following command (replace <DOCKER_USERNAME> with your previously selected repository name):
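For example, assuming the local image from the build step is named portablepredictionserver:

```
docker tag portablepredictionserver <DOCKER_USERNAME>.azurecr.io/portablepredictionserver:latest
docker push <DOCKER_USERNAME>.azurecr.io/portablepredictionserver:latest
```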
Create a Docker registry secret so that AKS can pull images from the private repository. In the command below, replace the placeholder values with your actual credentials:
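A sketch of that command, assuming you name the secret regcred; <ACR_PASSWORD> and <EMAIL> stand in for your ACR administrator password and email address:

```
kubectl create secret docker-registry regcred \
  --docker-server=<DOCKER_USERNAME>.azurecr.io \
  --docker-username=<DOCKER_USERNAME> \
  --docker-password=<ACR_PASSWORD> \
  --docker-email=<EMAIL>
```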
Deploy your Portable Prediction Server image. There are many ways to deploy applications, but the easiest method is via the Kubernetes dashboard. Start the Kubernetes dashboard with the following command:
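One way to open the dashboard, assuming the dashboard add-on is enabled on your cluster (the resource group and cluster names are placeholders, and newer AKS releases may deprecate this command in favor of the web-based Kubernetes resource view):

```
az aks browse --resource-group <RESOURCE_GROUP> --name <CLUSTER_NAME>
```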
To test the model, download the DataRobot PPS Examples Postman Collection and update the hostname from localhost to the external IP address assigned to your service. You can find the IP address on the Services tab of your Kubernetes dashboard.
To make a prediction, execute the make predictions request:
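If you prefer the command line to Postman, here is an equivalent request sketch, assuming the PPS runs in single-model mode on port 8080 and scoring_data.csv contains the rows to score:

```
curl -X POST http://<EXTERNAL_IP>:8080/predictions \
  -H "Content-Type: text/csv" \
  --data-binary @scoring_data.csv
```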
Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment depending on CPU utilization or other selected metrics. The Metrics Server provides resource utilization to Kubernetes and is automatically deployed in AKS clusters.
In the previous sections, you deployed one pod for your service and defined only the minimum requirement for CPU and memory resources.
To use the autoscaler, you must define CPU requests and utilization limits.
By default, the Portable Prediction Server spins up one worker, which means it can handle only one HTTP request at a time. The number of workers you can run, and thus the number of HTTP requests the server can handle simultaneously, is tied to the number of CPU cores available to the container.
Because you set the minimum CPU requirement to 1, you can now set the limit to 2 in the patchSpec.yaml file:
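A sketch of what that patch can look like, assuming the deployment and container are both named portablepredictionserver and the memory values shown are example settings:

```yaml
spec:
  template:
    spec:
      containers:
      - name: portablepredictionserver
        resources:
          requests:
            cpu: 1
            memory: 250Mi
          limits:
            cpu: 2
            memory: 8Gi
```

Apply the patch to the running deployment:

```
kubectl patch deployment portablepredictionserver --patch "$(cat patchSpec.yaml)"
```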
This enables Kubernetes to autoscale the number of pods in the portablepredictionserver deployment. If the average CPU utilization across all pods exceeds 50% of their requested usage, the autoscaler scales the deployment from a minimum of one pod up to a maximum of ten.
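A sketch of creating that autoscaler imperatively (the deployment name is a placeholder; you can also declare an equivalent HorizontalPodAutoscaler manifest):

```
kubectl autoscale deployment portablepredictionserver --cpu-percent=50 --min=1 --max=10
```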
To run a load test, download the sample JMeter test plan below and update the URLs and authentication settings. Run it with the following command:
```
jmeter -n -t LoadTesting.jmx -l results.csv
```
The output will look similar to the following example:
Report usage to DataRobot MLOps via monitoring agents
After deploying your model to AKS, you can monitor it, along with all of your other models, in one central dashboard by reporting prediction telemetry to your DataRobot MLOps server and dashboards.
Navigate to Model Registry > Model Packages, click Add New Package, and follow the instructions in the documentation.
Select Add new external model package and specify a package name and description (1 and 2), upload the corresponding training data for drift tracking (3), and identify the model location (4), target (5), environment (6), and prediction type (7), then click Create package (8).
After creating the external model package, note the model ID that appears in the URL.
While still on the Model Registry page and within the expanded new package, select the Deployments tab and click Create new deployment.
The deployment page loads prefilled with information from the model package you created.
Complete any missing information for the deployment and click Create deployment.
Navigate to Deployments > Overview and copy the deployment ID (from the URL).
Now that you have your model ID and deployment ID, you can report predictions as described in the next section.
Even though you deployed the model outside of DataRobot on a Kubernetes cluster (AKS), you can monitor it like any other model and track service health and data drift in one central dashboard.
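As a rough sketch of what that reporting can look like with the datarobot-mlops Python package (the spooler directory, metric values, and data are placeholder assumptions; see the monitoring agent documentation for the supported channel configurations and exact API):

```python
import pandas as pd
from datarobot.mlops.mlops import MLOps

# IDs captured from the DataRobot UI in the steps above.
DEPLOYMENT_ID = "<DEPLOYMENT_ID>"
MODEL_ID = "<MODEL_ID>"

# Initialize the MLOps client; the filesystem spooler is one of
# several channels the monitoring agent can forward to DataRobot.
mlops = (
    MLOps()
    .set_deployment_id(DEPLOYMENT_ID)
    .set_model_id(MODEL_ID)
    .set_filesystem_spooler("/tmp/ta")
    .init()
)

# Report request volume and latency (100 predictions, 85 ms) so the
# deployment's service health dashboard has data to display.
mlops.report_deployment_stats(100, 85)

# Report the scored features and predictions for data drift tracking.
mlops.report_predictions_data(
    features_df=pd.DataFrame({"feature_1": [0.1] * 100}),
    predictions=[0.7] * 100,
)

mlops.shutdown()
```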