The following steps describe how to deploy the MLOps agent on Google Kubernetes Engine (GKE) with Pub/Sub as a spooler. This allows you to monitor a custom Python model developed outside DataRobot. The custom model is scored on the local machine and sends its statistics to Google Cloud Platform (GCP) Pub/Sub. Finally, the agent (deployed on GKE) consumes this data and sends it to the DataRobot MLOps dashboard.
DataRobot MLOps offers the ability to monitor all your ML models (trained in DataRobot or outside) in a centralized dashboard with the DataRobot MLOps agent. The agent, a Java utility running in parallel with the deployed model, can monitor models developed in Java, Python, and R programming languages.
The MLOps agent communicates with the model via a spooler (i.e., file system, GCP Pub/Sub, AWS SQS, or RabbitMQ) and sends model statistics back to the MLOps dashboard. These can include the number of scored records, number of features, scoring time, data drift, and more. You can embed the agent into a Docker image and deploy it on a Kubernetes cluster for scalability and robustness.
You will be asked to choose an existing project or create a new one, as well as to select the compute zone.
Install the Kubernetes command-line tool:
gcloud components install kubectl
Retrieve your Google Cloud service account credentials to call Google Cloud APIs. If you don’t have a default service account, you can create it by following this procedure.
Once credentials are in place, download the JSON file that contains them. Later, when it is time to pass your credentials to the application that will call Google Cloud APIs, you can use one of these methods:
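If you prefer to create a dedicated service account and key from the command line, a minimal sketch follows; the account name (mlops-agent-sa), the key file name (gcp-key.json), and the pubsub.editor role binding are assumptions to adapt to your project:

# Create a service account for the model and the MLOps agent
gcloud iam service-accounts create mlops-agent-sa

# Allow it to publish to and pull from Pub/Sub
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:mlops-agent-sa@$PROJECT_ID.iam.gserviceaccount.com" \
    --role="roles/pubsub.editor"

# Download a JSON key for this service account
gcloud iam service-accounts keys create gcp-key.json \
    --iam-account=mlops-agent-sa@$PROJECT_ID.iam.gserviceaccount.com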
Go to your Google Cloud console Pub/Sub service and create a topic (i.e., a named resource where publishers can send messages).
Create a subscription—a named resource representing the stream of messages from a single, specific topic, to be delivered to the subscribing application. Use the Pub/Sub topic from the previous step and set Delivery type to Pull. This provides a Subscription ID.
Additionally, you can configure message retention duration and other parameters.
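If you prefer the gcloud command line to the console, you can create the topic and pull subscription as follows; the resource names mlops-spooler-topic and mlops-spooler-sub are placeholders:

# Create the topic the model will publish monitoring records to
gcloud pubsub topics create mlops-spooler-topic

# Create a pull subscription for the MLOps agent; note the Subscription ID
gcloud pubsub subscriptions create mlops-spooler-sub \
    --topic=mlops-spooler-topic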
Although this step is technically optional, it is best practice to always test your image locally; doing so saves time and network bandwidth.
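For example, assuming the agent Dockerfile is in the current directory and the image tag mlops-agent is only illustrative, a quick local smoke test might look like this:

# Build the agent image and run it locally before pushing it to the registry
docker build -t mlops-agent:latest .
docker run --rm mlops-agent:latest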
The monitoring agent tarball includes the Python library (along with Java and R libraries) needed to send statistics from the custom Python model back to MLOps. You can find these libraries in the lib directory.
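To make the library available to your scoring script, install the Python wheel shipped in the tarball; the exact wheel file name depends on the agent version, so the glob below is only a sketch:

# Install the DataRobot MLOps Python library from the agent tarball
pip install lib/datarobot_mlops-*.whl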
You will need the JSON file with credentials that you downloaded in the prerequisites (the step that describes downloading Google Cloud account credentials).
The following is an example of the Python code where your model is scored (package import statements other than the MLOps library are omitted from this example):
from datarobot_mlops.mlops import MLOps

DEPLOYMENT_ID = "EXTERNAL-DEPLOYMENT-ID-DEFINED-AT-STEP-1"
MODEL_ID = "EXTERNAL-MODEL-ID-DEFINED-AT-STEP-1"
PROJECT_ID = "YOUR-GOOGLE-PROJECT-ID"
TOPIC_ID = "YOUR-PUBSUB-TOPIC-ID-DEFINED-AT-STEP-2"

# MLOPS: initialize the MLOps instance
mlops = MLOps() \
    .set_deployment_id(DEPLOYMENT_ID) \
    .set_model_id(MODEL_ID) \
    .set_pubsub_spooler(PROJECT_ID, TOPIC_ID) \
    .init()

# Read your custom model pickle file (model has been trained outside DataRobot)
model = pd.read_pickle('custom_model.pickle')

# Read scoring data
features_df_scoring = pd.read_csv('features.csv')

# Get predictions
start_time = time.time()
predictions = model.predict_proba(features_df_scoring)
predictions = predictions.tolist()
num_predictions = len(predictions)
end_time = time.time()

# MLOPS: report the number of predictions in the request and the execution time
mlops.report_deployment_stats(num_predictions, end_time - start_time)

# MLOPS: report the features and predictions
mlops.report_predictions_data(features_df=features_df_scoring, predictions=predictions)

# MLOPS: release MLOps resources when finished
mlops.shutdown()
Set the GOOGLE_APPLICATION_CREDENTIALS environment variable:
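# Point the variable at the JSON key file you downloaded (the path below is illustrative)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/gcp-key.json"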
After you have tested and validated the container image locally, upload it to a registry so that your Google Kubernetes Engine (GKE) cluster can download and run it.
Configure the Docker command-line tool to authenticate to Container Registry:
gcloud auth configure-docker
Push the Docker image you built to the Container Registry:
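# Tag the local image for Container Registry and push it (the image name and tag are illustrative)
docker tag mlops-agent:latest gcr.io/$PROJECT_ID/mlops-agent:latest
docker push gcr.io/$PROJECT_ID/mlops-agent:latest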
After storing the Docker image in the Container Registry, you next create a GKE cluster, as follows:
Set your project ID and Compute Engine zone options for the gcloud tool:
gcloud config set project $PROJECT_ID
gcloud config set compute/zone europe-west1-b
Create a cluster.
Note
This example, for simplicity, creates a private cluster with unrestricted access to the public endpoint. For security, be sure to restrict access to the control plane for your production environment. Find detailed information about configuring different GKE private clusters here.
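With those caveats in mind, a minimal sketch of creating a private cluster with an accessible public endpoint follows; the cluster name and control-plane CIDR range are placeholders, and you should adjust the node count and machine type for your workload:

gcloud container clusters create mlops-agent-cluster \
    --num-nodes=1 \
    --enable-ip-alias \
    --enable-private-nodes \
    --master-ipv4-cidr=172.16.0.0/28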
The MLOps agent running on a GKE private cluster needs access to the DataRobot MLOps service. To enable this, you must give the private nodes outbound access to the internet, which you can achieve with a Cloud NAT router (Google documentation here).
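For example, a Cloud NAT configuration on the default network in the cluster's region might look like the following; the router and NAT names are placeholders:

# Create a Cloud Router in the cluster's region
gcloud compute routers create mlops-router \
    --network=default \
    --region=europe-west1

# Add a NAT configuration so private nodes can reach the internet
gcloud compute routers nats create mlops-nat \
    --router=mlops-router \
    --region=europe-west1 \
    --auto-allocate-nat-external-ips \
    --nat-all-subnet-ip-ranges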
With the cloud router configured, you can now create K8s ConfigMaps to contain the MLOps agent configuration and Google credentials. You will need the downloaded JSON credentials file created during the prerequisites stage.
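For example (the ConfigMap names, the agent configuration file name mlops.agent.conf.yaml, and the key file name are assumptions; use the names from your own setup):

# ConfigMap holding the MLOps agent configuration
kubectl create configmap mlops-agent-config --from-file=mlops.agent.conf.yaml

# ConfigMap holding the Google service account key
kubectl create configmap gcp-credentials --from-file=gcp-key.json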
Note
For production use, store your configuration files in K8s Secrets.
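For example, the credentials ConfigMap shown above could instead be created as a Secret:

# Store the Google service account key as a Secret instead of a ConfigMap
kubectl create secret generic gcp-credentials --from-file=gcp-key.json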