MLOps management agent

Availability information

The MLOps management agent is off by default. Contact your DataRobot representative or administrator for information on enabling the feature for DataRobot MLOps.

Feature flag: Enable MLOps management agent

Now available as a public preview feature, the management agent provides a standard mechanism for automating model deployment to any type of infrastructure. It pairs automated deployment with automated monitoring to ease the burden of managing remote models in production, especially when you use critical MLOps features such as challengers and retraining. The agent, which you access from the DataRobot application, ships with an assortment of plugins that support custom configuration.

Management agent setup

To set up the management agent, complete the following steps, each detailed below:

  • Register a prediction environment
  • Download the agent tarball
  • Select an environment plugin
  • Configure the management agent
  • Create a deployment

Register a prediction environment

You can use the management agent to automate the deployment, replacement, and monitoring of models in a prediction environment. Management agent setup begins with configuring a prediction environment to use with deployments, so before proceeding, register that prediction environment with DataRobot.

Once registered, navigate to Deployments > Prediction Environments. Select the prediction environment to use from the list and toggle on Use Management Agent.

Once enabled, you must specify the email address of the management agent service account holder. DataRobot recommends using an administrative service account as the account holder (that is, an account with access to every deployment that uses the configured prediction environment).

Download the agent

Access the management agent by downloading the MLOps agent tarball and installing it on the remote environment that hosts the models you use to make predictions. You can download it directly from the DataRobot application by clicking your user icon and navigating to Developer Tools. Under the External Monitoring Agent header, click the download icon. The tarball appears in your browser's downloads bar when the download completes.

Select an environment plugin

The tarball includes configurable environment plugins that support the various types of infrastructure used to host remote models. The management agent translates deployment events (model replacement, deployment launch, etc.) into processes that the plugin runs in response to each event. Choose and configure the plugin that best supports your infrastructure. Advanced users can also create plugins, either completely customized or by using the provided plugins as a starting point.
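
The translation from deployment events to plugin actions can be pictured with a small sketch. This is an illustration only, not the agent's actual plugin interface: the event names, the EchoPlugin class, and the dispatch function are all hypothetical, and the README files in the tarball describe the real contract.

```python
from typing import Callable, Dict, List

# Hypothetical event names; the real agent's event vocabulary may differ.
LAUNCH = "deployment_launch"
REPLACE_MODEL = "model_replacement"
STOP = "deployment_stop"


class EchoPlugin:
    """Toy plugin that records the action it would run for each event."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def start(self, deployment_id: str) -> None:
        self.log.append(f"start {deployment_id}")

    def replace_model(self, deployment_id: str) -> None:
        self.log.append(f"replace {deployment_id}")

    def stop(self, deployment_id: str) -> None:
        self.log.append(f"stop {deployment_id}")


def dispatch(plugin: EchoPlugin, event_type: str, deployment_id: str) -> bool:
    """Route one deployment event to the matching plugin action.

    Returns False when the plugin has no handler for the event type.
    """
    handlers: Dict[str, Callable[[str], None]] = {
        LAUNCH: plugin.start,
        REPLACE_MODEL: plugin.replace_model,
        STOP: plugin.stop,
    }
    handler = handlers.get(event_type)
    if handler is None:
        return False
    handler(deployment_id)
    return True
```

A custom plugin, in this picture, would supply its own start, replace_model, and stop behavior for the target infrastructure while the routing stays the same.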

Tip

The tarball includes README files to help with the installation and configuration of the plugins.

Kubernetes plugin

For Kubernetes users, DataRobot supplies a plugin that allows you to deploy and manage models in your cluster without having to write any code. For configuration information, see the README file in the tools/charts/datarobot-management-agent folder in the tarball.

Docker plugin

The management agent supports the Portable Prediction Server (PPS) via the Docker plugin supplied in the agent tarball. The plugin allows a single Docker container to serve multiple models. It also lets you configure the PPS to indicate where the models for each deployment are located, start or stop deployments, and manage them as you would with any other plugin.

The Docker plugin can:

  • Retrieve a model package from DataRobot for deployment.
  • Launch the DataRobot model within the Docker container.
  • Shut down and clean up the Docker container.
  • Report status back via events.
  • Monitor predictions using the MLOps agent.

Configure the agent

After downloading the tarball and configuring an agent plugin, edit the agent's config file:

  • Provide access to DataRobot by supplying your API key and DataRobot username.
  • Specify the prediction environment name so the management agent can access it and any associated deployments.
  • Specify the management agent plugin to use.
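
As a rough pre-flight check, the sketch below verifies that the three kinds of settings listed above are present before the agent is started. The key names (api_key, username, prediction_environment, plugin) are hypothetical placeholders, not the agent's real configuration schema; consult the README files in the tarball for the actual field names and file format.

```python
from typing import Dict, List, Optional

# Hypothetical key names, for illustration only. The agent's real
# configuration file defines its own schema (see the tarball README).
REQUIRED_KEYS = ("api_key", "username", "prediction_environment", "plugin")


def missing_settings(config: Dict[str, Optional[str]]) -> List[str]:
    """Return the required keys that are absent or blank in config."""
    return [key for key in REQUIRED_KEYS if not str(config.get(key) or "").strip()]


example = {
    "api_key": "<your-api-key>",
    "username": "service-account@example.com",
    "prediction_environment": "my-prediction-environment",
    "plugin": "",  # no plugin chosen yet
}
print(missing_settings(example))  # prints ['plugin']
```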

Create a deployment

After configuring the prediction environment and the management agent for use, you can create an external deployment with events monitored by the agent. The deployment must use the prediction environment configured in the steps above in order to support the agent's monitoring functionality. To do so, DataRobot recommends registering an external model package and deploying it from the Model Registry.

Once deployed, you have a deployment fully configured with the management agent, capable of monitoring deployment events and automating actions in response to those events.

Overview of deployment events

The management agent sends periodic updates about deployment health and status via the API. These are reported as MLOps events and are listed on the Service Health page.

Once an external deployment is set up with the management agent, DataRobot allows you to monitor and work with its deployment events. From one place, you can:

Action                                          Example use case
Record and persist deployment-related events    Record deployment actions, health changes, state changes, etc.
View all related events                         Auditing deployment events
Filter and search events                        View all model changes
Extract data                                    Reporting and offline storage
Receive notification of certain incidents       Receive a Slack message for an outage
Enforce a retention policy                      90 days of retention guaranteed, but older events may be purged
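
To make the filtering and retention rows concrete, here is a minimal sketch that works on event records already extracted for offline use. The Event fields and the event_type labels are assumptions made for illustration, not DataRobot's event schema, and the sketch simplifies the retention policy by treating 90 days as a hard cutoff, whereas the policy only says older events may be purged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

RETENTION = timedelta(days=90)  # guaranteed retention window from the table


@dataclass
class Event:
    timestamp: datetime
    event_type: str  # hypothetical label, e.g. "model_change"
    message: str


def filter_events(
    events: List[Event],
    now: datetime,
    event_type: Optional[str] = None,
) -> List[Event]:
    """Return events inside the retention window, newest first.

    When event_type is given, only events of that type are kept.
    """
    cutoff = now - RETENTION
    kept = [
        e
        for e in events
        if e.timestamp >= cutoff
        and (event_type is None or e.event_type == event_type)
    ]
    return sorted(kept, key=lambda e: e.timestamp, reverse=True)
```

Calling filter_events(events, now, "model_change") would correspond to the "View all model changes" use case under these assumptions.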

To view an overview of deployment events, select the deployment from the inventory and navigate to the Service Health tab. All events are recorded under the Recent Management Agent Activity section.

The most recent events are listed at the top of the list. Each event shows the time it occurred, a description, and an icon indicating its status:

Icon               Description
Green / Passing    No action needed
Yellow / At risk   Concerns found but no immediate action needed; continue monitoring
Red / Failing      Immediate action needed
Gray / Unknown     Unknown
Informational      Details a deployment action (e.g., the deployment has launched)

Note that the management agent's most recently reported service health status takes priority. For example, if data drift is green and passing on a deployment, but the management agent reports a worse status (red and failing), the list updates to reflect that condition.
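
One way to read this rule is sketched below: when the management agent has reported a status, that status is the one shown; otherwise the status from other monitoring is used. The Status enum and the displayed_status function are hypothetical names, and the sketch is an interpretation of the note above rather than DataRobot's actual logic.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    PASSING = "green"
    AT_RISK = "yellow"
    FAILING = "red"
    UNKNOWN = "gray"


def displayed_status(other_status: Status, agent_status: Optional[Status]) -> Status:
    """Prefer the agent's most recently reported status when one exists."""
    return other_status if agent_status is None else agent_status


# Data drift is passing, but the agent most recently reported a failure.
print(displayed_status(Status.PASSING, Status.FAILING).name)  # prints FAILING
```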

Select an event row to view its details in the panel on the right.


Updated November 5, 2021