June 14, 2021
The DataRobot MLOps v7.1 release includes many new features and capabilities, described below.
Release v7.1 provides updated UI string translations for the following languages:
Introducing Pricing 5.0¶
Pricing 5.0 is the newest plan available to DataRobot users. With this plan, a number of capabilities supporting DataRobot MLOps are introduced:
Each user or organization has a set number of active deployments they can have at one time. The limit is displayed in the Deployment Inventory status tiles. Pricing 5.0 users can filter the deployment inventory by active or inactive deployments.
Users who built models in AutoML can download Scoring Code for the model via the model Leaderboard without engaging in the deployment workflow. Previously, downloading Scoring Code made the associated deployment a permanent fixture. Now, these deployments can be deactivated or deleted. Additionally, users can choose to include prediction explanations with their Scoring Code download.
New features and enhancements¶
See details of new deployment features below:
- Now GA: Improved monitoring support for multiclass deployments
- Automatic actuals feedback for time series deployments
- Now GA: Use challenger models with external deployments
The following new deployment features are currently in public beta. Contact your DataRobot representative for information on enabling them:
- Deployment reports
- Reset deployment statistics
- The management agent
- Baseline revisions for external models
New prediction features¶
See details of new prediction features below:
- Batch prediction cloud connectors
The following new prediction features are currently in public beta. Contact your DataRobot representative for information on enabling them:
- Scoring Code in Snowflake
- Include prediction explanations in Scoring Code
- Batch prediction job definitions and scheduling
- MLOps agent: Kafka
- Portable batch predictions
- Batch prediction Parquet support
- Improved batch predictions for custom models
New model registry features¶
See details of new model registry features below:
- Upload environments as prebuilt images
- Now GA: Integrate a Bitbucket Server or GitHub enterprise repository with custom inference models
- Now GA: Custom Inference Anomaly Detection
New governance features¶
See details of new governance features below:
- Feature lists added to governance metadata
New deployment features¶
Release v7.1 introduces the following generally available deployment features.
Improved monitoring support for multiclass deployments¶
Now generally available, multiclass deployments have additional monitoring support. Multiclass deployments offer class-based configuration to modify the data displayed on the Accuracy and Data Drift graphs. Use the class selector to display the desired classes for a deployment. DataRobot provides quick-select shortcuts for classes: the five classes most common in the training data, the five with the lowest accuracy score, and the five with the greatest amount of data drift. Once specified, the charts on the tab (Accuracy or Data Drift) update to display the selected classes.
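The quick-select logic can be sketched in plain Python; the per-class metrics below are made-up values for illustration, not anything DataRobot returns:

```python
# Hypothetical per-class monitoring metrics for a multiclass deployment.
metrics = {
    "cat":   {"accuracy": 0.91, "drift": 0.10},
    "dog":   {"accuracy": 0.87, "drift": 0.25},
    "bird":  {"accuracy": 0.66, "drift": 0.40},
    "fish":  {"accuracy": 0.95, "drift": 0.05},
    "horse": {"accuracy": 0.72, "drift": 0.33},
    "mouse": {"accuracy": 0.80, "drift": 0.18},
}

# Quick-select shortcuts: the five classes with the lowest accuracy
# score, and the five with the greatest data drift.
lowest_accuracy = sorted(metrics, key=lambda c: metrics[c]["accuracy"])[:5]
most_drift = sorted(metrics, key=lambda c: metrics[c]["drift"], reverse=True)[:5]

print(lowest_accuracy)  # ['bird', 'horse', 'mouse', 'dog', 'cat']
print(most_drift)       # ['bird', 'horse', 'dog', 'mouse', 'cat']
```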
Automatic actuals feedback for time series deployments¶
Time series deployments that specify an association ID can enable the automatic submission of actuals, so that you do not need to submit them manually via the UI or API. Once enabled, actuals are extracted from the data used to generate predictions. Because each forecast request includes historical data, DataRobot can extract an actual value for a given date as each request is sent: the history in a new request serves as the actual values for the previous prediction request.
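A minimal sketch of that extraction idea in plain Python; the function, field names, and association-ID convention here are illustrative assumptions, not the DataRobot client API:

```python
from datetime import date

def extract_actuals(history_rows, last_forecast_point):
    """Pair historical values in a new prediction request with the
    association IDs of an earlier request's forecasts.

    history_rows: the known history sent alongside a forecast request,
    as dicts with "series_id", "date", and "value" keys (illustrative).
    last_forecast_point: the forecast point of the previous request;
    history on or after it covers dates that were previously predicted.
    """
    actuals = []
    for row in history_rows:
        if row["date"] >= last_forecast_point:
            actuals.append({
                # Association ID convention: series + date (an assumption).
                "association_id": f'{row["series_id"]}-{row["date"].isoformat()}',
                "actual_value": row["value"],
            })
    return actuals

history = [
    {"series_id": "store1", "date": date(2021, 6, 1), "value": 118.0},
    {"series_id": "store1", "date": date(2021, 6, 2), "value": 121.5},
]
print(extract_actuals(history, date(2021, 6, 2)))
```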
Challenger models now available for external deployments¶
Deployments in remote prediction environments can now use the Challengers tab. A remote model can serve as the champion, and you can compare it against DataRobot and custom model challengers. If a challenger outperforms the champion, you can promote that DataRobot or custom model and deploy the new champion to your remote prediction environment.
New public beta deployment features¶
Release v7.1 introduces the following public beta deployment features.
Deployment reports¶
You can now generate a deployment report on demand, detailing essential information about a deployment's status, such as insights about service health, data drift, and accuracy statistics (among many other details). Additionally, you can create a report schedule that acts as a policy to automatically generate deployment reports based on the defined conditions (frequency, time, and day). When the policy triggers, a new report is generated and DataRobot sends an email notification to those who have access to the deployment.
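A sketch of such a schedule policy as data; the field names are assumptions for illustration, not the exact DataRobot API schema:

```python
# Illustrative report-schedule policy: generate a weekly report every
# Monday at 09:00 covering service health, data drift, and accuracy.
report_schedule = {
    "frequency": "weekly",
    "day_of_week": "monday",
    "time": "09:00",
    "sections": ["service_health", "data_drift", "accuracy"],
}

def matches(policy, day_of_week, hhmm):
    """Return True when a timestamp satisfies the schedule policy."""
    return policy["day_of_week"] == day_of_week and policy["time"] == hhmm

print(matches(report_schedule, "monday", "09:00"))   # True
print(matches(report_schedule, "tuesday", "09:00"))  # False
```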
Reset deployment statistics¶
Deployments now support the deletion of monitoring data by model or time range. This action is governed by the approval workflow to safeguard against accidental deletion. This feature allows you to remove monitoring data that was sent inadvertently or generated during integration testing of a deployed model.
The management agent¶
DataRobot is introducing the management agent, which understands the state of a deployment and can automate the tasks of retrieving artifacts, deploying models, and replacing them externally. The agent is extensible to support a variety of use cases across model formats and prediction environments. Administrators can configure the management agent in their prediction environments to automate the deployment and replacement of models based on user actions within MLOps. It pairs easily with the MLOps agent to automatically monitor models and integrate them with additional MLOps functionality such as challenger models. The management agent is a tool for standardizing and automating model deployment.
Baseline revisions for external models¶
Binary classification models deployed to remote environments can now be registered with Holdout data, allowing external deployments to calculate additional drift and accuracy baselines previously only available to models built with DataRobot AutoML. When monitoring accuracy, you can now compare the current accuracy calculation to the baseline at the time of training the model. Additionally, target drift now supports more detailed drift analysis using prediction values prior to the application of the prediction threshold.
New prediction features¶
Release v7.1 introduces the following generally available prediction features.
Batch prediction cloud connectors¶
The batch prediction API now supports connectors specific to Snowflake and Azure Synapse for the ingest and export of data while scoring. The use of JDBC to transfer data can be costly in terms of input/output operations per second (IOPS) and expenses for data warehouses. This adapter reduces the load on database engines during prediction scoring by using cloud storage and bulk insert to create a hybrid JDBC-cloud storage solution.
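A sketch of the hybrid JDBC/cloud-storage pattern using batch prediction settings in the style of the DataRobot Python client; the option keys and values here are illustrative assumptions, so verify them against the client documentation before use:

```python
# Intake and output settings for a Snowflake-adapter batch prediction
# job. The hybrid pattern stages data in cloud storage and uses bulk
# insert, avoiding row-by-row JDBC traffic. Names and IDs are
# placeholders; the "snowflake" adapter type and key names are
# assumptions to check against the batch prediction API docs.
intake_settings = {
    "type": "snowflake",
    "table": "SCORING_INPUT",
    "schema": "PUBLIC",
    "external_stage": "MY_S3_STAGE",       # cloud-storage staging area
    "data_store_id": "datastore-id-here",  # saved Snowflake connection
}
output_settings = {
    "type": "snowflake",
    "table": "SCORING_OUTPUT",
    "schema": "PUBLIC",
    "external_stage": "MY_S3_STAGE",
    "statement_type": "insert",            # bulk insert into the table
}

# With the datarobot client installed, scoring would look roughly like:
# import datarobot as dr
# dr.BatchPredictionJob.score("deployment-id",
#                             intake_settings=intake_settings,
#                             output_settings=output_settings)
print(intake_settings["type"], output_settings["statement_type"])
```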
New public beta prediction features¶
Release v7.1 introduces the following public beta prediction features.
Scoring Code in Snowflake¶
DataRobot Scoring Code now supports execution directly inside of Snowflake using Snowflake’s new Java UDF functionality. This capability removes the need to extract and load data from Snowflake, resulting in a much faster route to scoring large datasets. The Portable Predictions tab for deployments has been tailored to enable this functionality when the deployment is created in a Snowflake prediction environment.
Include prediction explanations in Scoring Code¶
You can now receive prediction explanations anywhere you deploy a model: in DataRobot, with the Portable Prediction Server, and now in Java Scoring Code. Prediction explanations provide a quantitative indicator of the effect variables have on the predictions, answering why a given model made a certain prediction. You can enable prediction explanations on the Portable Predictions tab when downloading a model via Scoring Code.
Batch prediction job definitions and scheduling¶
When making batch predictions for deployments via the Make Predictions tab, you can now create and schedule JDBC and cloud storage prediction jobs directly from the deployment without utilizing the API. Additionally, you can view a history of the prediction jobs that ran. Define the name of the job, the prediction source, configurations, and the prediction destination. All specifications are saved for later use.
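As data, a job definition with a schedule might look like the sketch below. DataRobot's schedule format is cron-like, with list-valued fields (`["*"]` meaning "every"); treat the exact keys as assumptions to verify against the batch prediction documentation:

```python
# Illustrative batch prediction job definition with a nightly schedule.
# All names, URLs, and keys are placeholders for illustration.
job_definition = {
    "name": "nightly-scoring",
    "enabled": True,
    "schedule": {
        "minute": [0],
        "hour": [2],            # run at 02:00
        "day_of_month": ["*"],
        "month": ["*"],
        "day_of_week": ["*"],
    },
    "intake_settings": {"type": "s3", "url": "s3://bucket/in.csv"},
    "output_settings": {"type": "s3", "url": "s3://bucket/out.csv"},
}
print(job_definition["schedule"]["hour"])  # [2]
```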
New MLOps agent channel: Kafka¶
The MLOps agent now supports Kafka as a channel, in addition to the previously supported channels: file system, AWS SQS, Google Pub/Sub, and RabbitMQ. The agent can now be easily deployed to many prediction environments, and Kafka support eliminates the need for additional queuing services.
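An agent channel configuration for Kafka might look like the fragment below; the key names follow the agent's YAML conventions but should be treated as assumptions and checked against the example config shipped with the agent:

```yaml
# mlops.agent.conf.yaml (fragment) -- illustrative only
channelConfigs:
  - type: "KAFKA_SPOOL"        # assumed channel type name
    details:
      name: "kafkaChannel"
      topicName: "mlops-monitoring"
      bootstrapServers: "kafka-broker-1:9092,kafka-broker-2:9092"
```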
Portable batch predictions¶
The Portable Prediction Server can now be paired with an additional container to orchestrate batch prediction jobs using file storage, JDBC, and cloud storage. You no longer need to manually manage large-scale batching of predictions when using the Portable Prediction Server. Additionally, large batch prediction jobs can be colocated at or near the data, or run in environments behind firewalls without access to the public internet.
Batch prediction Parquet support¶
The batch prediction API has been enhanced to support Parquet-formatted files for both ingest and output. Parquet file format support removes the need to implement an additional conversion step in prediction pipelines.
Improved batch predictions for custom models¶
For custom model deployments, batch prediction replica configuration enhances performance and stabilizes large prediction jobs.
New model registry features¶
Release v7.1 introduces the following generally available model registry features.
Upload environments as prebuilt images¶
You can now upload environments for custom models as prebuilt images. This image is a Docker image saved as a tarball in .tar, .gz, or .tgz format. If you provide a prebuilt image, you do not need to provide a context file for the environment (the tarball archive containing the Dockerfile and any other relevant files). If you supply a prebuilt image or build an environment with a context file, you can then download the built environment image as a .tar file.
GitHub Enterprise and Bitbucket Server integration for custom models¶
Users can now register GitHub Enterprise and Bitbucket Server repositories in the Model Registry to pull artifacts into DataRobot and build custom inference models. Integrating either of these repositories allows you to directly transfer between a governed, code-centric machine learning development environment and a governed MLOps environment.
Custom inference anomaly detection models¶
Now generally available, you can create a custom inference model for anomaly detection problems. When creating a custom model, you can select "Anomaly Detection" as the target type. Additionally, you can access the DRUM template for anomaly detection models. For deployed custom inference anomaly detection models, note that the following functionality is not supported:
- Data drift
- Accuracy and association IDs
- Challenger models
- Humility rules
- Prediction intervals
New public beta governance features¶
Release v7.1 introduces the following public beta features.
Feature lists added to governance metadata¶
The Model Registry and deployments have been enhanced to allow you to view a model's feature list and feature importance. You can now access this metadata without navigating back into the original modeling project to understand the full list of features for your model.