February 22, 2023
This page provides announcements of newly released features available in DataRobot's SaaS single- and multi-tenant AI Platform, with links to additional resources. With the February deployment, DataRobot's AI Platform delivered the following new GA and Public Preview features. From the release center you can also access:
Features grouped by capability
Quick Autopilot improvements now available for time series¶
With this month’s release, Quick Autopilot has been streamlined for time series projects, speeding experimentation. To maximize runtime efficiency, the new version of Quick no longer automatically generates and fits the DR Reduced Features list, as fitting requires retraining models. Models are still trained at the maximum sample size for each backtest, defined by the project’s date/time partitioning. The specific number of models run varies by project and target type. See the documentation on the model recommendation process for alternate methods to build a reduced feature list.
Retraining Combined Models now faster¶
Now generally available, time series segmented models support retraining on the same feature list and blueprint as the original model without the need to rerun Autopilot or feature reduction. Previously, rerunning Autopilot was the only way to retrain this model type. This improvement creates parity between retraining a non-segmented time series model and retraining a segmented model. Because retraining leverages the feature reduction computations from the original model, only newly introduced features need to go through that process, saving time and adding flexibility. Note that retraining retrains the champion of each segment; it does not rerun the project and select a new champion.
Python and Java Scoring Code snippets¶
Now generally available, DataRobot allows you to use Scoring Code via Python and Java. Although the underlying Scoring Code is based on Java, DataRobot now provides the DataRobot Prediction Library to make predictions using the various prediction methods supported by DataRobot via a Python API. The library provides a common interface for making predictions, making it easy to swap out any underlying implementation. Access Scoring Code for Python and Java from a model in the Leaderboard or from a deployed model that supports Scoring Code.
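The "common interface" idea can be sketched in Python. The class and method names below are illustrative only, not the library's actual API; the point is that callers code against one predict() signature and can swap the backing implementation (for example, a local Scoring Code model versus a remote prediction server) without changing calling code:

```python
from abc import ABC, abstractmethod

class PredictionImplementation(ABC):
    """Hypothetical common interface: every backend exposes predict()."""
    @abstractmethod
    def predict(self, rows: list) -> list:
        ...

class LocalStub(PredictionImplementation):
    """Stand-in for a real backend; returns a constant score per row."""
    def predict(self, rows):
        return [0.5 for _ in rows]

def score(impl: PredictionImplementation, rows):
    # The caller is agnostic to which backend produced the predictions.
    return impl.predict(rows)

print(score(LocalStub(), [{"a": 1}, {"a": 2}]))  # [0.5, 0.5]
```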
Export deployment data¶
Now generally available, on a deployment’s Data Export tab, you can export stored training data, prediction data, and actuals to compute and monitor custom business or performance metrics on the Custom Metrics tab or outside DataRobot. You can export the available deployment data for a specified model and time range. To export deployment data, make sure your deployment stores prediction data, generate data for the required time range, and then view or download that data.
The initial release of the deployment data export feature enforces some row count limitations. For details, review the considerations in the feature documentation.
For more information, see the Data Export tab documentation.
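As a sketch of working with exported data outside DataRobot, the snippet below filters an exported prediction CSV down to a time range using only the standard library. The column names and CSV layout are assumptions for illustration, not the export's actual schema:

```python
import csv
import io
from datetime import datetime

def rows_in_range(csv_text, start, end, ts_column="timestamp"):
    """Keep exported prediction rows whose timestamp falls in [start, end)."""
    out = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        ts = datetime.fromisoformat(row[ts_column])
        if start <= ts < end:
            out.append(row)
    return out

# Illustrative export contents.
export = """timestamp,prediction
2023-02-01T00:00:00,0.41
2023-02-10T12:00:00,0.77
2023-03-01T00:00:00,0.55
"""

feb = rows_in_range(export, datetime(2023, 2, 1), datetime(2023, 3, 1))
print(len(feb))  # rows predicted during February: 2
```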
Create custom metrics¶
Now generally available, on a deployment's Custom Metrics tab, you can use the data you collect from the Data Export tab (or data calculated through other custom metrics) to compute and monitor up to 25 custom business or performance metrics. After you add a metric and upload data, a configurable dashboard visualizes a metric’s change over time and allows you to monitor and export that information. This feature enables you to implement your organization's specialized metrics to expand on the insights provided by DataRobot's built-in Service Health, Data Drift, and Accuracy metrics.
The initial release of the custom metrics feature enforces some row count and file size limitations. For details, review the considerations in the feature documentation.
For more information, see the Custom Metrics tab documentation.
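A custom business metric computed from exported data might look like the sketch below, which totals misclassification cost from (score, actual) pairs such as could be assembled from the prediction and actuals exports. The cost values, threshold, and record layout are illustrative assumptions:

```python
def prediction_cost(records, fp_cost=100.0, fn_cost=500.0, threshold=0.5):
    """Example custom metric: total cost of misclassified predictions.

    records: iterable of (model score, observed actual 0/1) pairs.
    fp_cost/fn_cost/threshold are illustrative business assumptions.
    """
    total = 0.0
    for score, actual in records:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 0:
            total += fp_cost   # false positive
        elif predicted == 0 and actual == 1:
            total += fn_cost   # false negative
    return total

print(prediction_cost([(0.9, 1), (0.8, 0), (0.2, 1), (0.1, 0)]))  # 600.0
```

A metric like this, computed per time bucket, is the kind of value you could upload and track on the Custom Metrics dashboard.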
Drill down on the Data Drift tab¶
Now generally available on the Data Drift tab, the new Drill Down visualization tracks the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. The drift away from the baseline established with the training dataset is measured using the Population Stability Index (PSI). As a model continues to make predictions on new data, the change in the drift status over time is visualized as a heat map for each tracked feature. This heat map can help you identify data drift and compare drift across features in a deployment to identify correlated drift trends:
In addition, you can select one or more features from the heat map to view a Feature Drift Comparison chart, comparing the change in a feature's data distribution between a reference time period and a comparison time period to visualize drift. This information helps you identify the cause of data drift in your deployed model, including data quality issues, changes in feature composition, or changes in the context of the target variable:
For more information, see the Drill down on the Data Drift tab documentation.
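PSI itself is a standard drift statistic. Below is a minimal Python sketch for one numeric feature; the bin count and epsilon are conventional choices for illustration, not necessarily DataRobot's exact settings:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline (training) sample
    and a production sample of one numeric feature.

    Bin edges come from the baseline's range; a small epsilon avoids
    log-of-zero for empty bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        eps = 1e-6
        return [max(c / len(sample), eps) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

baseline = [i / 100 for i in range(100)]         # uniform on [0, 1)
drifted = [0.5 + i / 200 for i in range(100)]    # shifted upward
print(round(psi(baseline, drifted), 3))          # large positive value
```

Identical samples yield a PSI of 0; the larger the value, the further production data has drifted from the training baseline.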
Monitor deployment data processing¶
Now generally available, the Usage tab reports on prediction data processing for the Data Drift and Accuracy tabs. Monitoring a deployed model’s data drift and accuracy is a critical task to ensure that the model remains effective; however, it requires processing large amounts of prediction data and can be subject to delays or rate limiting. The information on the Usage tab can help your organization identify these data processing issues. The Prediction Tracking chart, a bar chart of the prediction processing status over the last 24 hours or 7 days, tracks the number of processed, rate-limited, and missing association ID prediction rows:
On the right side of the page are the processing delays for Predictions Processing (Champion) and Actuals Processing (the delay in actuals processing is for ALL models in the deployment):
For more information, see the Usage tab documentation.
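Tallying prediction rows by processing status, as the Prediction Tracking chart does, can be sketched as follows; the status labels and row shape here are illustrative, not DataRobot's internal representation:

```python
from collections import Counter

def prediction_tracking_summary(rows):
    """Count prediction rows per processing status, mirroring the
    categories the Prediction Tracking chart reports."""
    counts = Counter(status for _, status in rows)
    return {s: counts.get(s, 0)
            for s in ("processed", "rate_limited", "missing_association_id")}

rows = [("r1", "processed"), ("r2", "processed"),
        ("r3", "rate_limited"), ("r4", "missing_association_id")]
print(prediction_tracking_summary(rows))
```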
Deployment creation workflow redesign¶
Now generally available, the redesigned deployment creation workflow provides a better organized and more intuitive interface. Regardless of where you create a new deployment (the Leaderboard, the Model Registry, or the Deployments inventory), you are directed to this new workflow. The new design clearly outlines the capabilities of your current deployment based on the data provided, grouping the settings and capabilities logically and providing immediate confirmation when you enable a capability, or guidance when you’re missing required fields or settings. A new sidebar provides details about the model being used to make predictions for your deployment, in addition to information about the deployment review policy, deployment billing details (depending on your organization settings), and a link to the deployment information documentation.
For more information, see the Configure a deployment documentation.
Connect to Snowflake using external OAuth¶
Now generally available, Snowflake users can set up a Snowflake data connection in DataRobot using an external identity provider (IdP), either Okta or Azure Active Directory, for user authentication through OAuth single sign-on (SSO).
For more information, see the Snowflake External OAuth documentation.
Add custom logos to No-Code AI Apps¶
Now generally available, you can add a custom logo to your No-Code AI Apps, allowing you to keep the branding of the AI App consistent with that of your company before sharing it either externally or internally.
To upload a new logo, open the application you want to edit and click Build. Under Settings > Configuration Settings, click Browse and select a new image, or drag-and-drop an image into the New logo field.
For more information, see the No-Code AI App documentation.
Multiclass support in No-Code AI Apps¶
No-Code AI Apps now support multiclass classification deployments across all three template types—Predictor, Optimizer, and What-If. This gives users the ability to create applications that solve a broader range of business problems.
Sliced insights show a subpopulation of model data¶
Now available as public preview, slices allow you to define filters on categorical features, numeric features, or both. Viewing and comparing insights based on segments of a project’s data helps you understand how models perform on different subpopulations. Depending on the insight, you can also compare a slice against the “global” slice (all training data). Configuring a slice allows you to choose a feature and set operators and values to narrow the data returned.
Sliced insights are available for Lift Chart, ROC Curve, Residual, and Feature Impact visualizations.
Required feature flag: Enable Sliced Insights
Public preview documentation.
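The filtering a configured slice performs can be sketched in Python; the operator names and row layout below are illustrative assumptions, not the feature's actual configuration format:

```python
import operator

# Map operator symbols to comparison functions.
OPS = {"==": operator.eq, "!=": operator.ne,
       "<": operator.lt, "<=": operator.le,
       ">": operator.gt, ">=": operator.ge}

def apply_slice(rows, filters):
    """Keep rows matching every (feature, operator, value) filter,
    mimicking how a slice narrows the data an insight is computed on."""
    def matches(row):
        return all(OPS[op](row[feat], val) for feat, op, val in filters)
    return [r for r in rows if matches(r)]

rows = [
    {"state": "CA", "income": 72000},
    {"state": "CA", "income": 38000},
    {"state": "NY", "income": 91000},
]
# A slice combining a categorical filter and a numeric filter.
sliced = apply_slice(rows, [("state", "==", "CA"), ("income", ">", 50000)])
print(len(sliced))  # 1
```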
Period Accuracy allows focus on specific periods in training data¶
Available as public preview for OTV and time series projects, the Period Accuracy insight lets you define periods within your dataset and then compare their metric scores against the metric score of the model as a whole. Periods are defined in a separate CSV file that identifies rows to group based on the project’s date/time feature.
Once the file is uploaded and the insight is calculated, DataRobot provides a table of period-based results and an “over time” histogram for each period.
Required feature flag: Period Accuracy Insight
Public preview documentation.
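Per-period scoring of this kind can be sketched as follows; the record layout, period labels, and choice of metric (mean absolute error) are illustrative assumptions:

```python
def period_scores(records, periods, metric):
    """Score each user-defined period separately plus the model overall.

    records: maps a date string to an (actual, predicted) pair.
    periods: maps a date string to a period label, as a period CSV keyed
             on the project's date/time feature might.
    """
    grouped = {}
    for date, pair in records.items():
        grouped.setdefault(periods.get(date, "unassigned"), []).append(pair)
    scores = {label: metric(pairs) for label, pairs in grouped.items()}
    scores["overall"] = metric(list(records.values()))
    return scores

def mae(pairs):
    """Mean absolute error over (actual, predicted) pairs."""
    return sum(abs(a - p) for a, p in pairs) / len(pairs)

records = {
    "2023-01-01": (10.0, 12.0),
    "2023-01-02": (11.0, 11.0),
    "2023-02-01": (20.0, 15.0),
}
periods = {"2023-01-01": "promo", "2023-01-02": "promo",
           "2023-02-01": "baseline"}
print(period_scores(records, periods, mae))
```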
View Service Health and Accuracy history¶
Now available as a public preview feature, when analyzing a deployment's Service Health and Accuracy, you can view the History tab, which provides critical information about the performance of current and previously deployed models. This tab improves the usability of service health and accuracy analysis, allowing you to view up to five models in one place and on the same scale, making it easier to directly compare model performance.
On a deployment's Service Health > History tab, you can access visualizations representing the service health history of up to five of the most recently deployed models, including the currently deployed model. This history is available for each metric tracked in a model's service health, helping you identify bottlenecks and assess capacity, which is critical to proper provisioning.
On a deployment's Accuracy > History tab, you can access visualizations representing the accuracy history of up to five of the most recently deployed models, including the currently deployed model, allowing you to compare their accuracy directly. These accuracy insights are rendered based on the problem type and its associated optimization metrics.
Required feature flag: Enable Deployment History
Public preview documentation.
Create monitoring job definitions¶
Now available as a public preview feature, monitoring job definitions enable DataRobot to monitor deployments running and storing feature data and predictions outside of DataRobot, integrating deployments more closely with external data sources. For example, you can create a monitoring job to connect to Snowflake, fetch raw data from the relevant Snowflake tables, and send the data to DataRobot for monitoring purposes.
This integration extends the functionality of the existing Prediction API routes for batchPredictions, adding the batch_job_type: monitoring property. This new property allows you to create monitoring jobs. In addition to the Prediction API, you can create monitoring job definitions through the DataRobot UI. You can then view and manage monitoring job definitions as you would any other job definition.
Required feature flag: Monitoring Job Definitions
For more information, see the Prediction monitoring jobs documentation.
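A request body for a batchPredictions-style job in monitoring mode might be assembled as below. Apart from the batch_job_type: monitoring property named above, everything (field names, intake shape, the deployment ID) is a hypothetical assumption for illustration:

```python
import json

def monitoring_job_payload(deployment_id, intake):
    """Assemble an illustrative request body for a monitoring job.

    Only batch_job_type: "monitoring" comes from the release notes;
    the other fields are placeholders, not the documented schema.
    """
    return {
        "deploymentId": deployment_id,
        "batch_job_type": "monitoring",
        "intakeSettings": intake,
    }

payload = monitoring_job_payload(
    "hypothetical-deployment-id",
    {"type": "snowflake", "table": "PREDICTIONS"},  # illustrative intake
)
print(json.dumps(payload, indent=2))
```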
Automate deployment and replacement of Scoring Code in Snowflake¶
Now available as a public preview feature, you can create a DataRobot-managed Snowflake prediction environment to deploy DataRobot Scoring Code in Snowflake. With the Managed by DataRobot option enabled, the model deployed externally to Snowflake has access to MLOps management, including automatic Scoring Code replacement:
Once you've created a Snowflake prediction environment, you can deploy a Scoring Code-enabled model to that environment from the Model Registry:
Required feature flag: Enable the Automated Deployment and Replacement of Scoring Code in Snowflake
Public preview documentation.
Define runtime parameters for custom models¶
Now available as a public preview feature, you can add runtime parameters to a custom model through the model metadata, making your custom model code easier to reuse. To define runtime parameters, you can add the following keys to each entry under runtimeParameterDefinitions:

| Key | Description |
|-----|-------------|
| fieldName | The name of the runtime parameter. |
| type | The data type the runtime parameter contains: string or credential. |
| defaultValue | (Optional) The default string value for the runtime parameter (the credential type doesn't support default values). |
| description | (Optional) A description of the purpose or contents of the runtime parameter. |
When you add a model-metadata.yaml file with runtimeParameterDefinitions to DataRobot while creating a custom model, the Runtime Parameters section appears on the Assemble tab for that custom model:
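A minimal model-metadata.yaml fragment might look like the following sketch; the parameter names and values are illustrative, and the key names assume the fields described above:

```yaml
# Hypothetical model-metadata.yaml fragment; parameter names and
# values are illustrative only.
name: my-custom-model
runtimeParameterDefinitions:
  - fieldName: API_BASE_URL
    type: string
    defaultValue: https://example.com/api
    description: Base URL the model calls at scoring time.
  - fieldName: SERVICE_CREDENTIAL
    type: credential
    description: Credential injected at runtime (no default value supported).
```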
Required feature flag: Enable the Injection of Runtime Parameters for Custom Models
Public preview documentation.
DataRobot Prime model creation removed¶
With this deployment, the ability to create new DataRobot Prime models has been removed from the application. This does not affect existing Prime models or deployments. RuleFit models, which differ from Prime only in that they use raw data for their prediction target rather than predictions from a parent model, support Java/Python source code export.
All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.