Now generally available, on a deployment's Custom Metrics tab, you can use the data you collect from the Data Export tab (or data calculated through other custom metrics) to compute and monitor up to 25 custom business or performance metrics. After you add a metric and upload data, a configurable dashboard visualizes a metric’s change over time and allows you to monitor and export that information. This feature enables you to implement your organization's specialized metrics to expand on the insights provided by DataRobot's built-in Service Health, Data Drift, and Accuracy metrics.
Note
The initial release of the custom metrics feature enforces some row count and file size limitations. For details, review the upload method considerations in the feature documentation.
Now generally available, on a deployment’s Data Export tab, you can export stored training data, prediction data, and actuals to compute and monitor custom business or performance metrics on the Custom Metrics tab or outside DataRobot. You can export the available deployment data for a specified model and time range. To export deployment data, make sure your deployment stores prediction data, generate data for the required time range, and then view or download that data.
Note
The initial release of the deployment data export feature enforces some row count limitations. For details, review the data considerations in the feature documentation.
For more information, see the Data Export tab documentation.
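As a rough illustration of the kind of calculation this export enables, the Python sketch below computes a simple business metric from a downloaded prediction export; the file name and column names (timestamp, prediction, actual) are placeholders rather than the exact export schema.

```python
import pandas as pd

# Load deployment data downloaded from the Data Export tab.
# The file name and column names here are placeholders, not the exact export schema.
df = pd.read_csv("deployment_prediction_export.csv", parse_dates=["timestamp"])

# Example business metric: mean absolute error per day
# (in dollars, if the target is a monetary amount).
df["abs_error"] = (df["prediction"] - df["actual"]).abs()
daily_metric = df.resample("D", on="timestamp")["abs_error"].mean()

print(daily_metric.head())
# The resulting per-day values are the kind of data you could track on the Custom Metrics tab.
```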
Now generally available, the MLOps management agent provides a standard mechanism for automating model deployments in any type of environment or infrastructure. The management agent supports models trained on DataRobot, or models trained with open source tools on external infrastructure. The agent, accessed from the DataRobot application, ships with an assortment of example plugins that support custom configurations. Use the management agent to automate the deployment and monitoring of models to ensure your machine learning pipeline is healthy and reliable. This release introduces usability improvements to the management agent, including deployment status reporting, deployment relaunch, and the option to force the deletion of a management agent deployment.
For more information on agent installation, configuration, and operation, see the MLOps management agent documentation.
Now generally available, the Usage tab reports on prediction data processing for the Data Drift and Accuracy tabs. Monitoring a deployed model’s data drift and accuracy is a critical task to ensure that the model remains effective; however, it requires processing large amounts of prediction data and can be subject to delays or rate limiting. The information on the Usage tab can help your organization identify these data processing issues. The Prediction Tracking chart, a bar chart of the prediction processing status over the last 24 hours or 7 days, tracks the number of processed, rate-limited, and missing association ID prediction rows:
On the right side of the page are the processing delays for Predictions Processing (Champion) and Actuals Processing (the delay in actuals processing is for ALL models in the deployment):
For more information, see the Usage tab documentation.
On the new MLOps Logs tab, you can view important deployment events. These events can help diagnose issues with a deployment or provide a record of the actions leading to the current state of the deployment. Each event has a type and a status. You can filter the event log by event type, event status, or time of occurrence, and you can view more details for an event on the Event Details panel.
To access MLOps logs:
On a deployment's Service Health page, scroll to the Recent Activity section at the bottom of the page.
In the Recent Activity section, click MLOps Logs.
Under MLOps Logs, configure the log filters.
On the left panel, the MLOps Logs list displays deployment events with any selected filters applied. For each event, you can view a summary that includes the event name and status icon, the timestamp, and an event message preview.
Click the event you want to examine and review the Event Details panel on the right.
For more information, see the Service Health tab’s View MLOps Logs documentation.
Now generally available, you can clear monitoring data by model version and date range. If your organization has enabled the deployment approval workflow, approval must be given before any monitoring data can be cleared from the deployment. This feature allows you to remove monitoring data from a deployment, for example, data sent inadvertently or sent during the integration testing phase of deploying a model.
From the inventory, choose the deployment for which you want to reset statistics. Click the actions menu and select Clear statistics.
Complete the settings in the Clear Deployment Statistics window to configure the conditions of the reset.
After fully configuring the settings, click Clear statistics. DataRobot clears the monitoring data from the deployment for the indicated date range.
Now generally available on the Data Drift tab, the new Drill Down visualization tracks the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. The drift away from the baseline established with the training dataset is measured using the Population Stability Index (PSI). As a model continues to make predictions on new data, the change in the drift status over time is visualized as a heat map for each tracked feature. This heat map can help you identify data drift and compare drift across features in a deployment to identify correlated drift trends:
In addition, you can select one or more features from the heat map to view a Feature Drift Comparison chart, comparing the change in a feature's data distribution between a reference time period and a comparison time period to visualize drift. This information helps you identify the cause of data drift in your deployed model, including data quality issues, changes in feature composition, or changes in the context of the target variable:
On a deployment’s Data Drift dashboard, the Drift Over Time chart visualizes the difference in distribution over time between the training dataset of the deployed model and the datasets used to generate predictions in production. The drift away from the baseline established with the training dataset is measured using the Population Stability Index (PSI). As a model continues to make predictions on new data, the change in the PSI over time is visualized for each tracked feature, allowing you to identify data drift trends:
As data drift can decrease your model's predictive power, determining when a feature started drifting and monitoring how that drift changes (as your model continues to make predictions on new data) can help you estimate the severity of the issue. You can then compare data drift trends across the features in a deployment to identify correlated drift trends between specific features. In addition, the chart can help you identify seasonal effects (significant for time-aware models). This information can help you identify the cause of data drift in your deployed model, including data quality issues, changes in feature composition, or changes in the context of the target variable. The example below shows the PSI consistently increasing over time, indicating worsening data drift for the selected feature.
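PSI itself is a standard drift statistic. The sketch below shows the usual formulation as a reference point; binning and smoothing details in DataRobot's implementation may differ.

```python
import numpy as np

def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """Standard PSI: sum over bins of (p_actual - p_expected) * ln(p_actual / p_expected)."""
    # Bin edges come from the baseline (training) distribution.
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_counts, _ = np.histogram(expected, bins=edges)
    actual_counts, _ = np.histogram(actual, bins=edges)

    # Convert counts to proportions; eps avoids division by zero and log(0).
    p_expected = expected_counts / expected_counts.sum() + eps
    p_actual = actual_counts / actual_counts.sum() + eps
    return float(np.sum((p_actual - p_expected) * np.log(p_actual / p_expected)))

# Example: a shifted scoring distribution produces a higher PSI than an unshifted one.
rng = np.random.default_rng(0)
training = rng.normal(0, 1, 10_000)
scoring = rng.normal(0.3, 1, 10_000)
print(population_stability_index(training, scoring))
```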
Visualize drift for text features as a word cloud¶
The Feature Details chart plots the differences in a feature's data distribution between the training and scoring periods, providing a bar chart to compare the percentage of records a feature value represents in the training data with the percentage of records in the scoring data. For text features, the feature drift bar chart is replaced with a word cloud, visualizing data distributions for each token and revealing how much each individual token contributes to data drift in a feature.
To access the feature drift word cloud for a text feature, open the Data Drift tab of a drift-enabled deployment. On the Summary tab, in the Feature Details chart, select a text feature from the dropdown list:
Note
Next to the Export button, you can click the settings icon and clear the Display text features as word cloud check box to disable the feature drift word cloud and view the standard chart:
For more information, see the Feature Details chart’s Text features documentation.
Now generally available, the redesigned deployment creation workflow provides a better organized and more intuitive interface. Regardless of where you create a new deployment (the Leaderboard, the Model Registry, or the Deployments inventory), you are directed to this new workflow. The new design clearly outlines the capabilities of your current deployment based on the data provided, grouping the settings and capabilities logically and providing immediate confirmation when you enable a capability, or guidance when you’re missing required fields or settings. A new sidebar provides details about the model being used to make predictions for your deployment, in addition to information about the deployment review policy, deployment billing details (depending on your organization settings), and a link to the deployment information documentation.
The deployment inventory on the Deployments page is now sorted by creation date (from most recent to oldest, as reported in the new Creation Date column). You can click a different column title to sort by that metric instead. A blue arrow appears next to the sort column's header, indicating if the order is ascending or descending.
Note
When you sort the deployment inventory, your most recent sort selection persists in your local settings until you clear your browser's local storage data. As a result, the deployment inventory is usually sorted by the column you selected last.
The Content section of the Overview tab lists a deployment's model and environment-specific information, now including the following IDs:
Model ID: Copy the ID number of the deployment's current model.
Deployment ID: Copy the ID number of the current deployment.
In addition, you can find a deployment's model-related events under History > Logs, including the creation and deployment dates and any model replacement events. From this log, you can copy the Model ID of any previously deployed model.
For more information, see the deployment Overview tab documentation.
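If you work programmatically, the same IDs are available through the DataRobot Python client; a minimal sketch, with the endpoint, token, and deployment ID as placeholders:

```python
import datarobot as dr

# Placeholders: supply your own endpoint, API token, and deployment ID.
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")

deployment = dr.Deployment.get(deployment_id="YOUR_DEPLOYMENT_ID")
print("Deployment ID:", deployment.id)
print("Model ID:", deployment.model["id"])  # the deployment's current model
```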
Challenger insights for multiclass and external models¶
Now generally available, you can compute challenger model insights for multiclass models and external models.
Multiclass classification projects only support accuracy comparison.
External models (regardless of project type) require an external challenger comparison dataset.
To compare an external model challenger, you need to provide a dataset that includes the actuals and the prediction results. When you upload the comparison dataset, you can specify a column containing the prediction results.
To add a comparison dataset for an external model challenger, follow the Generate model comparisons process, and on the Model Comparison tab, upload your comparison dataset with a Prediction column identifier. Make sure the dataset you provide includes the prediction results generated by the external model in the column you identify as the Prediction column.
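As a sketch of what such a comparison dataset could look like (the column names are illustrative; you identify your own prediction column during upload):

```python
import pandas as pd

# Illustrative comparison dataset for an external challenger: it contains the
# actuals plus a column holding the predictions generated by the external model.
comparison = pd.DataFrame(
    {
        "loan_id": [101, 102, 103],
        "actual_outcome": [1, 0, 1],                 # actuals
        "external_prediction": [0.82, 0.11, 0.64],   # external model's predictions
    }
)
comparison.to_csv("external_challenger_comparison.csv", index=False)
# During upload, identify "external_prediction" as the Prediction column.
```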
View batch prediction job history for challengers¶
To improve error surfacing and usability for challenger models, you can now access a challenger's prediction job history from the Deployments > Challengers tab. After adding one or more challenger models and replaying predictions, click Job History:
The Deployments > Prediction Jobs page opens and is filtered to display the challenger jobs for the deployment you accessed the job history from. You can also apply this filter directly from the Prediction Jobs page:
Enable compliance documentation for models without null imputation¶
To generate the Sensitivity Analysis section of the default Automated Compliance Document template, your custom model must support null imputation (the imputation of NaN values), or compliance documentation generation will fail. If the custom model doesn't support null imputation, you can use a specialized template to generate compliance documentation. In the Report template drop-down list, select Automated Compliance Document (for models that do not impute null values). This template excludes the Sensitivity Analysis report and is only available for custom models. For more information, see information on generating compliance documentation.
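Alternatively, to use the default template (including the Sensitivity Analysis report), the custom model itself must tolerate NaN values. Below is a minimal sketch of one approach, assuming a Python custom model with a standard score hook and a scikit-learn style estimator; the return format is simplified for illustration.

```python
import pandas as pd

def score(data: pd.DataFrame, model, **kwargs) -> pd.DataFrame:
    """Illustrative scoring hook that imputes NaN values before predicting."""
    # Simple null imputation: numeric columns get the column median,
    # all remaining missing values get a constant placeholder.
    numeric_cols = data.select_dtypes(include="number").columns
    data[numeric_cols] = data[numeric_cols].fillna(data[numeric_cols].median())
    data = data.fillna("missing")

    # Return format simplified for illustration; follow the custom model
    # output conventions for your target type in a real model.
    return pd.DataFrame({"Predictions": model.predict(data)})
```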
The execution environment limit allows administrators to control how many custom model environments a user can add to the Custom Model Workshop. In addition, the execution environment version limit allows administrators to control how many versions a user can add to each of those environments. These limits can be:
Directly applied to the user: Set in a user's permissions. Overrides the limits set in the group and organization permissions (if the user limit value is lower).
Inherited from a user group: Set in the permissions of the group a user belongs to. Overrides the limits set in organization permissions (if the user group limit value is lower).
Inherited from an organization: Set in the permissions of the organization a user belongs to.
If the environment or environment version limits are defined for an organization or a group, the users within that organization or group inherit the defined limits. However, a more specific definition of those limits at a lower level takes precedence. For example, an organization may have the environment limits set to 5, a group to 4, and the user to 3; in this scenario, the final limit for the individual user is 3. For more information on adding custom model execution environments, see the Custom model environment documentation.
Any user can view their environment and environment version limits. On the Custom Models > Environments tab, next to the Add new environment and the New version buttons, a badge indicates how many environments (or environment versions) you've added and how many environments (or environment versions) you can add based on the environment limit:
The following status categories are available for this badge:
| Badge | Description |
|-------|-------------|
| (badge icon) | The number of environments (or versions) is less than 75% of the limit. |
| (badge icon) | The number of environments (or versions) is equal to or greater than 75% of the limit. |
| (badge icon) | The number of environments (or versions) has reached the limit. |
With the correct permissions, an administrator can set these limits at a user or group level. For a user or a group, on the Permissions tab, click Platform, and then click Admin Controls. Next, under Admin Controls, set either or both of the following settings:
Execution Environments limit: The maximum number of custom model execution environments users in this group can add.
Execution Environments versions limit: The maximum number of versions users in this group can add to each custom model execution environment.
The Prediction API Scripting Code section on a deployment's Predictions > Prediction API tab now includes a cURL scripting code snippet for Real-time predictions. cURL is a command-line tool for transferring data using various network protocols, available by default in most Linux distributions and macOS.
Now generally available, DataRobot allows you to use Scoring Code via Python and Java. Although the underlying Scoring Code is Java-based, DataRobot now provides the DataRobot Prediction Library to make predictions using the various prediction methods supported by DataRobot via a Python API. The library provides a common interface for making predictions, making it easy to swap out any underlying implementation. Access Scoring Code for Python and Java from a model in the Leaderboard or from a deployed model that supports Scoring Code.
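The benefit of a common interface is that calling code doesn't change when the prediction backend does. The sketch below illustrates the pattern only; the class names are not the library's actual API.

```python
import pandas as pd

class PredictionBackend:
    """Conceptual common interface (illustration only, not the library's class names)."""
    def predict(self, df: pd.DataFrame) -> pd.DataFrame:
        raise NotImplementedError

class ScoringCodePredictor(PredictionBackend):
    """Would score locally against a downloaded Scoring Code artifact."""
    def __init__(self, jar_path: str):
        self.jar_path = jar_path

class DeploymentApiPredictor(PredictionBackend):
    """Would call a deployment's Prediction API over HTTP."""
    def __init__(self, deployment_id: str):
        self.deployment_id = deployment_id

def run_batch(backend: PredictionBackend, df: pd.DataFrame) -> pd.DataFrame:
    # Calling code stays identical regardless of the underlying implementation.
    return backend.predict(df)
```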
Now generally available, you can export time series models in a Java-based Scoring Code package. Scoring Code is a portable, low-latency method of utilizing DataRobot models outside the DataRobot application.
You can download a model's time series Scoring Code from the following locations:
To generate and download Scoring Code, each segment champion of the Combined Model must have Scoring Code:
After you ensure each segment champion of the Combined Model has Scoring Code, you can download the Scoring Code from the Leaderboard or you can deploy the Combined Model and download the Scoring Code from the deployment.
You can score data at the command line using the downloaded time series Scoring Code. This release introduces efficient batch processing for time series Scoring Code to support scoring larger datasets. For more information, see the Time series parameters for CLI scoring documentation.
To fully leverage the value of segmented modeling, you can deploy Combined Models like any other time series model. After selecting the champion model for each included project, you can deploy the Combined Model to create a "one-model" deployment for multiple segments; however, the individual segments in the deployed Combined Model still have their own segment champion models running in the deployment behind the scenes. Creating a deployment allows you to use DataRobot MLOps for accuracy monitoring, prediction intervals, challenger models, and retraining.
Note
Time series segmented modeling deployments do not support data drift monitoring. For more information, see the feature considerations.
After you complete the segmented modeling workflow and Autopilot has finished, the Model tab contains one model. This model is the completed Combined Model. To deploy, click the Combined Model, click Predict > Deploy, and then click Deploy model.
After deploying a Combined Model, you can change the segment champion for a segment by cloning the deployed Combined Model and modifying the cloned model. This process is automatic and occurs when you attempt to change a segment's champion within a deployed Combined Model. The cloned model you can modify becomes the Active Combined Model. This process ensures stability in the deployed model while allowing you to test changes within the same segmented project.
Note
Only one Combined Model on a project's Leaderboard can be the Active Combined Model (marked with a badge).
Once a Combined Model is deployed, it is labeled Prediction API Enabled. To modify this model, click the active and deployed Combined Model, and then in the Segments tab, click the segment you want to modify.
Now generally available, on a deployment's Service Health tab, under Recent Activity, you can view Management events (e.g., deployment actions) and Monitoring events (e.g., spooler channel and rate limit events). Monitoring events can help you quickly diagnose MLOps agent issues. For example, spooler channel error events can help you diagnose and fix spooler configuration issues. The rate limit enforcement events can help you identify if service health stats or data drift and accuracy values aren't updating because you exceeded the API request rate limit.
To view Monitoring events, you must provide a predictionEnvironmentID in the agent configuration file (conf/mlops.agent.conf.yaml). If you haven't already installed and configured the MLOps agent, see the Installation and configuration guide.
For more information on enabling and reading the monitoring agent event log, see the Agent event log documentation.
To support large-scale monitoring, the MLOps library provides a way to calculate statistics from raw data on the client side. Then, instead of reporting raw features and predictions to the DataRobot MLOps service, the client can report anonymized statistics without the feature and prediction data. Reporting prediction data statistics calculated on the client side is preferable to reporting raw data, especially at scale (with billions of rows of features and predictions). In addition, because client-side aggregation only sends aggregates of feature values, it is suitable for environments where you don't want to disclose the actual feature values. Large-scale monitoring functionality is available for the Java Software Development Kit (SDK), the MLOps Spark Utils Library, and Python.
Note
To support the use of challenger models, you must send raw features. For large datasets, you can report a small sample of raw feature and prediction data to support challengers and reporting; then, you can send the remaining data in aggregate format.
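Conceptually, client-side aggregation reduces raw rows to summary statistics before anything leaves your environment. The rough Python sketch below illustrates the idea only; it is not the MLOps library's API.

```python
import pandas as pd

def aggregate_for_monitoring(features: pd.DataFrame, predictions: pd.Series, bins: int = 10) -> dict:
    """Reduce raw rows to anonymized aggregates (conceptual illustration, not the MLOps SDK)."""
    aggregates = {"row_count": len(features), "feature_histograms": {}, "prediction_stats": {}}

    # Per-feature histograms stand in for the raw feature values.
    for column in features.select_dtypes(include="number").columns:
        counts = pd.cut(features[column], bins=bins).value_counts().sort_index()
        aggregates["feature_histograms"][column] = {str(k): int(v) for k, v in counts.items()}

    # Summary statistics stand in for the raw predictions.
    aggregates["prediction_stats"] = {
        "mean": float(predictions.mean()),
        "min": float(predictions.min()),
        "max": float(predictions.max()),
    }
    return aggregates  # only these aggregates, not raw rows, would be reported
```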
Dynamically load required agent spoolers in a Java application¶
Dynamically loading third-party Monitoring Agent spoolers in your Java application improves security by removing unused code. This functionality works by loading a separate JAR file for the Amazon SQS, RabbitMQ, Google Cloud Pub/Sub, and Apache Kafka spoolers, as needed. The natively supported file system spooler is still configurable without loading a JAR file. Previously, the datarobot-mlops and mlops-agent packages included all spooler types by default.
To use a third-party spooler in your MLOps Java application, you must include the required spoolers as dependencies in your POM (Project Object Model) file, along with datarobot-mlops:
The spooler JAR files are included in the MLOps agent tarball. They are also available individually as downloadable JAR files in the public Maven repository for the DataRobot MLOps Agent.
To use a third-party spooler with the executable agent JAR file, add the path to the spooler to the classpath:
The start-agent.sh script provided as an example automatically performs this task, adding any spooler JAR files found in the lib directory to the classpath. If your spooler JAR files are in a different directory, set the MLOPS_SPOOLER_JAR_PATH environment variable.
Apache Kafka environment variables for Azure Event Hubs spoolers¶
The MLOPS_KAFKA_CONFIG_LOCATION environment variable was removed and replaced by new environment variables for Apache Kafka spooler configuration. These new environment variables eliminate the need for a separate configuration file and simplify support for Azure Event Hubs as a spooler type.
For more information on Apache Kafka spooler configuration, see the Apache Kafka environment variables reference.
For more information on leveraging the Apache Kafka spooler type to use a Microsoft Azure Event Hubs spooler, see the Azure Event Hubs spooler configuration reference.
You can now download the MLOps Java library and agent from the public Maven Repository with a groupId of com.datarobot and an artifactId of datarobot-mlops (library) and mlops-agent (agent). In addition, you can access the DataRobot MLOps Library and DataRobot MLOps Agent artifacts in the Maven Repository to view all versions and download and install the JAR file.
Now available as a preview feature, monitoring job definitions enable DataRobot to monitor deployments running and storing feature data and predictions outside of DataRobot, integrating deployments more closely with external data sources. For example, you can create a monitoring job to connect to Snowflake, fetch raw data from the relevant Snowflake tables, and send the data to DataRobot for monitoring purposes.
This integration extends the functionality of the existing Prediction API routes for batchPredictionJobDefinitions and batchPredictions, adding the batch_job_type: monitoring property. This new property allows you to create monitoring jobs. In addition to the Prediction API, you can create monitoring job definitions through the DataRobot UI. You can then view and manage monitoring job definitions as you would any other job definition.
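A rough sketch of what creating a monitoring job definition over HTTP could look like; only the batch_job_type: monitoring property comes from this feature description, while the endpoint path and every other field name are assumptions for illustration.

```python
import requests

API = "https://app.datarobot.com/api/v2"             # placeholder endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}  # placeholder token

# Only "batch_job_type" is taken from the feature description above; the other
# field names are illustrative assumptions, not the exact Prediction API schema.
job_definition = {
    "batch_job_type": "monitoring",
    "deployment_id": "YOUR_DEPLOYMENT_ID",
    "intake_settings": {"type": "snowflake", "table": "PREDICTION_LOG"},
}

response = requests.post(
    f"{API}/batchPredictionJobDefinitions/", json=job_definition, headers=HEADERS
)
response.raise_for_status()
```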
Automate deployment and replacement of Scoring Code in Snowflake¶
Now available as a preview feature, you can create a DataRobot-managed Snowflake prediction environment to deploy DataRobot Scoring Code in Snowflake. With the Managed by DataRobot option enabled, the model deployed externally to Snowflake has access to MLOps management, including automatic Scoring Code replacement:
Now available as a preview feature, you can add runtime parameters to a custom model through the model metadata, making your custom model code easier to reuse. To define runtime parameters, add runtimeParameterDefinitions with the following keys to model-metadata.yaml:
| Key | Value |
|-----|-------|
| fieldName | The name of the runtime parameter. |
| type | The data type the runtime parameter contains: string or credential. |
| defaultValue | (Optional) The default string value for the runtime parameter (the credential type doesn't support default values). |
| description | (Optional) A description of the purpose or contents of the runtime parameter. |
When you add a model-metadata.yaml file with runtimeParameterDefinitions to DataRobot while creating a custom model, the Runtime Parameters section appears on the Assemble tab for that custom model:
Now available as a preview feature, you can create a custom model as a proxy for an externally hosted model. To create a custom model as a proxy for an external model, you can add a new proxy model to the Custom Model Workshop. A proxy model contains the proxy code you created (in custom.py) to connect with your external model, allowing you to use features like compliance documentation, challenger analysis, and custom model tests with a model running on infrastructure outside of DataRobot. You can also use custom model runtime parameters with proxy models.
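A minimal sketch of what the proxy code in custom.py could look like, assuming the standard Python custom model score hook; the external endpoint, request payload, and response format are assumptions for illustration.

```python
import os
import pandas as pd
import requests

# Hypothetical external endpoint; in practice this value could come from a
# runtime parameter or environment variable rather than being hard-coded.
EXTERNAL_URL = os.environ.get("EXTERNAL_MODEL_URL", "https://example.com/score")

def score(data: pd.DataFrame, model, **kwargs) -> pd.DataFrame:
    """Proxy scoring hook that forwards rows to an externally hosted model."""
    response = requests.post(
        EXTERNAL_URL,
        json={"rows": data.to_dict(orient="records")},  # assumed request shape
        timeout=30,
    )
    response.raise_for_status()
    predictions = response.json()["predictions"]        # assumed response shape
    return pd.DataFrame({"Predictions": predictions})
```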
The custom models action manages custom inference models and their associated deployments in DataRobot via GitHub CI/CD workflows. These workflows allow you to create or delete models and deployments and modify settings. Metadata defined in YAML files enables the custom model action's control over models and deployments. Most YAML files for this action can reside in any folder within your custom model's repository. The YAML is searched, collected, and tested against a schema to determine if it contains the entities used in these workflows. For more information, see the custom-models-action repository. A quickstart example, provided in the documentation, uses a Python Scikit-Learn model template from the datarobot-user-model repository.
After you configure the workflow and create a model and a deployment in DataRobot, you can access the commit information from the model's version info and package info and the deployment overview:
Required feature flag: Enable Custom Model GitHub CI/CD
Remote repository file browser for custom models and tasks¶
Now available as a preview feature, you can browse the folders and files in a remote repository to select the files you want to add to a custom model or task. When you add a model or add a task to the Custom Model Workshop, you can add files to that model or task from a wide range of repositories, including Bitbucket, GitHub, GitHub Enterprise, S3, GitLab, and GitLab Enterprise. After you add a repository to DataRobot, you can pull files from the repository and include them in the custom model or task.
When you pull from a remote repository, in the Pull from GitHub repository dialog box, you can select the checkbox for any files or folders you want to pull into the custom model.
In addition, you can click Select all to select every file in the repository, or, after you select one or more files, you can click Deselect all to clear your selections.
Note
This example uses GitHub; however, the process is the same for each repository type.
Required feature flag: Enable File Browser for Pulling Model or Task Files from Remote Repositories
Now available as a preview feature, when analyzing a deployment's Service Health and Accuracy, you can view the History tab, providing critical information about the performance of current and previously deployed models. This tab improves the usability of service health and accuracy analysis, allowing you to view up to five models in one place and on the same scale, making it easier to directly compare model performance.
On a deployment's Service Health > History tab, you can access visualizations representing the service health history of up to five of the most recently deployed models, including the currently deployed model. This history is available for each metric tracked in a model's service health, helping you identify bottlenecks and assess capacity, which is critical to proper provisioning.
On a deployment's Accuracy > History tab, you can access visualizations representing the accuracy history of up to five of the most recently deployed models, including the currently deployed model, allowing you to compare their accuracy directly. These accuracy insights are rendered based on the problem type and its associated optimization metrics.
Now available as a preview feature, the improved model package artifact creation workflow provides a clearer and more consistent path to model deployment with visible connections between a model and its associated model packages in the Model Registry. Using this new approach, when you deploy a model, you begin by providing model package details and adding the model package to the Model Registry. After you create the model package and allow the build to complete, you can deploy it by adding the deployment information.
From the Leaderboard, select the model to use for generating predictions and then click Predict > Deploy. To follow best practices, DataRobot recommends that you first prepare the model for deployment. This process runs Feature Impact, retrains the model on a reduced feature list, and trains on a higher sample size, followed by the entire sample (latest data for date/time partitioned projects).
On the Deploy model tab, provide the required model package information, and then click Register to deploy.
Allow the model to build. The Building status can take a few minutes, depending on the size of the model. A model package must have a Status of Ready before you can deploy it.
In the Model Packages list, locate the model package you want to deploy and click Deploy.
A model package's model logs display information about the operations of the underlying model. This information can help you identify and fix errors. For example, compliance documentation requires DataRobot to execute many jobs, some of which run sequentially and some in parallel. These jobs may fail, and reading the logs can help you identify the cause of the failure (e.g., the Feature Effects job fails because a model does not handle null values).
Important
In the Model Registry, a model package's Model Logs tab only reports the operations of the underlying model, not the model package operations (e.g., model package deployment time).
In the Model Registry, access a model package, and then click the Model Logs tab:
|   | Information | Description |
|---|-------------|-------------|
| 1 | Date / Time | The date and time the model log event was recorded. |
| 2 | Status | The status the log entry reports: INFO (a successful operation) or ERROR (an unsuccessful operation). |
| 3 | Message | The description of the successful operation (INFO) or the reason for the failed operation (ERROR). This information can help you troubleshoot the root cause of the error. |
If you can't locate the log entry for the error you need to fix, it may be an older log entry not shown in the current view. Click Load older logs to expand the Model Logs view.
Tip
Look for the older log entries at the top of the Model Logs; they are added to the top of the existing log history.
Required feature flag: Enable Model Logs for Model Packages
Traditional Time Series (TTS) and Long Short-Term Memory (LSTM) models—sequence models that use autoregressive (AR) and moving average (MA) methods—are common in time series forecasting. Both AR and MA models typically require a complete history of past forecasts to make predictions. In contrast, other time series models only require a single row after feature derivation to make predictions. Previously, batch predictions couldn't accept historical data beyond the effective feature derivation window (FDW) if the history exceeded the maximum size of each batch, while sequence models required complete historical data beyond the FDW. These requirements made sequence models incompatible with batch predictions. Enabling this preview feature removes those limitations to allow batch predictions for TTS and LSTM models.
Time series Autopilot still doesn't include TTS or LSTM model blueprints; however, you can access the model blueprints in the model Repository.
To allow batch predictions with TTS and LSTM models, this feature:
Updates batch predictions to accept historical data up to the maximum batch size (equal to 50MB or approximately a million rows of historical data).
Updates TTS models to allow refitting on an incomplete history (if the complete history isn't provided).
If you don't provide sufficient forecast history at prediction time, you could encounter prediction inconsistencies. For more information on maintaining accuracy in TTS and LSTM models, see the prediction accuracy considerations.
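In practice, "sufficient forecast history" means the scoring dataset should carry the known history rows alongside the rows you want to forecast; a conceptual pandas sketch with illustrative column names and window sizes:

```python
import pandas as pd

# Historical rows carry known target values; future rows leave the target empty.
history = pd.DataFrame(
    {
        "date": pd.date_range("2023-01-01", periods=60, freq="D"),
        "sales": range(60),          # known target history
    }
)
future = pd.DataFrame(
    {
        "date": pd.date_range("2023-03-02", periods=7, freq="D"),
        "sales": [None] * 7,         # rows to forecast
    }
)

scoring_dataset = pd.concat([history, future], ignore_index=True)
scoring_dataset.to_csv("ts_scoring_with_history.csv", index=False)
# Submit the combined file (history + forecast rows) for batch predictions,
# keeping it within the maximum batch size noted above.
```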
Now available for preview, you can enable the computation of a model's time series prediction intervals (from 1 to 100) during model package generation. To run a DataRobot time series model in a remote prediction environment, you download a model package (.mlpkg file) from the model's deployment or the Leaderboard. In both locations, you can now choose to Compute prediction intervals during model package generation. You can then run prediction jobs with a portable prediction server (PPS) outside DataRobot.
Before you download a model package with prediction intervals from a deployment, ensure that your deployment supports model package downloads. The deployment must have a DataRobot build environment and an external prediction environment, which you can verify using the Governance Lens in the deployment inventory:
To download a model package with prediction intervals from a deployment, in the external deployment, you can use the Predictions > Portable Predictions tab:
To download a model package with prediction intervals from a model in the Leaderboard, you can use the Predict > Deploy or Predict > Portable Predictions tab.
Required feature flag: Enable computation of all Time-Series Intervals for .mlpkg