On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

AI Platform releases

A monthly record of the new preview and GA features announced for DataRobot's managed AI Platform. Deprecation announcements are also included and link to deprecation guides, as appropriate.

October SaaS feature announcements

October 30, 2024

This page provides announcements of newly released features available in DataRobot's SaaS single- and multi-tenant AI Platform, with links to additional resources.

October features

The following table lists each new feature:

Features grouped by capability
Name GA Preview
GenAI
New LLM, Anthropic Claude 3 Opus, now available
Applications
Custom application runtime parameters now GA
Build custom applications from the template gallery
Chat generation Q&A application now GA
Data
Understand how individual catalog assets relate to other DataRobot entities
Support for SAP Datasphere connector in DataRobot ✔*
Additional EDA insights added to Workbench
Incremental learning support for dynamic datasets is now available
Modeling
Multiclass classification now GA in Workbench
Geospatial modeling now available in Workbench
Personal data detection now GA in SaaS, Self-Managed
XEMP Individual Prediction Explanations now in Workbench
Custom tasks now available for Self-Managed users
Automatically remove date features before running Autopilot
Predictions and MLOps
Compliance documentation now available for registered text generation models ✔*
Evaluation and moderation for text generation models ✔*
SAP Datasphere integration for batch predictions ✔*
Filtering and model replacement improvements in the NextGen Console
Manage custom execution environments in the NextGen Registry
Customize feature drift tracking
Calculate insights during custom model registration
Link Registry and Console assets to a Use Case
Code-based retraining jobs
Custom model workers runtime parameter
Template gallery for custom jobs
Create and deploy vector databases ✔*
Geospatial monitoring for deployments
Prompt monitoring improvements for deployments ✔*
Editable resource settings and runtime parameters for deployments
Data Registry wrangling for batch predictions
Notebooks
Notebook and codespace port forwarding now GA
GPU support for notebooks now GA ✔*
Admin
Manage network policies to limit access to public resources
Monitor EDA resource usage across an organization
API
Create vector databases with unstructured PDF documents
Use the declarative API to provision DataRobot assets

*Premium

GA

New LLM, Anthropic Claude 3 Opus, now available

Now generally available, Anthropic Claude 3 Opus brings support for another Claude-family offering to the DataRobot GenAI product. Each model in the family is targeted at specific needs; Claude 3 Opus, the largest model of the Claude family, excels at heavyweight reasoning and complicated tasks. See the full list of LLM availability in DataRobot, with links to creator documentation for assistance in choosing the appropriate model.

Multiclass classification now GA in Workbench

Initially released to Workbench in March 2024, multiclass modeling and the associated confusion matrix are now generally available. To support an expansive set of multiclass modeling experiments—classification problems in which the answer has more than two outcomes—DataRobot provides support for an unlimited number of classes using aggregation.
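As an illustration of what class aggregation can look like, the following minimal sketch keeps the most frequent classes and folds the rest into a single bucket. It is an assumption about the general technique, not DataRobot's implementation; the function name, the `max_classes` parameter, and the `__other__` label are hypothetical.

```python
from collections import Counter


def aggregate_rare_classes(labels, max_classes, other_label="__other__"):
    """Keep the most frequent classes and fold the remainder into one bucket.

    Hypothetical helper: max_classes counts the aggregated bucket itself.
    """
    if max_classes < 2:
        raise ValueError("max_classes must be at least 2")
    counts = Counter(labels)
    if len(counts) <= max_classes:
        return list(labels)  # few enough classes, nothing to aggregate
    # Reserve one slot for the aggregated bucket.
    keep = {cls for cls, _ in counts.most_common(max_classes - 1)}
    return [lbl if lbl in keep else other_label for lbl in labels]


labels = ["cat", "dog", "cat", "bird", "fish", "dog", "cat", "lizard"]
aggregated = aggregate_rare_classes(labels, max_classes=3)
print(aggregated)
```

With this approach the number of distinct target values seen by the model stays bounded no matter how many raw classes the dataset contains.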

Geospatial modeling now available in Workbench

To help gain insights into geospatial patterns in your data, you can now natively ingest common geospatial formats and build enhanced model blueprints with spatially-explicit modeling tasks when building in Workbench. During experiment setup, from Additional settings, select a location feature in the Geospatial insights section and make sure that feature is in the modeling feature list. DataRobot will then create geospatial insights—Accuracy Over Space for supervised projects and Anomaly Over Space for unsupervised.

Personal data detection now GA in SaaS, Self-Managed

Because the use of personal data as a modeling feature is forbidden in some regulated use cases, DataRobot Classic provides personal data detection capabilities. The feature is now generally available in both SaaS and self-managed environments. Access the check after uploading data to the AI Catalog.

XEMP Individual Prediction Explanations now in Workbench

Workbench now offers two methodologies for computing Individual Prediction Explanations: SHAP (based on Shapley Values) and XEMP (eXemplar-based Explanations of Model Predictions). This insight, regardless of method, helps explain what drives predictions. XEMP-based explanations are a proprietary method that supports all models and have long been available in DataRobot Classic. In Workbench, they are only available in experiments that don't support SHAP.

Custom tasks now available for Self-Managed users

Custom tasks allow you to add custom vertices into a DataRobot blueprint, and then train, evaluate, and deploy that blueprint in the same way as you would for any DataRobot-generated blueprint. With v10.2 the functionality is available via DataRobot Classic and the API for on-premise installations as well.

Manage network policies to limit access to public resources

By default, some DataRobot capabilities, including Notebooks, have full public internet access from within the cluster DataRobot is deployed on; however, admins can limit the public resources users can access within DataRobot by setting network access controls. To do so, open User settings > Policies and enable the network policy control toggle. When enabled, users cannot access public resources from within DataRobot.

Monitor EDA resource usage across an organization

Now generally available, administrators can monitor the number of configured workers being used for EDA1 and related tasks on the EDA tab of the Resource Monitor. The Resource Monitor provides visibility into DataRobot's active modeling and EDA workers across the installation, providing general information about the current state of the application and specific information about the status of components.

Understand how individual catalog assets relate to other DataRobot entities

The AI Catalog serves as a centralized collaboration hub for working with data and related assets in DataRobot. On the Info tab for individual assets, you can now see how other entities in the application are related to, or dependent on, the current asset. With this view, you can gauge how popular an item is based on the number of projects in which it is used, understand which other entities might be affected if you were to make changes or deletions, and see how the entity is used.

Automatically remove date features before running Autopilot

When setting up a non-time aware project in DataRobot Classic, you can now automatically remove date features from the feature list you want to use to run Autopilot. To do so, open Advanced options for the project, select the Additional tab, and then select Remove date features from selected list and create new modeling feature list. Enabling this parameter duplicates the selected feature list, removes raw date features, and uses the new list to run Autopilot. Excluding raw date features from non-time aware projects can prevent issues like overfitting.
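The effect of the option can be pictured with a small sketch: the selected list is left untouched and a new list without raw date features is produced. The function name, the feature names, and the `"date"` type label are hypothetical and only serve the illustration.

```python
def without_date_features(feature_list, feature_types, date_type="date"):
    """Return a copy of feature_list with raw date features removed.

    feature_types maps each feature name to a type label (hypothetical labels).
    """
    return [name for name in feature_list if feature_types.get(name) != date_type]


selected_list = ["purchase_date", "amount", "region", "signup_date"]
feature_types = {
    "purchase_date": "date",
    "amount": "numeric",
    "region": "categorical",
    "signup_date": "date",
}

# The original list is duplicated, not modified in place.
modeling_list = without_date_features(selected_list, feature_types)
print(modeling_list)
```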

Support for SAP Datasphere connector in DataRobot

DataRobot now supports the SAP Datasphere connector as a premium feature, available for preview, in both NextGen and DataRobot Classic.

Feature flag OFF by default: Enable SAP Datasphere Connector (Premium feature)

SAP Datasphere integration for batch predictions

Available as a premium feature, SAP Datasphere is supported as an intake source and output destination for batch prediction jobs.

Feature flags OFF by default: Enable SAP Datasphere Connector (Premium feature), Enable SAP Datasphere Batch Predictions Integration (Premium feature)

For more information, see the prediction intake and output options documentation.

Additional EDA insights added to Workbench

This release introduces the following EDA insights on the Features tab of the data explore page in Workbench:

  • Data quality checks appear as indicators on the Features tab of the data explore page as well as insights for individual features.

  • The Histogram chart displays data quality issues with outliers.

  • The Frequent Values chart reports inliers, disguised missing values, and excess zeros.

  • Feature lineage insight for Feature Discovery datasets shows how a feature was generated.

Compliance documentation now available for registered text generation models

DataRobot has long provided model development documentation that can be used for regulatory validation of predictive models. Now, the compliance documentation is expanded to include auto-generated documentation for text generation models in the Registry's model directory. For DataRobot natively supported LLMs, the document helps reduce the time spent generating reports, including model overview, informative resources, and most notably model performance and stability tests. For non-natively supported LLMs, the generated document can serve as a template with all necessary sections. Generating compliance documentation for text generation models requires the Enable Compliance Documentation and Enable Gen AI Experimentation feature flags.

Evaluation and moderation for text generation models

Evaluation and moderation guardrails help your organization block prompt injection and hateful, toxic, or inappropriate prompts and responses. They can also prevent hallucinations or low-confidence responses and, more generally, keep the model on topic. In addition, these guardrails can safeguard against the sharing of personally identifiable information (PII). Many evaluation and moderation guardrails connect a deployed text generation model (LLM) to a deployed guard model. These guard models make predictions on LLM prompts and responses and then report these predictions and statistics to the central LLM deployment. To use evaluation and moderation guardrails, first create and deploy guard models to make predictions on an LLM's prompts or responses; for example, a guard model could identify prompt injection or toxic responses. Then, when you create a custom model with the Text Generation target type, define one or more evaluation and moderation guardrails. The GA Premium release of this feature introduces general configuration settings for moderation timeout and evaluation and moderation logs.
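The flow described above (guard models score the prompt, the LLM answers, guard models score the response) can be sketched in plain Python. This is a conceptual illustration only, not the DataRobot moderation API: the `Guard` class, the keyword-based scoring functions, and the `echo_llm` stand-in are all hypothetical, and a real guard would be a deployed model returning a prediction.

```python
from dataclasses import dataclass
from typing import Callable

BLOCKED_MESSAGE = "This request was blocked by a moderation guardrail."


@dataclass
class Guard:
    name: str
    score: Callable[[str], float]  # stand-in for a deployed guard model
    threshold: float               # block when score >= threshold


def moderate(prompt, llm, prompt_guards, response_guards):
    """Run prompt guards, call the LLM, then run response guards."""
    scores = {}  # guard predictions reported alongside the answer
    for guard in prompt_guards:
        scores[guard.name] = guard.score(prompt)
        if scores[guard.name] >= guard.threshold:
            return {"text": BLOCKED_MESSAGE, "blocked_by": guard.name, "scores": scores}
    response = llm(prompt)
    for guard in response_guards:
        scores[guard.name] = guard.score(response)
        if scores[guard.name] >= guard.threshold:
            return {"text": BLOCKED_MESSAGE, "blocked_by": guard.name, "scores": scores}
    return {"text": response, "blocked_by": None, "scores": scores}


# Toy scoring functions; real guard models would replace these.
def injection_score(text):
    return 1.0 if "ignore previous instructions" in text.lower() else 0.0


def toxicity_score(text):
    return 1.0 if "idiot" in text.lower() else 0.0


def echo_llm(prompt):
    return f"You asked: {prompt}"


prompt_guards = [Guard("prompt_injection", injection_score, 0.5)]
response_guards = [Guard("toxicity", toxicity_score, 0.5)]
print(moderate("What is data drift?", echo_llm, prompt_guards, response_guards))
```

Returning the collected scores with every answer mirrors the idea that guard predictions are reported back to the LLM deployment for monitoring.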

Feature flags OFF by default: Enable Moderation Guardrails (Premium feature), Enable Global Models in the Model Registry (Premium feature), Enable Additional Custom Model Output in Prediction Responses

For more information, see the documentation.

Filtering and model replacement improvements in the NextGen Console

This update to the NextGen Console improves deployment filtering and updates the model replacement experience to provide a more intuitive replacement workflow.

On the Console > Deployments tab, you can now filter on Created by me, Tags, and Model type.

On the Console > Deployments tab, or a deployment's Overview, you can access the updated model replacement workflow from the model actions menu.

Manage custom execution environments in the NextGen Registry

The Environments tab is now available in the NextGen Registry, where you can create and manage custom execution environments for your custom models, jobs, applications, and notebooks.

For more information, see the documentation.

Customize feature drift tracking

When you enable feature drift tracking for a deployment, you can now customize the features selected for tracking. During or after the deployment process, in the Feature drift section of the deployment settings, choose a feature selection strategy, either allowing DataRobot to automatically select 25 features, or selecting up to 25 features manually.
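The two strategies can be sketched as follows. The 25-feature limit comes from the text above; ranking by an importance score for the automatic strategy is an assumption made for illustration, and both function names are hypothetical.

```python
MAX_TRACKED_FEATURES = 25  # limit stated in the release note


def select_automatic(importance):
    """Pick up to 25 features, highest score first (ranking rule is assumed)."""
    ranked = sorted(importance, key=lambda name: (-importance[name], name))
    return ranked[:MAX_TRACKED_FEATURES]


def select_manual(available, chosen):
    """Validate a manual selection of up to 25 known, distinct features."""
    if len(chosen) > MAX_TRACKED_FEATURES:
        raise ValueError(f"select at most {MAX_TRACKED_FEATURES} features")
    if len(set(chosen)) != len(chosen):
        raise ValueError("duplicate features in selection")
    unknown = [name for name in chosen if name not in set(available)]
    if unknown:
        raise ValueError(f"unknown features: {unknown}")
    return list(chosen)


importance = {f"f{i}": i for i in range(30)}  # hypothetical scores
tracked = select_automatic(importance)
print(len(tracked), tracked[:3])
```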

For more information, see the documentation.

Calculate insights during custom model registration

For custom models with training data assigned, DataRobot now computes model Insights and Prediction Explanation previews during model registration, instead of during model deployment. In addition, new model logs accessible from the model workshop can help you diagnose errors during the Insight computation process.

For more information, see the documentation.

Link Registry and Console assets to a Use Case

Associate registered model versions, model deployments, and custom applications to a Use Case with the new Use Case linking functionality. Link these assets to an existing Use Case, create a new Use Case, or manage the list of linked Use Cases.

For more information, see the registered model, deployment, and application linking documentation.

Code-based retraining jobs

Add a job, manually or from a template, implementing a code-based retraining policy. To view and add retraining jobs, navigate to the Jobs > Retraining tab, and then:

  • To add a new retraining job manually, click + Add new retraining job (or the minimized add button when the job panel is open).

  • To create a retraining job from a template, next to the add button, click , and then, under Retraining, click Create new from template.

For more information, see the documentation.

Custom model workers runtime parameter

A new DataRobot-reserved runtime parameter, CUSTOM_MODEL_WORKERS, is available for custom model configuration. This numeric runtime parameter allows each replica to handle the set number of concurrent processes. This option is intended for process-safe custom models, primarily in generative AI use cases.

Custom model process safety

When enabling and configuring CUSTOM_MODEL_WORKERS, ensure that your model is process-safe. This configuration option is intended only for process-safe custom models; it is not intended for general use with custom models to make them more resource efficient. Only process-safe custom models with I/O-bound tasks (like proxy models) benefit from utilizing CPU resources this way.

For more information, see the documentation.

Notebook and codespace port forwarding now GA

Now generally available, you can enable port forwarding for notebooks and codespaces to access web applications launched by tools and libraries like MLflow and Streamlit. When developing locally, the web application is accessible at http://localhost:PORT; however, when developing in a hosted DataRobot environment, the port that the web application is running on (in the session container) must be forwarded to access the application. You can expose up to five ports in one notebook or codespace.
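To make the idea concrete, the sketch below starts a tiny web application on a port and reads it back over localhost, which is what works directly during local development. In a hosted session, the same port would be the one to expose through port forwarding. It uses only the Python standard library; the handler, the message, and the `fetch_once` helper are hypothetical stand-ins for an app such as MLflow or Streamlit.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Minimal stand-in for a web application running in the session."""

    def do_GET(self):
        body = b"hello from the session container"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_once(port=0):
    """Serve on a port (0 picks a free one), fetch the page once, shut down."""
    server = HTTPServer(("127.0.0.1", port), HelloHandler)
    bound_port = server.server_address[1]  # the port you would forward
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", bound_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        text = response.read().decode()
        conn.close()
        return bound_port, response.status, text
    finally:
        server.shutdown()
        server.server_close()


port, status, text = fetch_once()
print(f"http://localhost:{port} -> {status}: {text}")
```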

GPU support for notebooks now GA

GPU support for Notebook and Codespace sessions is now available as a GA Premium feature for managed AI Platform users. When configuring the environment for your DataRobot Notebook or Codespace session, you can select a GPU machine from the list of resource types. DataRobot also provides GPU-optimized built-in environments that you can select from to use for your session. These environment images contain the necessary GPU drivers as well as GPU-accelerated packages like TensorFlow, PyTorch, and RAPIDS.

Custom application runtime parameters now GA

Now generally available, you can configure the resources and runtime parameters for application sources in the NextGen Registry. The resources bundle determines the maximum amount of memory and CPU that an application can consume to minimize potential environment errors in production. You can create and define runtime parameters used by the custom application by including them in the metadata.yaml file built from the application source.

Build custom applications from the template gallery

DataRobot provides templates from which you can build custom applications. These templates allow you to leverage pre-built application front-ends, out of the box, and offer extensive customization options. You can leverage a model that has already been deployed to quickly start and access a Streamlit, Flask, or Slack application. Use a custom application template as a simple method for building and running custom code within DataRobot.

Chat generation Q&A application now GA

Now generally available, you can leverage generative AI to create a chat generation Q&A application. Explore Q&A use cases, make business decisions, and showcase business value. The Q&A app offers an intuitive and responsive way to prototype, explore, and share the results of LLMs you've built, including with non-DataRobot users, to expand its usability.

You can also use a code-first workflow to manage the chat generation Q&A application. To access the flow, navigate to DataRobot's GitHub repo. The repo contains a modifiable template for application components.

Preview

Incremental learning support for dynamic datasets is now available

Support for modeling on dynamic datasets larger than 10GB, for example, data in a Snowflake, BigQuery, or Databricks data source, is now available. When configuring the experiment, set an ordering feature to create a deterministic sample from the dataset and then begin incremental modeling as usual. After model building starts, View experiment info now reports the selected ordering feature.
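The role of the ordering feature can be illustrated with a short sketch: sorting by it makes the resulting chunks independent of the order in which the source returns rows. This is an assumption about the general idea, not a description of DataRobot's chunking service; the function name, the `ts` column, and the chunk size are hypothetical.

```python
def ordered_chunks(rows, ordering_key, chunk_size):
    """Sort rows by the ordering feature, then split into fixed-size chunks.

    Values of the ordering feature are assumed unique here; ties would fall
    back to input order and weaken the determinism.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    ordered = sorted(rows, key=lambda row: row[ordering_key])
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


rows = [{"id": i, "ts": 100 - i} for i in range(5)]
chunks = ordered_chunks(rows, ordering_key="ts", chunk_size=2)
print([[row["id"] for row in chunk] for chunk in chunks])
```

Because the chunks depend only on the ordering feature, each incremental modeling iteration can be reproduced from the same source data.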

Feature flags ON by default: Enable incremental learning, Enable dynamic datasets in Workbench, Enable data chunking service

Preview documentation.

Template gallery for custom jobs

The custom jobs template gallery is now available for the generic, notification, and retraining job types, in addition to custom metric jobs. To access the new template gallery, from the Registry > Jobs tab, create a job from a template for any job type.

Feature flags ON by default: Enable Custom Jobs Template Gallery, Enable Custom Templates

Preview documentation.

Create and deploy vector databases

With the vector database target type in the model workshop, you can register and deploy vector databases, as you would any other custom model.

Preview documentation.

Feature flag OFF by default: Enable Vector Database Deployment Type (Premium feature)

Geospatial monitoring for deployments

For a deployed binary classification, regression, or multiclass model built with location data in the training dataset, you can now leverage DataRobot Location AI to perform geospatial monitoring on the deployment's Data drift and Accuracy tabs. To enable geospatial analysis for a deployment, enable segmented analysis and define a segment for the location feature geometry, generated during location data ingest. The geometry segment contains the identifier used to segment the world into a grid of H3 cells.

Feature flags ON by default: Enable Geospatial Features Monitoring, Enable Geospatial Features in Workbench

Prompt monitoring improvements for deployments

For deployed text generation models, the Monitoring > Data exploration tab includes additional sort and filter options on the Tracing table, providing new ways to interact with a Generative AI deployment's stored prompt and response data and gain insight into a model's performance through the configured custom metrics. In addition, this release introduces custom metric templates for Cosine Similarity and Euclidean Distance.
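For reference, the two measures named above are standard vector formulas: cosine similarity is the dot product divided by the product of the norms, and Euclidean distance is the square root of the summed squared differences. The sketch below computes both in plain Python. Which vectors the templates compare (for example, prompt and response embeddings) is not stated here, so the inputs are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """dot(a, b) / (|a| * |b|); 1.0 means the vectors point the same way."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """sqrt(sum((a_i - b_i)^2)); 0.0 means the vectors are identical."""
    return math.dist(a, b)  # available in Python 3.8+


prompt_vector = [1.0, 2.0]    # hypothetical embedding
response_vector = [2.0, 4.0]  # hypothetical embedding
print(cosine_similarity(prompt_vector, response_vector))
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
```

Cosine similarity ignores vector length and compares direction only, while Euclidean distance is sensitive to magnitude, so the two can rank the same pairs differently.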

Preview documentation.

Feature flags OFF by default: Enable Data Quality Table for Text Generation Target Types (Premium feature), Enable Actuals Storage for Generative Models (Premium feature)

Feature flags ON by default: Enable Custom Jobs Template Gallery, Enable Custom Templates

Editable resource settings and runtime parameters for deployments

For deployed custom models, the custom model CPU (or GPU) resource bundle and runtime parameters defined during custom model assembly are now editable after assembly.

If the custom model is deployed on a DataRobot Serverless prediction environment and the deployment is inactive, you can modify the Resource bundle settings from the Resources tab.

Preview documentation

You can modify a custom model's runtime parameters during or after the deployment process.

Preview documentation

Feature flag ON by default: Enable Editing Custom Model Runtime-Parameters on Deployments

Feature flags OFF by default: Enable Resource Bundles, Enable Custom Model GPU Inference (Premium feature)

Data Registry wrangling for batch predictions

Use a deployment's Predictions > Make predictions tab to make batch predictions on a recipe wrangled from the Data Registry. Batch predictions are a method of making predictions with large datasets, in which you pass input data and get predictions for each row. In the Prediction dataset box, click Choose file > Wrangler recipe, then pick a recipe from the Data Registry.

Predictions in Workbench

Batch predictions on recipes wrangled from the Data Registry are also available in Workbench. To make predictions with a model before deployment, select the model from the Models list in an experiment and then click Model actions > Make predictions.

You can also schedule batch prediction jobs by specifying the prediction data source and destination and determining when DataRobot runs the predictions.

Preview documentation.

Feature flag OFF by default: Enable Wrangling Pushdown for Data Registry Datasets

Code-first

Use the declarative API to provision DataRobot assets

You can use the DataRobot declarative API as a code-first method for provisioning resources end-to-end in a way that is both repeatable and scalable. Supporting both Terraform and Pulumi, you can use the declarative API to programmatically provision DataRobot entities such as models, deployments, applications, and more. The declarative API allows you to:

  • Specify the desired end state of infrastructure, simplifying management and enhancing adaptability across cloud providers.
  • Automate the provisioning of DataRobot assets to ensure consistency across environments and alleviate concerns about execution order. Terraform and Pulumi provision in two phases: planning and application. In the planning phase, you can view a plan that outlines which resources will be created before committing to any provisioning actions, and the tool resolves infrastructure dependencies on your behalf when a change is made. You then execute the provisioning as a separate step. This makes provisioning easier to manage within a complex infrastructure and lets you preview the impact that changes will have on DataRobot assets downstream in the workflow.
  • Simplify version control.
  • Use application templates to reduce workflow duplication and ensure consistency.
  • Integrate with DevOps and CI/CD to ensure predictable, consistent infrastructure and reduce deployment risks.
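The two-phase idea behind declarative provisioning (compare desired state with current state to produce a plan, then apply the plan in dependency order) can be sketched without any provider installed. This is a conceptual illustration, not how Terraform or Pulumi are implemented; the `plan` and `apply_plan` functions and the resource names are hypothetical.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def plan(desired, current):
    """Diff desired state against current state; order steps by dependency."""
    actions = {}
    for name, spec in desired.items():
        if name not in current:
            actions[name] = "create"
        elif current[name] != spec["config"]:
            actions[name] = "update"
    # Map each resource to the resources it depends on.
    graph = {name: set(spec.get("depends_on", [])) for name, spec in desired.items()}
    ordered = [n for n in TopologicalSorter(graph).static_order() if n in actions]
    # Deletions are simply listed last in this simplified sketch.
    deletes = sorted(name for name in current if name not in desired)
    return [(actions[n], n) for n in ordered] + [("delete", n) for n in deletes]


def apply_plan(steps, desired, current):
    """Execute a previously reviewed plan and return the new state."""
    state = dict(current)
    for action, name in steps:
        if action == "delete":
            state.pop(name)
        else:
            state[name] = desired[name]["config"]
    return state


desired = {
    "prediction_environment": {"config": {"platform": "datarobotServerless"}},
    "custom_model": {"config": {"target_type": "TextGeneration"}},
    "registered_model": {"config": {"source": "custom_model"}, "depends_on": ["custom_model"]},
    "deployment": {
        "config": {"label": "demo"},
        "depends_on": ["registered_model", "prediction_environment"],
    },
}
current = {
    "prediction_environment": {"platform": "datarobotServerless"},
    "old_model": {"target_type": "Binary"},
}

steps = plan(desired, current)  # phase 1: review what would change
for step in steps:
    print(step)
new_state = apply_plan(steps, desired, current)  # phase 2: execute
```

Separating the two phases is what lets you inspect the list of steps before anything is created, updated, or deleted.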

Review an example below of how you can use the declarative API to provision DataRobot resources using the Pulumi CLI:

import os

import pulumi
import pulumi_datarobot as datarobot

# Fail early if any required OpenAI setting is missing from the environment.
for var in [
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
    "OPENAI_API_DEPLOYMENT_ID",
    "OPENAI_API_VERSION",
]:
    if var not in os.environ:
        raise RuntimeError(f"Missing required environment variable: {var}")

# Serverless prediction environment that will host the deployment.
pe = datarobot.PredictionEnvironment(
    "pulumi_serverless_env", platform="datarobotServerless"
)

# Store the OpenAI API key as a DataRobot credential.
credential = datarobot.ApiTokenCredential(
    "pulumi_credential", api_token=os.environ["OPENAI_API_KEY"]
)

# Custom text generation model assembled from the local "model/" folder.
# Runtime parameters pass the credential and the OpenAI settings to the model.
cm = datarobot.CustomModel(
    "pulumi_custom_model",
    base_environment_id="65f9b27eab986d30d4c64268",  # GenAI 3.11 w/ moderations
    folder_path="model/",
    runtime_parameter_values=[
        {"key": "OPENAI_API_KEY", "type": "credential", "value": credential.id},
        {
            "key": "OPENAI_API_BASE",
            "type": "string",
            "value": os.environ["OPENAI_API_BASE"],
        },
        {
            "key": "OPENAI_API_DEPLOYMENT_ID",
            "type": "string",
            "value": os.environ["OPENAI_API_DEPLOYMENT_ID"],
        },
        {
            "key": "OPENAI_API_VERSION",
            "type": "string",
            "value": os.environ["OPENAI_API_VERSION"],
        },
    ],
    target_name="resultText",
    target_type="TextGeneration",
)

# Register the custom model version so that it can be deployed.
rm = datarobot.RegisteredModel(
    resource_name="pulumi_registered_model",
    name=None,
    custom_model_version_id=cm.version_id,
)

# Deploy the registered model version to the prediction environment.
d = datarobot.Deployment(
    "pulumi_deployment",
    label="pulumi_deployment",
    prediction_environment_id=pe.id,
    registered_model_version_id=rm.version_id,
)

# Expose the deployment ID as a stack output.
pulumi.export("deployment_id", d.id)

Updated May 29, 2024