
On-premise users: click in-app to access the full platform documentation for your version of DataRobot.

AI Platform releases

A monthly record of the new Public Preview and GA features announced for DataRobot's managed AI Platform. Deprecation announcements are also included and link to deprecation guides, as appropriate.

This page provides announcements of newly released features available in DataRobot's SaaS single- and multi-tenant AI Platform, with links to additional resources.

This month's deployment

February 28, 2024

With the latest deployment, DataRobot's AI Platform delivered the new GA and Public Preview features listed below. From the release center, you can also access:

In the spotlight

Manage notebook file systems with Codespaces

Codespaces have been added to DataRobot Workbench to enhance the code-first experience, especially when working with DataRobot Notebooks. A codespace, similar to a repository or folder file tree structure, can contain any number of files and nested folders. Within a codespace, you can open, view, and edit multiple notebook and non-notebook files at the same time. You can also execute multiple notebooks in the same container session (with each notebook running on its own kernel). In addition to persistent file storage, the codespace interface includes a file editor and integrated terminal for an advanced code development experience. Because codespaces are Git-compatible, you can version your notebook files as well as non-notebook files in external Git repositories using the Git CLI.

Video: Codespaces

Public preview documentation.

Feature flag ON by default: Enable Codespaces

The following table lists each new feature:

Features grouped by capability

Capability | Name | GA | Public Preview
Data | All wrangling operations generally available | ✔ |
Data | Additional support for ingestion of Parquet files | | ✔
Data | ADLS Gen2 connector added to DataRobot | | ✔
Modeling | Sliced insights introduces two new operators | ✔ |
Predictions and MLOps | Real-time notifications for deployments | ✔ |
Predictions and MLOps | Timeliness indicators for predictions and actuals | ✔ |
Predictions and MLOps | Column name remapping for batch predictions | ✔ |
Predictions and MLOps | Tokenization improvements for Japanese text feature drift | ✔ |
Predictions and MLOps | Global models in the Registry * | ✔ |
Predictions and MLOps | New environment variables for custom models * | ✔ |
Predictions and MLOps | Automated deployment and replacement in Sagemaker | | ✔
Predictions and MLOps | Additional columns in custom model output | | ✔
Predictions and MLOps | New runtime parameter definition options for custom models | | ✔
Predictions and MLOps | Updated layout for the NextGen Console | | ✔
Notebooks | Manage notebook file systems with codespaces | | ✔
Apps | Custom apps now generally available | ✔ |
Admin | New secure sharing workflow | ✔ |

* Premium feature

GA

All wrangling operations generally available

The Join and Aggregate wrangling operations, previously available for preview, are now generally available in Workbench.

New secure sharing workflow

To reinforce data security, further protect your sensitive information, and streamline the sharing process, resources can now only be shared between users from the same organization. To learn how to add new users to your organization, as well as manage existing users, see the documentation on tenant isolation and collaboration.

Note that these changes only affect Multi-Tenant SaaS users. If the changes are extended to other deployment environments, you will be notified and the documentation will be updated.

Sliced insights introduces two new operators

With this deployment, the data slice functionality, which lets you view a subpopulation of a model's data based on feature values, adds two new filter operators: between and not between. Selecting either opens a modal that lets you set a range; the range is inclusive of the boundary values you specify.
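As a rough illustration of the inclusive range semantics, the sketch below filters a handful of rows with both operators. The helper names, the `age` feature, and the sample rows are hypothetical and are not part of the DataRobot API.

```python
def between(value, low, high):
    """Return True when low <= value <= high (both bounds inclusive)."""
    return low <= value <= high


def not_between(value, low, high):
    """Return True when value falls outside the inclusive range."""
    return not between(value, low, high)


# Hypothetical rows; "age" is an illustrative feature name.
rows = [{"age": 17}, {"age": 18}, {"age": 40}, {"age": 65}, {"age": 66}]

# The boundary values 18 and 65 belong to the "between" slice.
in_slice = [r["age"] for r in rows if between(r["age"], 18, 65)]
out_of_slice = [r["age"] for r in rows if not_between(r["age"], 18, 65)]
```

Because the range is inclusive, the two slices together cover every row exactly once.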

Real-time notifications for deployments

DataRobot provides automated monitoring with a notification system, allowing you to configure alerts triggered when service health, data drift status, model accuracy, or fairness values deviate from your organization's accepted values. Real-time notifications for these status alerts are now generally available, allowing your organization to respond quickly to changes in model health without waiting for scheduled health status notifications.

Timeliness indicators for predictions and actuals

Deployments have several statuses to define their general health, including service health, data drift, and accuracy. These statuses are calculated based on the most recent available data. For deployments relying on batch predictions made in intervals greater than 24 hours, this method can result in an unknown status value on the prediction health indicators in the deployment inventory. With timeliness indicators, now generally available, deployment health indicators can retain the most recently calculated health status, presented along with timeliness status indicators that reveal when a status is based on old data. You can determine the appropriate timeliness intervals for your deployments on a case-by-case basis. Once you've enabled timeliness tracking on a deployment's Usage > Settings tab, you can view timeliness indicators on the Usage tab and in the Deployments inventory.
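The idea of a timeliness interval can be sketched as a simple comparison between the time data was last received and the interval chosen for the deployment. This is only an illustration under stated assumptions: the function name and the status labels ("timely", "stale", "unknown") are hypothetical, and the sketch does not reproduce how DataRobot computes its indicators.

```python
from datetime import datetime, timedelta


def timeliness_status(last_received, expected_interval, now):
    """Hypothetical check: data is timely if it arrived within the interval."""
    if last_received is None:
        # Nothing has been received yet, so timeliness cannot be judged.
        return "unknown"
    if now - last_received <= expected_interval:
        return "timely"
    return "stale"


# A deployment scored weekly: the last batch arrived three days ago.
now = datetime(2024, 2, 28, 12, 0)
weekly = timedelta(days=7)
status = timeliness_status(datetime(2024, 2, 25, 12, 0), weekly, now)
```

With a weekly interval, a batch received three days ago is still within the interval, while one received ten days ago would not be.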

Column name remapping for batch predictions

When configuring one-time or recurring batch predictions, you can change column names in the prediction job's output by mapping them to entries added in the Column names remapping section of the Prediction options. Click + Add column name remapping, then define the Input column name and the Output column name that replaces it in the prediction output.
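The effect of a remapping on the output can be pictured as a rename applied to the header row of the results. The sketch below is a local illustration only: `remap_columns`, the sample CSV, and the column names are hypothetical, and the code does not call the DataRobot batch prediction API.

```python
import csv
import io


def remap_columns(csv_text, remapping):
    """Rename header columns per `remapping`; unmapped names pass through."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    rows[0] = [remapping.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Hypothetical prediction output and column names.
output = "row_id,target_PREDICTION\n0,0.82\n1,0.13\n"
remapped = remap_columns(output, {"target_PREDICTION": "churn_score"})
```

Only the header changes; the data rows and any columns without a mapping are left as they were.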

Tokenization improvements for Japanese text feature drift

Text tokenization for the Feature Details chart on the Data Drift tab is improved for Japanese text features, implementing word-gram-based data drift analysis with MeCab tokenization. In addition, default stop-word filtering is improved for Japanese text features.

Global models in the Registry

As a premium feature, you can now deploy pre-trained, global models for predictive or generative use cases from the Registry (NextGen) and Model Registry (Classic). These high-quality, open-source models are trained and ready for deployment, allowing you to make predictions immediately after installing DataRobot. For GenAI use cases, you can find classifiers to identify prompt injection, toxicity, and sentiment, as well as a regressor to output a refusal score. Global models are available to all users; however, only administrators have edit rights. To identify global models on the Registry > Model directory page, locate the Global column and look for models marked Yes.

New environment variables for custom models

When you use a drop-in environment or a custom environment built on DRUM, your custom model code can reference several environment variables injected to facilitate access to the DataRobot client and MLOps connected client. The DATAROBOT_ENDPOINT and DATAROBOT_API_TOKEN environment variables require public network access, a premium feature available in NextGen and DataRobot Classic.

Environment Variable | Description
MLOPS_DEPLOYMENT_ID | If a custom model is running in deployment mode (i.e., the custom model is deployed), the deployment ID is available.
DATAROBOT_ENDPOINT | If a custom model has public network access, the DataRobot endpoint URL is available.
DATAROBOT_API_TOKEN | If a custom model has public network access, your DataRobot API token is available.
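Custom model code might read these variables defensively, since each one is only present under the condition listed above. The following is a minimal sketch: the variable names come from the table, while the helper functions and dictionary keys are hypothetical.

```python
import os


def read_datarobot_env(environ=os.environ):
    """Collect the injected variables; absent ones come back as None."""
    return {
        "deployment_id": environ.get("MLOPS_DEPLOYMENT_ID"),
        "endpoint": environ.get("DATAROBOT_ENDPOINT"),
        "api_token": environ.get("DATAROBOT_API_TOKEN"),
    }


def has_client_credentials(env):
    """True only when both the endpoint and the API token were injected."""
    return env["endpoint"] is not None and env["api_token"] is not None


# Simulated environment: a deployed model without public network access.
env = read_datarobot_env({"MLOPS_DEPLOYMENT_ID": "example-deployment-id"})
can_call_api = has_client_credentials(env)
```

Passing a plain dictionary instead of `os.environ` makes the behavior easy to check locally, before the model runs inside a DataRobot environment.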

Custom apps now generally available

Custom applications are now generally available. You can create custom applications in DataRobot to share machine learning projects using web applications, including Streamlit, Dash, and R Shiny, from an image created in Docker. You can also use DRApps (a simple command line interface) to host a custom application in DataRobot using a DataRobot execution environment, which lets you run apps without building your own Docker image. Custom applications don't provide any storage; however, you can access the full DataRobot API and other services. With this release, custom applications are paused after a period of inactivity; the first time you access a paused custom application, a loading screen appears while it restarts.

Public Preview

Additional support for ingestion of Parquet files

DataRobot now supports ingestion of Parquet files in the AI Catalog, training datasets, and predictions datasets. The following Parquet file types are supported:

  • Single Parquet files
  • Single zipped Parquet files
  • Multiple Parquet files (registered as separate datasets)
  • Zipped multi-Parquet file (merged to create a single dataset in DataRobot)

Feature flag ON by default: Enable Parquet File Ingestion

ADLS Gen2 connector added to DataRobot

Support for the ADLS Gen2 native connector has been added to both DataRobot Classic and Workbench, allowing you to:

  • Create and configure data connections.
  • Add ADLS Gen2 datasets.

Public preview documentation.

Feature flag ON by default: Enable ADLS Gen2 Connector

Automated deployment and replacement in Sagemaker

You can now create a DataRobot-managed Sagemaker prediction environment to deploy custom models and Scoring Code in Sagemaker with real-time inference and serverless inference. With DataRobot management enabled, the external Sagemaker deployment has access to MLOps management, including automatic model replacement.

Public preview documentation.

Feature flag ON by default: Enable the Automated Deployment and Replacement of Custom Models in Sagemaker

Updated layout for the NextGen Console

This update to the NextGen Console provides monitoring, predictions, and mitigation features in a new layout. The updated layout provides a seamless transition from model experimentation in Workbench and registration in Registry to model monitoring and management in Console, while maintaining the features and functionality available in DataRobot Classic.

Public preview documentation.

Feature flag OFF by default: Enable Updated Console Layout

Additional columns in custom model output

The score() hook can return any number of extra columns, containing data of types string, int, float, bool, or datetime. When additional columns are returned through the score() method, the prediction response is as follows:

  • For a tabular response (CSV), the additional columns are returned as part of the response table or dataframe.
  • For a JSON response, the extraModelOutput key is returned alongside each row. This key is a dictionary containing the values of each additional column in the row.
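The JSON case can be pictured as splitting each scored row into its prediction and a dictionary of the extra columns. The sketch below uses plain dictionaries to show that shape. Only the `extraModelOutput` key comes from the description above; the `prediction` key, the function name, and the column names are hypothetical, and the full DataRobot response format is not reproduced here.

```python
def to_json_rows(rows, prediction_column, extra_columns):
    """Split each row into its prediction and an extraModelOutput dictionary."""
    result = []
    for row in rows:
        result.append({
            # "prediction" is an illustrative key, not the documented schema.
            "prediction": row[prediction_column],
            "extraModelOutput": {name: row[name] for name in extra_columns},
        })
    return result


# Hypothetical scored rows with two additional columns.
scored = [
    {"score": 0.91, "segment": "retail", "reviewed": True},
    {"score": 0.12, "segment": "online", "reviewed": False},
]
json_rows = to_json_rows(scored, "score", ["segment", "reviewed"])
```

In a tabular (CSV) response, the same `segment` and `reviewed` values would instead appear as ordinary columns next to the prediction.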

Public preview documentation.

Feature flag OFF by default: Enable Additional Custom Model Output in Prediction Responses

New runtime parameter definition options for custom models

When you create runtime parameters for custom models through the model metadata, you can now set the type key to boolean or numeric, in addition to string or credential. You can also add the following new, optional keys to each entry under runtimeParameterDefinitions in model-metadata.yaml:

Key | Description
defaultValue | Set the default string value for the runtime parameter (the credential type doesn't support default values).
minValue | For numeric runtime parameters, set the minimum numeric value allowed in the runtime parameter.
maxValue | For numeric runtime parameters, set the maximum numeric value allowed in the runtime parameter.
allowEmpty | Set the empty field policy for the runtime parameter. True (default) allows an empty runtime parameter; False enforces providing a value for the runtime parameter before deployment.
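To make the interaction between these keys concrete, the sketch below applies them to a value in plain Python. It is an illustration of the rules as described in the table, not DataRobot's validation code: the function, the `fieldName` key, and the sample definitions are assumptions made for the example.

```python
def resolve_runtime_parameter(definition, value=None):
    """Return the effective value for one runtime parameter definition."""
    name = definition.get("fieldName", "parameter")  # assumed key name
    if value is None:
        # Fall back to the default when no value is supplied.
        value = definition.get("defaultValue")
    if value is None:
        # allowEmpty defaults to True, so an empty parameter is accepted.
        if definition.get("allowEmpty", True):
            return None
        raise ValueError(f"{name}: a value is required before deployment")
    if definition.get("type") == "numeric":
        low = definition.get("minValue")
        high = definition.get("maxValue")
        if low is not None and value < low:
            raise ValueError(f"{name}: {value} is below minValue {low}")
        if high is not None and value > high:
            raise ValueError(f"{name}: {value} is above maxValue {high}")
    return value


# Hypothetical definitions mirroring entries in model-metadata.yaml.
log_level = {"fieldName": "log_level", "type": "string", "defaultValue": "INFO"}
threshold = {
    "fieldName": "threshold",
    "type": "numeric",
    "minValue": 0,
    "maxValue": 1,
    "allowEmpty": False,
}

effective_level = resolve_runtime_parameter(log_level)
effective_threshold = resolve_runtime_parameter(threshold, 0.9)
```

Here `log_level` falls back to its default, while `threshold` has no default and sets allowEmpty to False, so omitting it raises an error, as does any value outside the range 0 to 1.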

Public preview documentation.

Feature flag OFF by default: Enable the Injection of Runtime Parameters for Custom Models

All product and company names are trademarks™ or registered® trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.


Updated February 5, 2024